0%| | 0/509 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:05,460 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:08,382 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7734, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-01 14:57:11,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 1/509 [00:12<1:47:19, 12.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:57:14,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:17,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:20,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9114, 'learning_rate': 6.000000000000001e-08, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-01 14:57:23,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 2/509 [00:24<1:44:10, 12.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:57:26,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:29,409 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:32,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:35,187 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8558, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01} 1%|▍ | 3/509 [00:36<1:41:53, 12.08s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:57:38,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:40,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:43,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7871, 'learning_rate': 1.8e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-01 14:57:46,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▋ | 4/509 [00:48<1:40:02, 11.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:57:49,775 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:52,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:57:55,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7695, 'learning_rate': 2.4000000000000003e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-01 14:57:58,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 5/509 [00:59<1:39:01, 11.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:01,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:04,256 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:07,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:09,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 6/509 [01:11<1:38:04, 11.70s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:12,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:15,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:18,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8083, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-01 14:58:21,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 7/509 [01:22<1:37:07, 11.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:24,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:27,098 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:29,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.883, 'learning_rate': 3.6e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-01 14:58:32,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 8/509 [01:34<1:36:17, 11.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:35,636 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:38,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:41,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:44,095 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 9/509 [01:45<1:35:43, 11.49s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:47,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:49,831 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:52,638 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:58:55,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7736, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.02} 2%|█▌ | 10/509 [01:56<1:35:06, 11.44s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:58:58,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:01,016 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:03,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:06,497 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed 2%|█▋ | 11/509 [02:07<1:34:00, 11.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:59:09,344 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:12,036 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:14,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:17,495 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 12/509 [02:18<1:33:00, 11.23s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:59:20,330 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:23,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:25,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8177, 'learning_rate': 6.599999999999999e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-01 14:59:28,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██ | 13/509 [02:29<1:32:00, 11.13s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:59:31,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:33,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:36,637 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed {'loss': 4.7479, 'learning_rate': 7.2e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-01 14:59:39,348 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 14/509 [02:40<1:31:21, 11.07s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:59:42,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:44,913 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:47,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6528, 'learning_rate': 7.799999999999999e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-01 14:59:50,271 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 15/509 [02:51<1:30:48, 11.03s/it][WARNING|modeling_utils.py:388] 2022-03-01 14:59:53,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:55,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 14:59:58,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8289, 'learning_rate': 8.4e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-01 15:00:00,959 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▌ | 16/509 [03:02<1:29:47, 10.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:03,701 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:00:06,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:09,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:11,654 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 17/509 [03:13<1:29:01, 10.86s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:14,382 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:17,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:19,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8635, 'learning_rate': 9.600000000000001e-07, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 15:00:22,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▊ | 18/509 [03:23<1:28:17, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:25,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:27,591 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:30,203 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6639, 'learning_rate': 1.0200000000000002e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 15:00:32,807 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed 4%|██▉ | 19/509 [03:34<1:27:27, 10.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:35,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:38,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:40,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.902, 'learning_rate': 1.08e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 15:00:43,257 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 20/509 [03:44<1:26:38, 10.63s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:45,926 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:48,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:51,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.83, 'learning_rate': 1.14e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 15:00:53,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 21/509 [03:55<1:26:02, 10.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:00:56,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:00:58,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:01,366 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed {'loss': 4.7393, 'learning_rate': 1.2000000000000002e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 15:01:03,913 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 22/509 [04:05<1:24:56, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:06,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:09,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:11,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7581, 'learning_rate': 1.26e-06, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-01 15:01:14,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▌ | 23/509 [04:15<1:24:03, 10.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:16,745 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:19,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:21,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8586, 'learning_rate': 1.3199999999999999e-06, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-01 15:01:24,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 24/509 [04:25<1:23:33, 10.34s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:26,900 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:01:29,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:31,821 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:34,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 25/509 [04:36<1:23:53, 10.40s/it] 5%|███▉ | 25/509 [04:36<1:23:53, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:37,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:39,973 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:01:37,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:42,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:01:37,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 26/509 [04:46<1:22:58, 10.31s/it]g-point operations will not be computed-01 15:01:37,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 26/509 [04:46<1:22:58, 10.31s/it]g-point operations will not be computed-01 15:01:37,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 26/509 [04:46<1:22:58, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:49,961 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:52,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 27/509 [04:56<1:21:47, 10.18s/it]g-point operations will not be computed-01 15:01:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 27/509 [04:56<1:21:47, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:01:57,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:01:59,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:01:57,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:02,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:01:57,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 28/509 [05:06<1:20:52, 10.09s/it]g-point operations will not be computed-01 15:01:57,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 28/509 [05:06<1:20:52, 
10.09s/it]g-point operations will not be computed-01 15:01:57,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 28/509 [05:06<1:20:52, 10.09s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:07,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:09,677 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:07,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:12,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:07,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 29/509 [05:15<1:20:07, 10.02s/it]g-point operations will not be computed-01 15:02:07,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 29/509 [05:15<1:20:07, 10.02s/it]g-point operations will not be computed-01 15:02:07,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 29/509 [05:15<1:20:07, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:17,079 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:19,529 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:17,079 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:21,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
15:02:17,079 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:21,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:17,079 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 30/509 [05:25<1:19:15, 9.93s/it]g-point operations will not be computed-01 15:02:17,079 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 30/509 [05:25<1:19:15, 9.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:26,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:29,151 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:26,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:31,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:26,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:31,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:26,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 31/509 [05:35<1:18:12, 9.82s/it]g-point operations will not be computed-01 15:02:26,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 31/509 [05:35<1:18:12, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:36,326 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:38,694 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:36,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:41,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:36,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:41,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:36,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 32/509 [05:44<1:17:19, 9.73s/it]g-point operations will not be computed-01 15:02:36,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 32/509 [05:44<1:17:19, 9.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:45,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:48,144 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:45,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:50,435 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:45,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████▏ | 33/509 [05:54<1:16:22, 9.63s/it]g-point operations will not be computed-01 15:02:45,836 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 6%|█████▏ | 33/509 [05:54<1:16:22, 9.63s/it]g-point operations will not be computed-01 15:02:45,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████▏ | 33/509 [05:54<1:16:22, 9.63s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:02:55,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:57,420 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:55,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:59,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:55,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:02:59,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:02:55,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 34/509 [06:03<1:15:05, 9.48s/it]g-point operations will not be computed-01 15:02:55,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 34/509 [06:03<1:15:05, 9.48s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:04,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:06,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:03:04,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:08,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:03:04,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:08,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:03:04,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 35/509 [06:12<1:13:38, 9.32s/it]g-point operations will not be computed-01 15:03:04,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 35/509 [06:12<1:13:38, 9.32s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:13,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:15,367 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:03:13,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:17,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:03:13,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:03:13,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:03:13,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 36/509 [06:20<1:12:10, 9.16s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:21,875 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:23,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:26,123 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:28,181 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 37/509 [06:29<1:10:36, 8.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:30,371 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:32,466 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:34,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▉ | 38/509 [06:37<1:09:09, 8.81s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:38,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:40,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:42,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 39/509 [06:45<1:07:04, 8.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:46,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:48,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:50,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 40/509 [06:53<1:04:48, 8.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:03:54,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:56,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:03:59,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 41/509 [07:00<1:02:16, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:01,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01
15:04:03,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:04,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▊ | 42/509 [07:07<59:34, 7.65s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:08,133 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:09,694 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:11,238 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4273, 'learning_rate': 2.46e-06, 'epoch': 0.08} [WARNING|modeling_utils.py:388] 2022-03-01 15:04:12,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▉ | 43/509 [07:14<56:20,
7.25s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:14,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:15,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:17,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 44/509 [07:19<52:55, 6.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:20,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 45/509 [07:25<49:06, 6.35s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:25,111 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:27,389 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed 9%|███████▍ | 46/509 [07:29<45:09, 5.85s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:29,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:31,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▌ | 47/509 [07:33<40:59, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:33,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▋ | 48/509 [07:37<36:59,
4.81s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:37,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 49/509 [07:40<33:05, 4.32s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:40,139 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:41,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 50/509 [07:43<30:06, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:45,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:04:51,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 51/509 [07:56<49:06, 6.43s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:04:57,664 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:05:03,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 52/509 [08:07<1:00:51, 7.99s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:05:09,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:05:15,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 10%|████████▎ | 53/509 [08:19<1:09:03, 9.09s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:05:20,959 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:05:26,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 54/509 [08:30<1:14:29, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:05:32,469 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:05:38,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 55/509 [08:42<1:18:04, 10.32s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:05:43,871 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:05:49,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 56/509 [08:53<1:20:24, 10.65s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:05:55,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:00,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 57/509 [09:05<1:21:48, 10.86s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:06:06,598 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:12,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████ | 58/509 [09:16<1:22:40, 11.00s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:06:17,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:23,553 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 59/509 [09:27<1:23:09, 11.09s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:06:29,223 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:34,730 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 12%|█████████▍ | 60/509 [09:38<1:23:15, 11.13s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:06:40,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:45,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 61/509 [09:49<1:22:56, 11.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:06:51,412 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:06:56,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▋ | 62/509 [10:00<1:22:17, 11.05s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:02,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:07:07,675 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▉ | 63/509 [10:11<1:21:40, 10.99s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:13,178 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:07:18,529 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 64/509 [10:22<1:20:57, 10.92s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:23,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388]
2022-03-01 15:07:29,304 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▏ | 65/509 [10:33<1:20:33, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:34,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:07:40,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 66/509 [10:43<1:19:56, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:45,372 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed [WARNING|modeling_utils.py:388] 2022-03-01 15:07:50,640 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▌ | 67/509 [10:54<1:19:17, 10.76s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:07:55,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:01,166 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▋ | 68/509 [11:05<1:18:41, 10.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:06,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:11,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▊ | 69/509 [11:15<1:18:07, 10.65s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:17,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:22,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 70/509 [11:26<1:17:17,
10.56s/it]g-point operations will not be computed-01 15:08:17,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 70/509 [11:26<1:17:17, 10.56s/it]g-point operations will not be computed-01 15:08:17,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 70/509 [11:26<1:17:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:27,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 70/509 [11:26<1:17:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:27,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:32,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:27,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 71/509 [11:36<1:16:24, 10.47s/it]g-point operations will not be computed-01 15:08:27,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 71/509 [11:36<1:16:24, 10.47s/it]g-point operations will not be computed-01 15:08:27,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 71/509 [11:36<1:16:24, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:37,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 71/509 [11:36<1:16:24, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:37,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:42,627 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:37,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 72/509 [11:46<1:15:41, 10.39s/it]g-point operations will not be computed-01 15:08:37,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 72/509 [11:46<1:15:41, 10.39s/it]g-point operations will not be computed-01 15:08:37,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 72/509 [11:46<1:15:41, 10.39s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:47,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 72/509 [11:46<1:15:41, 10.39s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:47,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:08:52,734 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:47,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [11:56<1:14:50, 10.30s/it]g-point operations will not be computed-01 15:08:47,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [11:56<1:14:50, 10.30s/it]g-point operations will not be computed-01 15:08:47,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [11:56<1:14:50, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 
73/509 [11:56<1:14:50, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [11:56<1:14:50, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:06<1:14:20, 10.25s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:06<1:14:20, 10.25s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4239, 'learning_rate': 4.26e-06, 'epoch': 0.15} 15%|███████████▋ | 74/509 [12:06<1:14:20, 10.25s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:06<1:14:20, 10.25s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:17<1:14:29, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:17<1:14:29, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4061, 'learning_rate': 4.32e-06, 'epoch': 0.15} 15%|███████████▊ | 75/509 [12:17<1:14:29, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 15%|███████████▊ | 75/509 [12:17<1:14:29, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:27<1:13:34, 10.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:27<1:13:34, 10.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3683, 'learning_rate': 4.3799999999999996e-06, 'epoch': 0.15} 15%|███████████▉ | 76/509 [12:27<1:13:34, 10.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:27<1:13:34, 10.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:36<1:12:30, 10.07s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:36<1:12:30, 10.07s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2865, 'learning_rate': 4.44e-06, 'epoch': 0.15} 15%|████████████ | 77/509 [12:36<1:12:30, 10.07s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:36<1:12:30, 10.07s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 15%|████████████▎ | 78/509 [12:46<1:11:37, 9.97s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▎ | 78/509 [12:46<1:11:37, 9.97s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3959, 'learning_rate': 4.5e-06, 'epoch': 0.15} 15%|████████████▎ | 78/509 [12:46<1:11:37, 9.97s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▎ | 78/509 [12:46<1:11:37, 9.97s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [12:56<1:10:28, 9.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [12:56<1:10:28, 9.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2603, 'learning_rate': 4.56e-06, 'epoch': 0.15} 16%|████████████▍ | 79/509 [12:56<1:10:28, 9.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [12:56<1:10:28, 9.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:05<1:09:40, 9.74s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:05<1:09:40, 9.74s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4423, 'learning_rate': 4.62e-06, 'epoch': 0.16} 16%|████████████▌ | 80/509 [13:05<1:09:40, 9.74s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:05<1:09:40, 9.74s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:15<1:09:14, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:15<1:09:14, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3879, 'learning_rate': 4.68e-06, 'epoch': 0.16} 16%|████████████▋ | 81/509 [13:15<1:09:14, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:15<1:09:14, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:15<1:09:14, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:24<1:08:11, 9.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 
15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:24<1:08:11, 9.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:24<1:08:11, 9.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:24<1:08:11, 9.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:24<1:08:11, 9.58s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████ | 83/509 [13:33<1:07:22, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:10:37,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:10:37,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:10:37,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 17%|█████████████▏ | 84/509 [13:42<1:06:06, 9.33s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:42<1:06:06, 9.33s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:42<1:06:06, 9.33s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:42<1:06:06, 9.33s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:51<1:04:51, 9.18s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:51<1:04:51, 9.18s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2755, 'learning_rate': 4.92e-06, 'epoch': 0.17} 17%|█████████████▎ | 85/509 [13:51<1:04:51, 9.18s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:51<1:04:51, 9.18s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 86/509 [14:00<1:03:42, 9.04s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
17%|█████████████▌ | 86/509 [14:00<1:03:42, 9.04s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2691, 'learning_rate': 4.980000000000001e-06, 'epoch': 0.17} 17%|█████████████▌ | 86/509 [14:00<1:03:42, 9.04s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 86/509 [14:00<1:03:42, 9.04s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:08<1:02:24, 8.87s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:08<1:02:24, 8.87s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4937, 'learning_rate': 5.04e-06, 'epoch': 0.17} 17%|█████████████▋ | 87/509 [14:08<1:02:24, 8.87s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:08<1:02:24, 8.87s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:17<1:01:03, 8.70s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:17<1:01:03, 8.70s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed {'loss': 4.3326, 'learning_rate': 5.1e-06, 'epoch': 0.17} 17%|█████████████▊ | 88/509 [14:17<1:01:03, 8.70s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:17<1:01:03, 8.70s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████▎ | 89/509 [14:25<59:30, 8.50s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████▎ | 89/509 [14:25<59:30, 8.50s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3345, 'learning_rate': 5.16e-06, 'epoch': 0.17} 17%|██████████████▎ | 89/509 [14:25<59:30, 8.50s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████▎ | 89/509 [14:25<59:30, 8.50s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:32<57:38, 8.25s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:32<57:38, 8.25s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3882, 'learning_rate': 5.22e-06, 'epoch': 0.18} 18%|██████████████▍ | 90/509 [14:32<57:38, 8.25s/it]g-point operations 
will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:32<57:38, 8.25s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:32<57:38, 8.25s/it]g-point operations will not be computed-01 15:08:57,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:40<55:13, 7.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:40<55:13, 7.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:40<55:13, 7.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▊ | 92/509 [14:46<52:34, 7.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▊ | 92/509 [14:46<52:34, 7.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:48,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:48,595 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:48,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▉ | 93/509 [14:52<49:40, 7.16s/it]g-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:54,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:54,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:11:54,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:40,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|███████████████▏ | 94/509 [14:58<46:28, 6.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|███████████████▏ | 94/509 [14:58<46:28, 6.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|███████████████▏ | 94/509 [14:58<46:28, 
6.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:02,319 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:04,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:04,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:06,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:09,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:11,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:11,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:13,140 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:14,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:14,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:16,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:16,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:19,460 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3133, 'learning_rate': 5.82e-06, 'epoch': 0.2} [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:12:21,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████ | 101/509 [15:34<43:29, 6.40s/it]g-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████ | 101/509 [15:34<43:29, 6.40s/it]g-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████ | 
101/509 [15:34<43:29, 6.40s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
20%|████████████████▏ | 102/509 [15:46<54:13, 7.99s/it]
{'loss': 4.1286, 'learning_rate': 6e-06, 'epoch': 0.2}
20%|████████████████▏ | 104/509 [16:09<1:06:16, 9.82s/it]
{'loss': 4.1071, 'learning_rate': 6.0600000000000004e-06, 'epoch': 0.2}
21%|████████████████▎ | 105/509 [16:20<1:09:14, 10.28s/it]
21%|████████████████▍ | 106/509 [16:32<1:11:27, 10.64s/it]
{'loss': 4.1192, 'learning_rate': 6.18e-06, 'epoch': 0.21}
21%|████████████████▌ | 107/509 [16:43<1:12:34, 10.83s/it]
{'loss': 4.2128, 'learning_rate': 6.2399999999999995e-06, 'epoch': 0.21}
21%|████████████████▊ | 108/509 [16:54<1:13:14, 10.96s/it]
21%|████████████████▉ | 109/509 [17:05<1:13:20, 11.00s/it]
22%|█████████████████ | 110/509 [17:17<1:13:37, 11.07s/it]
22%|█████████████████▏ | 111/509 [17:28<1:13:29, 11.08s/it]
22%|█████████████████▍ | 112/509 [17:39<1:13:02, 11.04s/it]
{'loss': 4.1725, 'learning_rate': 6.54e-06, 'epoch': 0.22}
22%|█████████████████▌ | 113/509 [17:50<1:12:26, 10.98s/it]
22%|█████████████████▋ | 114/509 [18:00<1:11:52, 10.92s/it]
23%|█████████████████▊ | 115/509 [18:11<1:11:24, 10.87s/it]
{'loss': 4.2335, 'learning_rate': 6.72e-06, 'epoch': 0.23}
23%|██████████████████ | 116/509 [18:22<1:11:01, 10.84s/it]
{'loss': 4.3098, 'learning_rate': 6.78e-06, 'epoch': 0.23}
23%|██████████████████▏ | 117/509 [18:33<1:10:20, 10.77s/it]
23%|██████████████████▎ | 118/509 [18:43<1:09:48, 10.71s/it]
23%|██████████████████▍ | 119/509 [18:54<1:09:16, 10.66s/it]
{'loss': 4.2016, 'learning_rate': 6.96e-06, 'epoch': 0.23}
24%|██████████████████▌ | 120/509 [19:04<1:08:32, 10.57s/it]
{'loss': 4.205, 'learning_rate': 7.0200000000000006e-06, 'epoch': 0.24}
24%|██████████████████▊ | 121/509 [19:14<1:07:58, 10.51s/it]
24%|██████████████████▉ | 122/509 [19:25<1:07:20, 10.44s/it]
24%|███████████████████ | 123/509 [19:35<1:06:36, 10.35s/it]
24%|███████████████████▏ | 124/509 [19:45<1:06:00, 10.29s/it]
25%|███████████████████▍ | 125/509 [19:56<1:06:18, 10.36s/it]
25%|███████████████████▌ | 126/509 [20:06<1:05:33, 10.27s/it]
25%|███████████████████▋ | 127/509 [20:15<1:04:39, 10.16s/it]
25%|███████████████████▊ | 128/509 [20:25<1:03:49, 10.05s/it]
{'loss': 4.2016, 'learning_rate': 7.5e-06, 'epoch': 0.25}
25%|████████████████████ | 129/509 [20:35<1:02:47, 9.91s/it]
{'loss': 4.297, 'learning_rate': 7.62e-06, 'epoch': 0.26}
26%|████████████████████▎ | 131/509 [20:54<1:01:08, 9.71s/it]
{'loss': 4.187, 'learning_rate': 7.680000000000001e-06, 'epoch': 0.26}
26%|████████████████████▍ | 132/509 [21:03<1:00:29, 9.63s/it]
26%|█████████████████████▏ | 133/509 [21:12<59:33, 9.50s/it]
{'loss': 4.2263, 'learning_rate': 7.8e-06, 'epoch': 0.26}
26%|█████████████████████▎ | 134/509 [21:22<58:30, 9.36s/it]
27%|█████████████████████▍ | 135/509 [21:30<57:23, 9.21s/it]
{'loss': 4.2737, 'learning_rate': 7.92e-06, 'epoch': 0.26}
27%|█████████████████████▋ | 136/509 [21:39<56:22, 9.07s/it]
{'loss': 4.3099, 'learning_rate': 7.98e-06, 'epoch': 0.27}
27%|█████████████████████▊ | 137/509 [21:47<54:57, 8.86s/it]
{'loss': 4.3116, 'learning_rate': 8.040000000000001e-06, 'epoch': 0.27}
27%|█████████████████████▉ | 138/509 [21:56<53:32, 8.66s/it]
27%|██████████████████████ | 139/509 [22:04<52:07, 8.45s/it]
28%|██████████████████████▎ | 140/509 [22:11<50:28, 8.21s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:17,521 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:19:17,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:19:17,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:19:17,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:11:58,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:25<46:01, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:19:25,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:25<46:01, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:19:25,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:25<46:01, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:19:25,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:25<46:01, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:19:25,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▊ | 143/509 [22:31<43:29, 7.13s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:36,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2936, 'learning_rate': 8.459999999999999e-06, 'epoch': 0.28}
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:40,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
28%|███████████████████████ | 145/509 [22:42<37:44, 6.22s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:43,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:45,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:47,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:49,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:51,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:53,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:56,371 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:57,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:19:59,462 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:20:05,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|████████████████████████ | 151/509 [23:13<38:02, 6.38s/it]
30%|████████████████████████▏ | 152/509 [23:24<47:26, 7.97s/it]
{'loss': 4.1689, 'learning_rate': 8.939999999999999e-06, 'epoch': 0.3}
30%|████████████████████████▎ | 153/509 [23:36<53:38, 9.04s/it]
{'loss': 4.1502, 'learning_rate': 9e-06, 'epoch': 0.3}
30%|████████████████████████▌ | 154/509 [23:47<57:40, 9.75s/it]
{'loss': 4.2982, 'learning_rate': 9.06e-06, 'epoch': 0.3}
30%|████████████████████████ | 155/509 [23:59<1:00:28, 10.25s/it]
{'loss': 4.1447, 'learning_rate': 9.12e-06, 'epoch': 0.3}
31%|████████████████████████▏ | 156/509 [24:10<1:02:14, 10.58s/it]
31%|████████████████████████▎ | 157/509 [24:21<1:03:19, 10.79s/it]
{'loss': 4.2283, 'learning_rate': 9.24e-06, 'epoch': 0.31}
31%|████████████████████████▌ | 158/509 [24:32<1:03:36, 10.87s/it]
{'loss': 4.2782, 'learning_rate': 9.3e-06, 'epoch': 0.31}
31%|████████████████████████▋ | 159/509 [24:43<1:03:56, 10.96s/it]
31%|████████████████████████▊ | 160/509 [24:55<1:04:00, 11.01s/it]
{'loss': 4.3297, 'learning_rate': 9.42e-06, 'epoch': 0.31}
32%|████████████████████████▉ | 161/509 [25:06<1:03:43, 10.99s/it]
32%|█████████████████████████▏ | 162/509 [25:16<1:03:19, 10.95s/it]
32%|█████████████████████████▎ | 163/509 [25:27<1:02:59, 10.92s/it]
{'loss': 4.2805, 'learning_rate': 9.600000000000001e-06, 'epoch': 0.32}
32%|█████████████████████████▍ | 164/509 [25:38<1:02:24, 10.85s/it]
{'loss': 4.2145, 'learning_rate': 9.66e-06, 'epoch': 0.32}
32%|█████████████████████████▌ | 165/509 [25:49<1:02:06, 10.83s/it]
33%|█████████████████████████▊ | 166/509 [26:00<1:01:52, 10.82s/it]
33%|█████████████████████████▉ | 167/509 [26:10<1:01:27, 10.78s/it]
{'loss': 4.1685, 'learning_rate': 9.84e-06, 'epoch': 0.33}
33%|██████████████████████████ | 168/509 [26:21<1:00:52, 10.71s/it]
{'loss': 4.2808, 'learning_rate': 9.9e-06, 'epoch': 0.33}
33%|██████████████████████████▏ | 169/509 [26:31<1:00:26, 10.67s/it]
{'loss': 4.2539, 'learning_rate': 9.960000000000001e-06, 'epoch': 0.33}
33%|██████████████████████████▏ | 169/509 [26:31<1:00:26, 10.67s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▏ | 169/509 [26:31<1:00:26, 10.67s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▏ | 169/509 [26:31<1:00:26, 10.67s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▏ | 169/509 [26:31<1:00:26, 10.67s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████ | 170/509 [26:42<59:47, 10.58s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████ | 170/509 [26:42<59:47, 10.58s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████ | 170/509 [26:42<59:47, 10.58s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████ | 170/509 [26:42<59:47, 10.58s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▏ | 171/509 [26:52<59:12, 10.51s/it]g-point operations will not be computed-01 15:19:31,982 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed
34%|███████████████████████████▏ | 171/509 [26:52<59:12, 10.51s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2015, 'learning_rate': 1.008e-05, 'epoch': 0.34}
34%|███████████████████████████▎ | 172/509 [27:02<58:40, 10.45s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1342, 'learning_rate': 1.0140000000000001e-05, 'epoch': 0.34}
34%|███████████████████████████▌ | 173/509 [27:12<57:56, 10.35s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2466, 'learning_rate': 1.0260000000000002e-05, 'epoch': 0.34}
34%|███████████████████████████▊ | 175/509 [27:33<57:32, 10.34s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
35%|████████████████████████████▏ | 177/509 [27:53<55:40, 10.06s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
35%|████████████████████████████▎ | 178/509 [28:02<55:00, 9.97s/it]
Could not estimate the number of tokens of the input,
floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:25:08,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
35%|████████████████████████████▍ | 179/509 [28:12<54:28, 9.90s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2838, 'learning_rate': 1.062e-05, 'epoch': 0.35}
Could not estimate the number of tokens of
the input, floating-point operations will not be computed
36%|████████████████████████████▊ | 181/509 [28:31<52:51, 9.67s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
36%|████████████████████████████▉ | 182/509 [28:40<52:05, 9.56s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2943, 'learning_rate': 1.074e-05, 'epoch': 0.36}
36%|█████████████████████████████ | 183/509 [28:49<51:02, 9.39s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1471, 'learning_rate': 1.08e-05, 'epoch': 0.36}
36%|█████████████████████████████▎ | 184/509 [28:58<50:01, 9.23s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2257, 'learning_rate': 1.086e-05, 'epoch': 0.36}
36%|█████████████████████████████▍ | 185/509 [29:07<49:04, 9.09s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3327, 'learning_rate': 1.092e-05, 'epoch': 0.36}
{'loss': 4.1578, 'learning_rate': 1.098e-05, 'epoch': 0.36}
[WARNING|modeling_utils.py:388] 2022-03-01 15:26:14,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed
37%|█████████████████████████████▊ | 187/509 [29:24<47:02, 8.77s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
37%|█████████████████████████████▉ | 188/509 [29:32<45:57, 8.59s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
37%|██████████████████████████████ | 189/509 [29:40<44:41, 8.38s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
37%|██████████████████████████████▏ | 190/509 [29:48<43:08, 8.11s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:26:51,931 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|██████████████████████████████▍ | 191/509 [29:54<41:09, 7.77s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|██████████████████████████████▌ | 192/509 [30:01<39:14, 7.43s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:01,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|██████████████████████████████▋ | 193/509 [30:07<37:05, 7.04s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:12,050 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:15,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|███████████████████████████████ | 195/509 [30:18<32:19, 6.18s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:19,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:21,762 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:23,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:25,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:27,654 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:30,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:32,378 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:33,760 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:35,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:27:41,716 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed
39%|███████████████████████████████▉ | 201/509 [30:49<32:46, 6.38s/it]
{'loss': 4.179, 'learning_rate': 1.1940000000000001e-05, 'epoch': 0.4}
Could not
estimate the number of tokens of the input, floating-point operations will not be computed
40%|████████████████████████████████▎ | 203/509 [31:12<46:12, 9.06s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1769, 'learning_rate': 1.2e-05, 'epoch': 0.4}
40%|████████████████████████████████▍ | 204/509 [31:23<49:45, 9.79s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
40%|████████████████████████████████▌ | 205/509 [31:35<51:59, 10.26s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2475, 'learning_rate': 1.2120000000000001e-05, 'epoch': 0.4}
40%|████████████████████████████████▊ | 206/509 [31:46<53:28, 10.59s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
41%|████████████████████████████████▉ | 207/509 [31:57<54:16, 10.78s/it]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0892, 'learning_rate': 1.224e-05, 'epoch': 0.41}
41%|█████████████████████████████████ | 208/509 [32:09<54:46, 10.92s/it]
Could not
estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1223, 'learning_rate': 1.2299999999999999e-05, 'epoch': 0.41} 41%|█████████████████████████████████ | 208/509 [32:09<54:46, 10.92s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████ | 208/509 [32:09<54:46, 10.92s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████ | 208/509 [32:09<54:46, 10.92s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:20<54:49, 10.97s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:20<54:49, 10.97s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:20<54:49, 10.97s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:20<54:49, 10.97s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:20<54:49, 10.97s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 41%|█████████████████████████████████▍ | 210/509 [32:31<54:44, 10.98s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▍ | 210/509 [32:31<54:44, 10.98s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▍ | 210/509 [32:31<54:44, 10.98s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▍ | 210/509 [32:31<54:44, 10.98s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1797, 'learning_rate': 1.2479999999999999e-05, 'epoch': 0.41} 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1706, 'learning_rate': 1.254e-05, 'epoch': 0.42} 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:42<54:23, 10.95s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:03<53:32, 10.85s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:03<53:32, 10.85s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:03<53:32, 
10.85s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:03<53:32, 10.85s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:14<53:05, 10.80s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:14<53:05, 10.80s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2172, 'learning_rate': 1.2659999999999999e-05, 'epoch': 0.42} 42%|██████████████████████████████████ | 214/509 [33:14<53:05, 10.80s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:14<53:05, 10.80s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 215/509 [33:24<52:40, 10.75s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 215/509 [33:24<52:40, 10.75s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.16, 'learning_rate': 1.272e-05, 'epoch': 0.42} 42%|██████████████████████████████████▏ | 215/509 [33:24<52:40, 
10.75s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 215/509 [33:24<52:40, 10.75s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1358, 'learning_rate': 1.278e-05, 'epoch': 0.42} g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1766, 'learning_rate': 1.284e-05, 'epoch': 0.43} g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be 
computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [33:56<51:20, 10.58s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [33:56<51:20, 10.58s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2376, 'learning_rate': 1.29e-05, 'epoch': 0.43} 43%|██████████████████████████████████▋ | 218/509 [33:56<51:20, 10.58s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [33:56<51:20, 10.58s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [33:56<51:20, 10.58s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:06<50:47, 10.51s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:06<50:47, 10.51s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:06<50:47, 10.51s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:06<50:47, 10.51s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:17<50:18, 10.45s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:17<50:18, 10.45s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3389, 'learning_rate': 1.302e-05, 'epoch': 0.43} 43%|███████████████████████████████████ | 220/509 [34:17<50:18, 10.45s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:17<50:18, 10.45s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:27<49:48, 10.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:27<49:48, 10.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1918, 'learning_rate': 1.308e-05, 'epoch': 0.43} 43%|███████████████████████████████████▏ | 221/509 [34:27<49:48, 10.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:27<49:48, 10.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:27<49:48, 10.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:37<49:09, 10.28s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:37<49:09, 10.28s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:37<49:09, 10.28s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:37<49:09, 10.28s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:47<48:49, 10.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:47<48:49, 10.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2459, 'learning_rate': 1.32e-05, 'epoch': 0.44} 
44%|███████████████████████████████████▍ | 223/509 [34:47<48:49, 10.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:47<48:49, 10.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [34:57<48:18, 10.17s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [34:57<48:18, 10.17s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1807, 'learning_rate': 1.326e-05, 'epoch': 0.44} 44%|███████████████████████████████████▋ | 224/509 [34:57<48:18, 10.17s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [34:57<48:18, 10.17s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:07<48:26, 10.23s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:07<48:26, 10.23s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2959, 'learning_rate': 1.3320000000000001e-05, 
'epoch': 0.44} 44%|███████████████████████████████████▊ | 225/509 [35:07<48:26, 10.23s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:07<48:26, 10.23s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:17<47:54, 10.16s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:17<47:54, 10.16s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1672, 'learning_rate': 1.338e-05, 'epoch': 0.44} 44%|███████████████████████████████████▉ | 226/509 [35:17<47:54, 10.16s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:17<47:54, 10.16s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 227/509 [35:27<47:09, 10.03s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 227/509 [35:27<47:09, 10.03s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2157, 'learning_rate': 1.344e-05, 
'epoch': 0.45} 45%|████████████████████████████████████ | 227/509 [35:27<47:09, 10.03s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:32:35,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:32:35,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0494, 'learning_rate': 1.3500000000000001e-05, 'epoch': 0.45} [WARNING|modeling_utils.py:388] 2022-03-01 15:32:35,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:32:35,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:32:35,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:46<45:50, 9.82s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ 
| 229/509 [35:46<45:50, 9.82s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2698, 'learning_rate': 1.356e-05, 'epoch': 0.45} 45%|████████████████████████████████████▍ | 229/509 [35:46<45:50, 9.82s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:46<45:50, 9.82s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:46<45:50, 9.82s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [35:56<45:23, 9.76s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [35:56<45:23, 9.76s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [35:56<45:23, 9.76s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [35:56<45:23, 9.76s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:05<44:46, 9.66s/it]g-point operations will 
not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:05<44:46, 9.66s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3458, 'learning_rate': 1.3680000000000001e-05, 'epoch': 0.45} 45%|████████████████████████████████████▊ | 231/509 [36:05<44:46, 9.66s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:05<44:46, 9.66s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:05<44:46, 9.66s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:15<44:10, 9.57s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:15<44:10, 9.57s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:15<44:10, 9.57s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:15<44:10, 9.57s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:15<44:10, 9.57s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 233/509 [36:24<43:46, 9.52s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 233/509 [36:24<43:46, 9.52s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 233/509 [36:24<43:46, 9.52s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 233/509 [36:24<43:46, 9.52s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 234/509 [36:33<43:00, 9.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 234/509 [36:33<43:00, 9.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0553, 'learning_rate': 1.3860000000000001e-05, 'epoch': 0.46} 46%|█████████████████████████████████████▏ | 234/509 [36:33<43:00, 9.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 234/509 [36:33<43:00, 9.38s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 235/509 [36:42<42:12, 9.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 235/509 [36:42<42:12, 9.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2966, 'learning_rate': 1.392e-05, 'epoch': 0.46} 46%|█████████████████████████████████████▍ | 235/509 [36:42<42:12, 9.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 235/509 [36:42<42:12, 9.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 235/509 [36:42<42:12, 9.24s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 236/509 [36:51<41:25, 9.10s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 236/509 [36:51<41:25, 9.10s/it]g-point operations will not be computed-01 15:27:07,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:33:56,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▋ | 237/509 [36:59<40:31, 8.94s/it]
{'loss': 4.2394, 'learning_rate': 1.4040000000000001e-05, 'epoch': 0.46}
47%|█████████████████████████████████████▊ | 238/509 [37:08<39:22, 8.72s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:12,904 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|██████████████████████████████████████ | 239/509 [37:16<38:17, 8.51s/it]
47%|██████████████████████████████████████▏ | 240/509 [37:23<36:56, 8.24s/it]
47%|██████████████████████████████████████▎ | 241/509 [37:31<35:37, 7.97s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:33,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▌ | 242/509 [37:37<33:57, 7.63s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:41,458 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▋ | 243/509 [37:44<32:10, 7.26s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:47,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▊ | 244/509 [37:50<30:18, 6.86s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:51,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▉ | 245/509 [37:55<28:20, 6.44s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:34:58,136 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|███████████████████████████████████████▏ | 246/509 [38:00<26:08, 5.96s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:01,532 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:03,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:05,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:07,270 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:08,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:11,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:13,561 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 15:35:19,837 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▉ | 251/509 [38:27<27:46, 6.46s/it]
50%|████████████████████████████████████████ | 252/509 [38:38<34:25, 8.04s/it]
50%|████████████████████████████████████████▎ | 253/509 [38:50<38:44, 9.08s/it]
{'loss': 4.2892, 'learning_rate': 1.5e-05, 'epoch': 0.5}
50%|████████████████████████████████████████▍ | 254/509 [39:01<41:32, 9.77s/it]
{'loss': 4.1895, 'learning_rate': 1.506e-05, 'epoch': 0.5}
50%|████████████████████████████████████████▌ | 255/509 [39:13<43:29, 10.28s/it]
50%|████████████████████████████████████████▋ | 256/509 [39:24<44:34, 10.57s/it]
{'loss': 4.2414, 'learning_rate': 1.518e-05, 'epoch': 0.5}
50%|████████████████████████████████████████▉ | 257/509 [39:35<45:09, 10.75s/it]
{'loss': 4.1619, 'learning_rate': 1.524e-05, 'epoch': 0.5}
51%|█████████████████████████████████████████ | 258/509 [39:46<45:37, 10.91s/it]
{'loss': 4.2641, 'learning_rate': 1.53e-05, 'epoch': 0.51}
51%|█████████████████████████████████████████▏ | 259/509 [39:57<45:35, 10.94s/it]
51%|█████████████████████████████████████████▍ | 260/509 [40:08<45:29, 10.96s/it]
{'loss': 4.2044, 'learning_rate': 1.542e-05, 'epoch': 0.51}
{'loss': 4.1339, 'learning_rate': 1.548e-05, 'epoch': 0.51}
51%|█████████████████████████████████████████▋ | 262/509 [40:30<44:48, 10.88s/it]
52%|█████████████████████████████████████████▊ | 263/509 [40:41<44:22, 10.82s/it]
52%|██████████████████████████████████████████ | 264/509 [40:51<44:09, 10.82s/it]
{'loss': 4.1346, 'learning_rate': 1.5660000000000003e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▏ | 265/509 [41:02<43:41, 10.75s/it]
{'loss': 4.2563, 'learning_rate': 1.5720000000000002e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▎ | 266/509 [41:13<43:18, 10.69s/it]
{'loss': 4.2049, 'learning_rate': 1.578e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▍ | 267/509 [41:23<43:00, 10.66s/it]
53%|██████████████████████████████████████████▋ | 268/509 [41:34<42:48, 10.66s/it]
{'loss': 4.1505, 'learning_rate': 1.59e-05, 'epoch': 0.53}
53%|██████████████████████████████████████████▊ | 269/509 [41:44<42:27, 10.61s/it]
{'loss': 4.1773, 'learning_rate': 1.596e-05, 'epoch': 0.53}
53%|██████████████████████████████████████████▉ | 270/509 [41:55<42:01, 10.55s/it]
53%|███████████████████████████████████████████▏ | 271/509 [42:05<41:31, 10.47s/it]
53%|███████████████████████████████████████████▎ | 272/509 [42:15<40:57, 10.37s/it]g-point operations will not be computed-01
15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 272/509 [42:15<40:57, 10.37s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 272/509 [42:15<40:57, 10.37s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 272/509 [42:15<40:57, 10.37s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 272/509 [42:15<40:57, 10.37s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:25<40:29, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:25<40:29, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:25<40:29, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:25<40:29, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:25<40:29, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:35<40:04, 10.23s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:35<40:04, 10.23s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:35<40:04, 10.23s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:35<40:04, 10.23s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1935, 'learning_rate': 1.6320000000000003e-05, 'epoch': 0.54} 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1559, 'learning_rate': 1.6380000000000002e-05, 'epoch': 0.54} 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:46<40:10, 10.30s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:06<38:55, 10.07s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:06<38:55, 10.07s/it]g-point operations will not be computed-01 15:34:55,753 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3968, 'learning_rate': 1.6440000000000002e-05, 'epoch': 0.54} 54%|████████████████████████████████████████████ | 277/509 [43:06<38:55, 10.07s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:06<38:55, 10.07s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 278/509 [43:15<38:15, 9.94s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 278/509 [43:15<38:15, 9.94s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1877, 'learning_rate': 1.65e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▏ | 278/509 [43:15<38:15, 9.94s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 278/509 [43:15<38:15, 9.94s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 278/509 [43:15<38:15, 9.94s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 
[43:25<37:37, 9.82s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:25<37:37, 9.82s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:25<37:37, 9.82s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:25<37:37, 9.82s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:25<37:37, 9.82s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:34<37:13, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:34<37:13, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:34<37:13, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:34<37:13, 9.75s/it]g-point operations will not be 
computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:34<37:13, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:44<36:48, 9.68s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:44<36:48, 9.68s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:44<36:48, 9.68s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:44<36:48, 9.68s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [43:53<36:16, 9.59s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [43:53<36:16, 9.59s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0727, 'learning_rate': 1.6740000000000002e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▉ | 282/509 [43:53<36:16, 9.59s/it]g-point 
operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [43:53<36:16, 9.59s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [43:53<36:16, 9.59s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:02<35:45, 9.49s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:02<35:45, 9.49s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:02<35:45, 9.49s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:02<35:45, 9.49s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:11<35:00, 9.34s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:11<35:00, 9.34s/it]g-point operations will not be computed-01 15:34:55,753 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2479, 'learning_rate': 1.686e-05, 'epoch': 0.56} 56%|█████████████████████████████████████████████▏ | 284/509 [44:11<35:00, 9.34s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:11<35:00, 9.34s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:11<35:00, 9.34s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:20<34:23, 9.21s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:20<34:23, 9.21s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:20<34:23, 9.21s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:20<34:23, 9.21s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:20<34:23, 9.21s/it]g-point operations will not be computed-01 
15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:29<33:39, 9.06s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:29<33:39, 9.06s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:29<33:39, 9.06s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:29<33:39, 9.06s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:38<32:58, 8.91s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:38<32:58, 8.91s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2078, 'learning_rate': 1.704e-05, 'epoch': 0.56} 56%|█████████████████████████████████████████████▋ | 287/509 [44:38<32:58, 8.91s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:38<32:58, 8.91s/it]g-point operations will not be 
computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:46<32:07, 8.72s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:46<32:07, 8.72s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1439, 'learning_rate': 1.71e-05, 'epoch': 0.56} 57%|█████████████████████████████████████████████▊ | 288/509 [44:46<32:07, 8.72s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:46<32:07, 8.72s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:46<32:07, 8.72s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [44:54<30:59, 8.45s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [44:54<30:59, 8.45s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [44:54<30:59, 8.45s/it]g-point operations 
will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [44:54<30:59, 8.45s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [44:54<30:59, 8.45s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 290/509 [45:01<29:39, 8.12s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:03,839 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:07,242 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:07,242 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1491, 'learning_rate': 1.728e-05, 'epoch': 0.57} [WARNING|modeling_utils.py:388] 2022-03-01 15:42:07,242 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:13,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:13,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3486, 'learning_rate': 1.734e-05, 'epoch': 0.57} [WARNING|modeling_utils.py:388] 2022-03-01 15:42:13,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:19,778 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:19,778 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4428, 'learning_rate': 1.74e-05, 'epoch': 0.57} [WARNING|modeling_utils.py:388] 2022-03-01 15:42:23,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:23,853 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▊ | 294/509 [45:26<23:15, 6.49s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:27,603 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:29,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:29,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:32,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:34,075 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:34,075 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:36,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:37,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:37,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:39,709 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:39,709 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:41,338 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:44,272 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:44,272 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:45,634 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:45,634 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:47,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:47,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:53,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:42:53,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
59%|███████████████████████████████████████████████▉ | 301/509 [46:00<21:35, 6.23s/it]
59%|████████████████████████████████████████████████ | 302/509 [46:12<27:03, 7.84s/it]
{'loss': 4.1743, 'learning_rate': 1.794e-05, 'epoch': 0.59}
60%|████████████████████████████████████████████████▏ | 303/509 [46:23<30:44, 8.95s/it]
{'loss': 4.2248, 'learning_rate': 1.8e-05, 'epoch': 0.59}
{'loss': 4.0996, 'learning_rate': 1.806e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▌ | 305/509 [46:46<34:32, 10.16s/it]
{'loss': 4.1084, 'learning_rate': 1.812e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▋ | 306/509 [46:57<35:31, 10.50s/it]
{'loss': 4.1411, 'learning_rate': 1.818e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▊ | 307/509 [47:09<36:09, 10.74s/it]
{'loss': 4.1641, 'learning_rate': 1.824e-05, 'epoch': 0.6}
61%|█████████████████████████████████████████████████ | 308/509 [47:20<36:21, 10.86s/it]
{'loss': 4.1869, 'learning_rate': 1.83e-05, 'epoch': 0.6}
61%|█████████████████████████████████████████████████▏ | 309/509 [47:31<36:26, 10.93s/it]
{'loss': 4.136, 'learning_rate': 1.836e-05, 'epoch': 0.61}
61%|█████████████████████████████████████████████████▎ | 310/509 [47:42<36:15, 10.93s/it]
61%|█████████████████████████████████████████████████▍ | 311/509 [47:53<36:05, 10.93s/it]
61%|█████████████████████████████████████████████████▋ | 312/509 [48:04<35:47, 10.90s/it]
61%|█████████████████████████████████████████████████▊ | 313/509 [48:14<35:32, 10.88s/it]
62%|█████████████████████████████████████████████████▉ | 314/509 [48:25<35:14, 10.84s/it]
62%|██████████████████████████████████████████████████▏ | 315/509 [48:36<35:00, 10.83s/it]
62%|██████████████████████████████████████████████████▎ | 316/509 [48:46<34:34, 10.75s/it]
{'loss': 4.0649, 'learning_rate': 1.878e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▍ | 317/509 [48:57<34:14, 10.70s/it]
62%|██████████████████████████████████████████████████▌ | 318/509 [49:08<33:48, 10.62s/it]
63%|██████████████████████████████████████████████████▊ | 319/509 [49:18<33:24, 10.55s/it]
63%|██████████████████████████████████████████████████▉ | 320/509 [49:28<33:01, 10.48s/it]
63%|███████████████████████████████████████████████████ | 321/509 [49:39<32:43, 10.44s/it]
63%|███████████████████████████████████████████████████▏ | 322/509 [49:49<32:27, 10.41s/it]
63%|███████████████████████████████████████████████████▍ | 323/509 [49:59<31:56, 10.30s/it]
{'loss': 4.0638, 'learning_rate': 1.9200000000000003e-05, 'epoch': 0.63}
64%|███████████████████████████████████████████████████▌ | 324/509 [50:09<31:38, 10.26s/it]
64%|███████████████████████████████████████████████████▋ | 325/509 [50:20<31:44, 10.35s/it]
{'loss': 4.2083, 'learning_rate': 1.932e-05, 'epoch': 0.64}
64%|███████████████████████████████████████████████████▉ | 326/509 [50:30<31:14, 10.24s/it]
{'loss': 4.1813, 'learning_rate': 1.938e-05, 'epoch': 0.64}
64%|████████████████████████████████████████████████████ | 327/509 [50:40<30:47, 10.15s/it]
64%|████████████████████████████████████████████████████▏ | 328/509 [50:49<30:15, 10.03s/it]
{'loss': 4.2536, 'learning_rate': 1.95e-05, 'epoch': 0.64}
65%|████████████████████████████████████████████████████▎ | 329/509 [50:59<29:49, 9.94s/it]
{'loss': 4.0342, 'learning_rate': 1.9560000000000002e-05, 'epoch': 0.65}
65%|████████████████████████████████████████████████████▌ | 330/509 [51:09<29:23, 9.85s/it]
{'loss': 4.3253, 'learning_rate': 1.9620000000000002e-05, 'epoch': 0.65}
65%|████████████████████████████████████████████████████▋ | 331/509 [51:18<28:54, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:18<28:54, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:18<28:54, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:18<28:54, 9.75s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9813, 'learning_rate': 1.98e-05, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:28<28:27, 9.65s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [51:46<27:24, 9.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [51:46<27:24, 9.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [51:46<27:24, 
9.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [51:46<27:24, 9.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [51:55<26:55, 9.29s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [51:55<26:55, 9.29s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3166, 'learning_rate': 1.9920000000000002e-05, 'epoch': 0.66} 66%|█████████████████████████████████████████████████████▎ | 335/509 [51:55<26:55, 9.29s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [51:55<26:55, 9.29s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [51:55<26:55, 9.29s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:04<26:27, 9.18s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:04<26:27, 9.18s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:04<26:27, 9.18s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:04<26:27, 9.18s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:04<26:27, 9.18s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:13<25:49, 9.01s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:13<25:49, 9.01s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:13<25:49, 9.01s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:13<25:49, 9.01s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:13<25:49, 9.01s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:21<25:12, 8.85s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:21<25:12, 8.85s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:21<25:12, 8.85s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:21<25:12, 8.85s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:21<25:12, 8.85s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:29<24:28, 8.64s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:29<24:28, 8.64s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:29<24:28, 8.64s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:29<24:28, 8.64s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:29<24:28, 8.64s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:37<23:38, 8.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:37<23:38, 8.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:37<23:38, 8.39s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:43,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:43,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2514, 'learning_rate': 2.0280000000000002e-05, 'epoch': 0.67} [WARNING|modeling_utils.py:388] 2022-03-01 15:49:43,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:43,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:43,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▍ | 342/509 [52:51<21:31, 7.73s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:53,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:53,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:49:53,676 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 343/509 [52:58<20:12, 7.31s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 343/509 [52:58<20:12, 7.31s/it]g-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:01,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:01,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:34:55,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 344/509 [53:03<18:50, 6.85s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 344/509 [53:03<18:50, 6.85s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:07,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:50:07,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2776, 'learning_rate': 2.0520000000000003e-05, 'epoch': 0.68} [WARNING|modeling_utils.py:388] 2022-03-01 15:50:11,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:11,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:03,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 346/509 [53:13<15:51, 5.84s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:13,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:15,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:13,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:15,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:13,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▏ | 347/509 [53:17<14:23, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:17,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 15:50:19,277 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:17,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:19,277 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:17,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▍ | 348/509 [53:21<12:57, 4.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:21,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 349/509 [53:24<11:29, 4.31s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:23,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 349/509 [53:24<11:29, 4.31s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:23,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:25,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:23,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:25,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:50:23,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 350/509 [53:27<10:28, 3.95s/it]g-point operations will 
not be computed-01 15:50:23,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 350/509 [53:27<10:28, 3.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 350/509 [53:27<10:28, 3.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 350/509 [53:27<10:28, 3.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 350/509 [53:27<10:28, 3.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 351/509 [53:40<17:07, 6.50s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 351/509 [53:40<17:07, 6.50s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 351/509 [53:40<17:07, 6.50s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 351/509 [53:40<17:07, 
6.50s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████ | 352/509 [53:51<21:05, 8.06s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████ | 352/509 [53:51<21:05, 8.06s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1917, 'learning_rate': 2.094e-05, 'epoch': 0.69} 69%|████████████████████████████████████████████████████████ | 352/509 [53:51<21:05, 8.06s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████ | 352/509 [53:51<21:05, 8.06s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▏ | 353/509 [54:03<23:41, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▏ | 353/509 [54:03<23:41, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1645, 'learning_rate': 2.1e-05, 'epoch': 0.69} 69%|████████████████████████████████████████████████████████▏ | 353/509 [54:03<23:41, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▏ | 353/509 [54:03<23:41, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▏ | 353/509 [54:03<23:41, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▎ | 354/509 [54:14<25:21, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▎ | 354/509 [54:14<25:21, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▎ | 354/509 [54:14<25:21, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▎ | 354/509 [54:14<25:21, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▍ | 355/509 [54:26<26:25, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▍ | 355/509 [54:26<26:25, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 
15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1916, 'learning_rate': 2.1119999999999998e-05, 'epoch': 0.7} 70%|████████████████████████████████████████████████████████▍ | 355/509 [54:26<26:25, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▍ | 355/509 [54:26<26:25, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▍ | 355/509 [54:26<26:25, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▋ | 356/509 [54:37<27:02, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▋ | 356/509 [54:37<27:02, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▋ | 356/509 [54:37<27:02, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▋ | 356/509 [54:37<27:02, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
70%|████████████████████████████████████████████████████████▊ | 357/509 [54:48<27:19, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▊ | 357/509 [54:48<27:19, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1109, 'learning_rate': 2.124e-05, 'epoch': 0.7} 70%|████████████████████████████████████████████████████████▊ | 357/509 [54:48<27:19, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▊ | 357/509 [54:48<27:19, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▊ | 357/509 [54:48<27:19, 10.79s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▉ | 358/509 [55:00<27:26, 10.90s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▉ | 358/509 [55:00<27:26, 10.90s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▉ | 358/509 [55:00<27:26, 10.90s/it][WARNING|modeling_utils.py:388] 2022-03-01 
15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▉ | 358/509 [55:00<27:26, 10.90s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▏ | 359/509 [55:11<27:22, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▏ | 359/509 [55:11<27:22, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1315, 'learning_rate': 2.136e-05, 'epoch': 0.7} 71%|█████████████████████████████████████████████████████████▏ | 359/509 [55:11<27:22, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▏ | 359/509 [55:11<27:22, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:22<27:16, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:22<27:16, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2767, 
'learning_rate': 2.1419999999999998e-05, 'epoch': 0.71} 71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:22<27:16, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:22<27:16, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:22<27:16, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:33<27:02, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:33<27:02, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:33<27:02, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:33<27:02, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:33<27:02, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 
15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:43<26:49, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:43<26:49, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:43<26:49, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:43<26:49, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:43<26:49, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 363/509 [55:54<26:33, 10.91s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 363/509 [55:54<26:33, 10.91s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 363/509 
[55:54<26:33, 10.91s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 363/509 [55:54<26:33, 10.91s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 363/509 [55:54<26:33, 10.91s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:05<26:10, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:05<26:10, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:05<26:10, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:05<26:10, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:05<26:10, 10.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
72%|██████████████████████████████████████████████████████████ | 365/509 [56:16<25:52, 10.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:16<25:52, 10.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:16<25:52, 10.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:16<25:52, 10.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:16<25:52, 10.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 366/509 [56:26<25:33, 10.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 366/509 [56:26<25:33, 10.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 366/509 [56:26<25:33, 10.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 366/509 [56:26<25:33, 10.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 366/509 [56:26<25:33, 10.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:37<25:15, 10.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:37<25:15, 10.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:37<25:15, 10.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:37<25:15, 10.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:37<25:15, 10.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [56:47<25:02, 10.66s/it][WARNING|modeling_utils.py:388] 
2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [56:47<25:02, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [56:47<25:02, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [56:47<25:02, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [56:47<25:02, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 
369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1969, 'learning_rate': 2.202e-05, 'epoch': 0.73} 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [56:58<24:44, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:19<24:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:19<24:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed {'loss': 4.0807, 'learning_rate': 2.208e-05, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████ | 371/509 [57:19<24:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:19<24:17, 10.56s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:29<23:48, 10.43s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:29<23:48, 10.43s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9431, 'learning_rate': 2.214e-05, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:29<23:48, 10.43s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:29<23:48, 10.43s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:39<23:31, 10.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:39<23:31, 10.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2103, 'learning_rate': 2.22e-05, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:39<23:31, 10.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:39<23:31, 10.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [57:50<23:18, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [57:50<23:18, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0377, 'learning_rate': 2.226e-05, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▌ | 374/509 [57:50<23:18, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [57:50<23:18, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 
374/509 [57:50<23:18, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:00<23:21, 10.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:00<23:21, 10.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:00<23:21, 10.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:00<23:21, 10.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1532, 'learning_rate': 2.238e-05, 'epoch': 0.74} [WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:20<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:20<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1324, 'learning_rate': 2.2440000000000002e-05, 'epoch': 0.74} 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:20<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:20<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:30<21:56, 10.05s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:30<21:56, 10.05s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0863, 'learning_rate': 2.25e-05, 'epoch': 0.74} 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:30<21:56, 10.05s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:30<21:56, 10.05s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.069, 'learning_rate': 2.256e-05, 'epoch': 0.74} 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1447, 'learning_rate': 2.262e-05, 'epoch': 0.75} 
74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:40<21:33, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [58:59<20:51, 9.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [58:59<20:51, 9.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2234, 'learning_rate': 2.268e-05, 'epoch': 0.75} 75%|████████████████████████████████████████████████████████████▋ | 381/509 [58:59<20:51, 9.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [58:59<20:51, 9.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:08<20:32, 
9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:08<20:32, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.11, 'learning_rate': 2.274e-05, 'epoch': 0.75} 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:08<20:32, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:08<20:32, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:18<20:09, 9.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:18<20:09, 9.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2472, 'learning_rate': 2.2800000000000002e-05, 'epoch': 0.75} 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:18<20:09, 9.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:18<20:09, 9.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 
15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:27<19:43, 9.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:27<19:43, 9.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1121, 'learning_rate': 2.286e-05, 'epoch': 0.75} 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:27<19:43, 9.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:27<19:43, 9.47s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:36<19:16, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:36<19:16, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2897, 'learning_rate': 2.292e-05, 'epoch': 0.76} 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:36<19:16, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:36<19:16, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:36<19:16, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▍ | 386/509 [59:45<18:50, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▍ | 386/509 [59:45<18:50, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▍ | 386/509 [59:45<18:50, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:50:29,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1188, 'learning_rate': 2.304e-05, 'epoch': 0.76} 
[WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:56:52,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:02<17:57, 8.90s/it] 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:02<17:57, 8.90s/it] 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:02<17:57, 8.90s/it] 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:02<17:57, 
8.90s/it] 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:02<17:57, 8.90s/it] 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:10<17:22, 8.69s/it] 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:10<17:22, 8.69s/it] 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:10<17:22, 8.69s/it] 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:10<17:22, 8.69s/it] 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:10<17:22, 8.69s/it] 77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:18<16:47, 8.47s/it] 
77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:18<16:47, 8.47s/it] 77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:18<16:47, 8.47s/it] 77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:18<16:47, 8.47s/it] 77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:18<16:47, 8.47s/it] 77%|████████████████████████████████████████████████████████████▋ | 391/509 [1:00:26<16:06, 8.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 391/509 [1:00:26<16:06, 8.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 391/509 [1:00:26<16:06, 8.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 391/509 [1:00:26<16:06, 8.19s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▊ | 392/509 [1:00:33<15:22, 7.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▊ | 392/509 [1:00:33<15:22, 7.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:36,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▉ | 393/509 [1:00:39<14:25, 7.47s/it]g-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▉ | 393/509 [1:00:39<14:25, 7.47s/it]g-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2241, 'learning_rate': 2.3400000000000003e-05, 'epoch': 0.77} [WARNING|modeling_utils.py:388] 2022-03-01 15:57:43,023 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:43,023 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 394/509 [1:00:45<13:25, 7.01s/it]g-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:47,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:47,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:47,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:26,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▎ | 395/509 [1:00:51<12:23, 6.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▎ | 395/509 [1:00:51<12:23, 6.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:54,695 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-03-01 15:57:54,695 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:56,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:59,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:57:59,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:01,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:02,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:02,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:04,613 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:04,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:07,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:09,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:09,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4054, 'learning_rate': 2.3820000000000002e-05, 'epoch': 0.78} [WARNING|modeling_utils.py:388] 2022-03-01 15:58:15,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 15:58:15,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
79%|██████████████████████████████████████████████████████████████▏ | 401/509 [1:01:22<11:36, 6.44s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▏ | 401/509 [1:01:22<11:36, 6.44s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2727, 'learning_rate': 2.3880000000000002e-05, 'epoch': 0.79} 79%|██████████████████████████████████████████████████████████████▏ | 401/509 [1:01:22<11:36, 6.44s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▏ | 401/509 [1:01:22<11:36, 6.44s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:34<14:14, 7.99s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:34<14:14, 7.99s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1263, 'learning_rate': 2.394e-05, 'epoch': 0.79} 79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:34<14:14, 7.99s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:34<14:14, 7.99s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:34<14:14, 7.99s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▌ | 403/509 [1:01:45<16:00, 9.06s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▌ | 403/509 [1:01:45<16:00, 9.06s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▌ | 403/509 [1:01:45<16:00, 9.06s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▌ | 403/509 [1:01:45<16:00, 9.06s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▋ | 404/509 [1:01:57<17:04, 9.75s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▋ | 404/509 [1:01:57<17:04, 9.75s/it]g-point operations will not be computed-01 15:57:51,202 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0201, 'learning_rate': 2.4060000000000003e-05, 'epoch': 0.79} 79%|██████████████████████████████████████████████████████████████▋ | 404/509 [1:01:57<17:04, 9.75s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|██████████████████████████████████████████████████████████████▋ | 404/509 [1:01:57<17:04, 9.75s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|██████████████████████████████████████████████████████████████▊ | 405/509 [1:02:08<17:47, 10.26s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|██████████████████████████████████████████████████████████████▊ | 405/509 [1:02:08<17:47, 10.26s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1718, 'learning_rate': 2.4120000000000003e-05, 'epoch': 0.79} 80%|██████████████████████████████████████████████████████████████▊ | 405/509 [1:02:08<17:47, 10.26s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|██████████████████████████████████████████████████████████████▊ | 405/509 [1:02:08<17:47, 10.26s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 406/509 [1:02:19<18:11, 10.59s/it]g-point operations will not be computed-01 15:57:51,202 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 406/509 [1:02:19<18:11, 10.59s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1473, 'learning_rate': 2.4180000000000002e-05, 'epoch': 0.8} 80%|███████████████████████████████████████████████████████████████ | 406/509 [1:02:19<18:11, 10.59s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 406/509 [1:02:19<18:11, 10.59s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 406/509 [1:02:19<18:11, 10.59s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 408/509 [1:02:42<18:22, 10.92s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 408/509 [1:02:42<18:22, 10.92s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 408/509 [1:02:42<18:22, 10.92s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 408/509 [1:02:42<18:22, 10.92s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:02:53<18:19, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:02:53<18:19, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0903, 'learning_rate': 2.4360000000000004e-05, 'epoch': 0.8} 80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:02:53<18:19, 11.00s/it]g-point operations will not be 
computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:02:53<18:19, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:02:53<18:19, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:04<18:10, 11.02s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:04<18:10, 11.02s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:04<18:10, 11.02s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:04<18:10, 11.02s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:04<18:10, 11.02s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
81%|███████████████████████████████████████████████████████████████▊ | 411/509 [1:03:15<17:58, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▊ | 411/509 [1:03:15<17:58, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▊ | 411/509 [1:03:15<17:58, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▊ | 411/509 [1:03:15<17:58, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▉ | 412/509 [1:03:26<17:47, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▉ | 412/509 [1:03:26<17:47, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1518, 'learning_rate': 2.454e-05, 'epoch': 0.81} 81%|███████████████████████████████████████████████████████████████▉ | 412/509 [1:03:26<17:47, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▉ | 412/509 
[1:03:26<17:47, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|███████████████████████████████████████████████████████████████▉ | 412/509 [1:03:26<17:47, 11.00s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:37<17:29, 10.93s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:37<17:29, 10.93s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:37<17:29, 10.93s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:37<17:29, 10.93s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:37<17:29, 10.93s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████▎ | 414/509 [1:03:48<17:15, 10.90s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████▎ | 414/509 [1:03:48<17:15, 10.90s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████▎ | 414/509 [1:03:48<17:15, 10.90s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|████████████████████████████████████████████████████████████████▎ | 414/509 [1:03:48<17:15, 10.90s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▍ | 415/509 [1:03:58<16:58, 10.83s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▍ | 415/509 [1:03:58<16:58, 10.83s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2478, 'learning_rate': 2.472e-05, 'epoch': 0.81} 82%|████████████████████████████████████████████████████████████████▍ | 415/509 [1:03:58<16:58, 10.83s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▍ | 415/509 [1:03:58<16:58, 10.83s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
82%|████████████████████████████████████████████████████████████████▌ | 416/509 [1:04:09<16:41, 10.77s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▌ | 416/509 [1:04:09<16:41, 10.77s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1177, 'learning_rate': 2.478e-05, 'epoch': 0.82} 82%|████████████████████████████████████████████████████████████████▌ | 416/509 [1:04:09<16:41, 10.77s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▌ | 416/509 [1:04:09<16:41, 10.77s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▋ | 417/509 [1:04:20<16:26, 10.73s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|████████████████████████████████████████████████████████████████▋ | 417/509 [1:04:20<16:26, 10.73s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2212, 'learning_rate': 2.484e-05, 'epoch': 0.82} 82%|████████████████████████████████████████████████████████████████▋ | 417/509 [1:04:20<16:26, 10.73s/it]g-point operations will not be computed-01 15:57:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
82%|████████████████████████████████████████████████████████████████▋ | 417/509 [1:04:20<16:26, 10.73s/it]
82%|████████████████████████████████████████████████████████████████▉ | 418/509 [1:04:30<16:11, 10.68s/it]
82%|█████████████████████████████████████████████████████████████████ | 419/509 [1:04:41<15:53, 10.59s/it]
{'loss': 4.0685, 'learning_rate': 2.4959999999999998e-05, 'epoch': 0.82}
83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:04:51<15:38, 10.54s/it]
{'loss': 4.0582, 'learning_rate': 2.502e-05, 'epoch': 0.82}
83%|█████████████████████████████████████████████████████████████████▎ | 421/509 [1:05:01<15:24, 10.50s/it]
{'loss': 4.0466, 'learning_rate': 2.508e-05, 'epoch': 0.83}
83%|█████████████████████████████████████████████████████████████████▍ | 422/509 [1:05:12<15:08, 10.45s/it]
{'loss': 4.1161, 'learning_rate': 2.514e-05, 'epoch': 0.83}
83%|█████████████████████████████████████████████████████████████████▋ | 423/509 [1:05:22<14:53, 10.39s/it]
{'loss': 4.1698, 'learning_rate': 2.52e-05, 'epoch': 0.83}
83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:32<14:37, 10.32s/it]
{'loss': 3.9876, 'learning_rate': 2.526e-05, 'epoch': 0.83}
83%|█████████████████████████████████████████████████████████████████▉ | 425/509 [1:05:43<14:38, 10.46s/it]
{'loss': 4.1936, 'learning_rate': 2.5319999999999998e-05, 'epoch': 0.83}
84%|██████████████████████████████████████████████████████████████████ | 426/509 [1:05:53<14:20, 10.37s/it]
{'loss': 4.1012, 'learning_rate': 2.538e-05, 'epoch': 0.84}
{'loss': 4.1076, 'learning_rate': 2.544e-05, 'epoch': 0.84}
84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:13<13:37, 10.09s/it]
{'loss': 4.1327, 'learning_rate': 2.55e-05, 'epoch': 0.84}
84%|██████████████████████████████████████████████████████████████████▌ | 429/509 [1:06:22<13:14, 9.93s/it]
84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:32<12:57, 9.85s/it]
{'loss': 4.0903, 'learning_rate': 2.562e-05, 'epoch': 0.84}
85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:06:41<12:39, 9.74s/it]
85%|███████████████████████████████████████████████████████████████████ | 432/509 [1:06:51<12:22, 9.64s/it]
85%|███████████████████████████████████████████████████████████████████▏ | 433/509 [1:07:00<12:05, 9.54s/it]
{'loss': 4.1258, 'learning_rate': 2.58e-05, 'epoch': 0.85}
{'loss': 4.0542, 'learning_rate': 2.586e-05, 'epoch': 0.85}
[WARNING|modeling_utils.py:388] 2022-03-01 16:04:12,910 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1034, 'learning_rate': 2.592e-05, 'epoch': 0.85}
86%|███████████████████████████████████████████████████████████████████▋ | 436/509 [1:07:27<11:06, 9.13s/it]
{'loss': 4.0396, 'learning_rate': 2.5980000000000002e-05, 'epoch': 0.86}
86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:36<10:47, 8.99s/it]
{'loss': 4.209, 'learning_rate': 2.604e-05, 'epoch': 0.86}
86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:07:44<10:25, 8.81s/it]
{'loss': 4.177, 'learning_rate': 2.61e-05, 'epoch': 0.86}
86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:07:52<10:03, 8.62s/it]
{'loss': 4.1952, 'learning_rate': 2.616e-05, 'epoch': 0.86}
86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:00<09:36, 8.36s/it]
{'loss': 4.0754, 'learning_rate': 2.622e-05, 'epoch': 0.86}
87%|████████████████████████████████████████████████████████████████████▍ | 441/509 [1:08:07<09:10, 8.10s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:10,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
87%|████████████████████████████████████████████████████████████████████▌ | 442/509 [1:08:15<08:42, 7.81s/it]
{'loss': 4.0893, 'learning_rate': 2.6340000000000002e-05, 'epoch': 0.87}
87%|████████████████████████████████████████████████████████████████████▊ | 443/509 [1:08:21<08:15, 7.51s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:23,741 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
87%|████████████████████████████████████████████████████████████████████▉ | 444/509 [1:08:28<07:42, 7.11s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:28,234 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:32,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:35,961 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|█████████████████████████████████████████████████████████████████████▏ | 446/509 [1:08:38<06:24, 6.10s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:38,282 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:40,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|█████████████████████████████████████████████████████████████████████▍ | 447/509 [1:08:42<05:43, 5.55s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:42,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:44,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|█████████████████████████████████████████████████████████████████████▌ | 448/509 [1:08:46<05:04, 4.99s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:45,973 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|█████████████████████████████████████████████████████████████████████▋ | 449/509 [1:08:49<04:28, 4.47s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:49,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 16:05:50,441 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|█████████████████████████████████████████████████████████████████████▊ | 450/509 [1:08:52<04:02, 4.11s/it]
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
Traceback (most recent call last):