0%| | 0/1019 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.795, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:04,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%| | 1/1019 [00:06<1:55:02, 6.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:07,171 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0164, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:10,141 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 2/1019 [00:12<1:47:09, 6.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:13,149 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0042, 'learning_rate': 2.0000000000000002e-07, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:16,077 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 3/1019 [00:18<1:45:53, 6.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:19,323 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8341, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:22,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 4/1019 [00:24<1:44:06, 6.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:25,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6488, 'learning_rate': 6.000000000000001e-07, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:28,047 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 0%|▍ | 5/1019 [00:30<1:41:45, 6.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:31,043 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:56:33,923 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▍ | 6/1019 [00:36<1:40:49, 5.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:36,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:56:39,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 7/1019 [00:42<1:39:48, 5.92s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:42,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7924, 'learning_rate': 1.2000000000000002e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:45,468 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▋ | 8/1019 [00:48<1:38:45, 5.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:48,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9304, 'learning_rate': 1.4000000000000001e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 05:56:51,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▋ | 9/1019 [00:53<1:37:34, 5.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:54,029 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:56:56,781 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 1%|▊ | 10/1019 [00:59<1:36:28, 5.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:56:59,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:02,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 11/1019 [01:05<1:36:05, 5.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:05,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8212, 'learning_rate': 1.8e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 05:57:08,002 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 12/1019 [01:10<1:35:20, 5.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:10,875 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7876, 'learning_rate': 2.0000000000000003e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 05:57:13,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 13/1019 [01:16<1:34:47, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:16,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:19,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 14/1019 [01:21<1:33:59, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:21,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-03-02 05:57:24,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█▏ | 15/1019 [01:27<1:32:58, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:27,358 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8406, 'learning_rate': 2.6e-06, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-02 05:57:29,998 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▏ | 16/1019 [01:32<1:32:25, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:32,721 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:35,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 17/1019 [01:38<1:31:22, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:38,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:40,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 18/1019 [01:43<1:30:15, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:43,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:45,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 19/1019 [01:48<1:29:37, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:48,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-02 05:57:51,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 20/1019 [01:53<1:29:15, 5.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:53,894 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:57:56,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 21/1019 [01:59<1:28:23, 5.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:57:59,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:01,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 22/1019 [02:04<1:27:49, 5.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:04,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6241, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-02 05:58:06,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 23/1019 [02:09<1:27:28, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:09,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:12,009 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 24/1019 [02:14<1:26:39, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:14,585 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed {'loss': 4.5592, 'learning_rate': 4.4e-06, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-02 05:58:17,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 25/1019 [02:19<1:25:41, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:19,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:22,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██ | 26/1019 [02:24<1:25:07, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:24,710 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:27,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██ | 27/1019 [02:29<1:24:17, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:29,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:32,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 28/1019 [02:34<1:23:20, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:34,591 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:37,002 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 29/1019 [02:39<1:22:48, 5.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:39,497 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:41,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 30/1019 [02:44<1:22:05, 4.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:44,394 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:46,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 31/1019 [02:49<1:21:37, 4.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:49,222 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:51,578 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 32/1019 [02:54<1:20:40, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:54,020 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:58:56,313 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3961, 'learning_rate': 6e-06, 'epoch': 0.03} 3%|██▌ | 33/1019 [02:59<1:19:45, 4.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:58:58,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:00,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 34/1019 [03:03<1:18:17, 4.77s/it][WARNING|modeling_utils.py:388] 
2022-03-02 05:59:03,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:05,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 35/1019 [03:08<1:16:57, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:07,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5288, 'learning_rate': 6.6e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:09,900 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▊ | 36/1019 [03:12<1:15:57, 4.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:12,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4346, 'learning_rate': 6.800000000000001e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:14,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▊ | 37/1019 [03:17<1:14:52, 4.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:16,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4915, 'learning_rate': 7.000000000000001e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:18,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▉ | 38/1019 [03:21<1:12:59, 4.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:20,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6269, 'learning_rate': 7.2e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:22,673 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 4%|███ | 39/1019 [03:25<1:11:17, 4.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:24,734 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4273, 'learning_rate': 7.4e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:26,678 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███ | 40/1019 [03:29<1:09:27, 4.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:28,657 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2809, 'learning_rate': 7.6e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:30,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 41/1019 [03:33<1:07:22, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:32,368 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4596, 'learning_rate': 7.8e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:34,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 42/1019 [03:36<1:04:32, 3.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:35,823 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3236, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:37,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 43/1019 [03:40<1:01:06, 3.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:38,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5072, 'learning_rate': 
8.200000000000001e-06, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:40,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 44/1019 [03:43<57:33, 3.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:41,848 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:43,065 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▌ | 45/1019 [03:45<53:12, 3.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:44,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.662, 'learning_rate': 8.599999999999999e-06, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:45,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 46/1019 [03:48<48:37, 3.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:46,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5382, 'learning_rate': 8.8e-06, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-02 05:59:47,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 47/1019 [03:50<44:05, 2.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:48,486 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:49,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8504, 'learning_rate': 9e-06, 'epoch': 0.05} {'loss': 5.2149, 'learning_rate': 9.2e-06, 'epoch': 0.05} 5%|███▊ | 48/1019 [03:52<39:45, 2.46s/it][WARNING|modeling_utils.py:388] 
2022-03-02 05:59:50,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:50,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 49/1019 [03:53<35:38, 2.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:51,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 05:59:52,904 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 50/1019 [03:55<34:25, 2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 05:59:56,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 51/1019 [04:01<54:22, 3.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:02,194 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 52/1019 [04:07<1:06:57, 4.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:08,185 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 53/1019 [04:13<1:15:26, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:14,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 54/1019 [04:19<1:21:05, 5.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:19,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 55/1019 [04:25<1:24:26, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:25,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|████▎ | 56/1019 [04:31<1:27:04, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:31,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 57/1019 [04:37<1:28:43, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:37,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 58/1019 [04:42<1:29:26, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:42,957 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 59/1019 
[04:48<1:29:48, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:48,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 60/1019 [04:54<1:30:04, 5.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:54,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 61/1019 [04:59<1:29:13, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:00:59,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▊ | 62/1019 [05:05<1:28:56, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:05,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 63/1019 [05:10<1:28:30, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:10,750 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 64/1019 [05:16<1:28:12, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:16,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 65/1019 [05:21<1:27:38, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:21,662 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 6%|█████ | 66/1019 [05:26<1:27:06, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:27,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 67/1019 [05:32<1:27:02, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:32,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 68/1019 [05:37<1:26:08, 5.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 
06:01:37,907 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 69/1019 [05:43<1:25:58, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:43,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 70/1019 [05:48<1:25:01, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:48,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 71/1019 [05:53<1:24:20, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:53,698 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 72/1019 [05:58<1:23:56, 
5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:01:58,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 73/1019 [06:04<1:23:08, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:04,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 74/1019 [06:09<1:22:25, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:09,237 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed 7%|█████▊ | 75/1019 [06:14<1:21:54, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:14,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▉ | 76/1019 [06:19<1:21:17, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:19,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 77/1019 [06:24<1:20:18, 5.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:24,365 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 78/1019 [06:29<1:19:26, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:29,282 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed 8%|██████ | 79/1019 [06:34<1:18:47, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:34,248 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 80/1019 [06:39<1:18:03, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:39,096 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 81/1019 [06:44<1:17:24, 4.95s/it][WARNING|modeling_utils.py:388]
2022-03-02 06:02:43,950 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 82/1019 [06:48<1:16:30, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:48,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 83/1019 [06:53<1:15:25, 4.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:53,271 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 84/1019 [06:58<1:14:08, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:02:57,786 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ |
85/1019 [07:02<1:12:31, 4.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:02,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 86/1019 [07:07<1:11:40, 4.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:06,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 87/1019 [07:11<1:10:20, 4.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:10,924 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed 9%|██████▊ | 88/1019 [07:15<1:08:52, 4.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:15,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 89/1019 [07:19<1:06:50, 4.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:19,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 90/1019 [07:23<1:04:25, 4.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:22,704 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:03:24,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3593, 'learning_rate': 1.76e-05, 'epoch': 0.09} 9%|███████ | 91/1019 [07:27<1:01:52, 4.00s/it][WARNING|modeling_utils.py:388] 2022-03-02
06:03:26,234 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:03:27,873 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 92/1019 [07:30<59:11, 3.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:29,536 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 93/1019 [07:33<56:08, 3.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:32,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 94/1019 [07:36<53:03, 3.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:35,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▌ | 95/1019 [07:39<49:36,
3.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:38,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▋ | 96/1019 [07:41<46:03, 2.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:40,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▋ | 97/1019 [07:44<42:07, 2.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:42,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2987, 'learning_rate': 1.88e-05, 'epoch': 0.1} 10%|███████▊ | 98/1019 [07:45<38:06, 2.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:44,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:03:45,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|███████▊ | 100/1019 [07:49<32:33, 2.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:50,016 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 101/1019 [07:55<51:16, 3.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:03:55,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▊ | 102/1019 [08:01<1:02:27, 4.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:01,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 103/1019 [08:07<1:10:55, 4.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:07,638 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 104/1019 [08:13<1:15:38, 4.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:13,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 105/1019 [08:18<1:19:10, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:19,155 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 106/1019 [08:24<1:21:50, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:24,887 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▏ | 107/1019
[08:30<1:23:15, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:30,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▎ | 108/1019 [08:36<1:24:03, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:36,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▎ | 109/1019 [08:41<1:24:33, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:41,911 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 110/1019 [08:47<1:24:47, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:47,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 111/1019 [08:52<1:24:17, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:52,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▌ | 112/1019 [08:58<1:23:44, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:04:58,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 113/1019
[09:03<1:23:20, 5.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:03,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 114/1019 [09:09<1:22:45, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:09,297 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 115/1019 [09:14<1:22:27, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:14,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 116/1019 [09:20<1:22:07, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:20,192 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 117/1019 [09:25<1:22:07, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:25,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████ | 118/1019 [09:30<1:20:58, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:30,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████ | 119/1019 [09:36<1:20:09, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:36,028 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 12%|█████████▏ | 120/1019 [09:41<1:19:42, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:41,322 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 121/1019 [09:46<1:19:23, 5.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:46,505 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 122/1019 [09:51<1:18:38, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:51,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▍ | 123/1019 [09:56<1:18:16, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:05:56,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▍ | 124/1019 [10:02<1:17:43, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:01,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 125/1019 [10:07<1:16:52, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:07,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 12%|█████████▋ | 126/1019 [10:12<1:16:29, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:12,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▋ | 127/1019 [10:17<1:15:44, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:16,984 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 128/1019 [10:22<1:14:46, 5.03s/it][WARNING|modeling_utils.py:388]
2022-03-02 06:06:21,889 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 129/1019 [10:26<1:14:11, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:21,889 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 129/1019 [10:26<1:14:11, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:26,754 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 129/1019 [10:26<1:14:11, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:26,754 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 130/1019 [10:31<1:13:11, 4.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:26,754 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 130/1019 [10:31<1:13:11, 4.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:31,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 130/1019 [10:31<1:13:11, 4.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:31,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 131/1019 [10:36<1:12:33, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:31,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 131/1019 [10:36<1:12:33, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:36,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 132/1019 [10:41<1:11:48, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:36,336 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 13%|██████████ | 132/1019 [10:41<1:11:48, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:36,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 132/1019 [10:41<1:11:48, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:41,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 132/1019 [10:41<1:11:48, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:41,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▏ | 133/1019 [10:46<1:11:14, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:41,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▏ | 133/1019 [10:46<1:11:14, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:45,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▏ | 133/1019 [10:46<1:11:14, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:45,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 134/1019 [10:50<1:10:09, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:45,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 134/1019 [10:50<1:10:09, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:50,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 134/1019 [10:50<1:10:09, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:50,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 135/1019 [10:55<1:09:08, 
4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:50,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 135/1019 [10:55<1:09:08, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:54,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 135/1019 [10:55<1:09:08, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:54,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 136/1019 [10:59<1:08:03, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:54,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 136/1019 [10:59<1:08:03, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:59,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 136/1019 [10:59<1:08:03, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:59,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 137/1019 [11:04<1:06:56, 4.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:06:59,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 137/1019 [11:04<1:06:56, 4.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:03,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 137/1019 [11:04<1:06:56, 4.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:03,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▌ | 138/1019 [11:08<1:05:19, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:03,607 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 14%|██████████▌ | 138/1019 [11:08<1:05:19, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:07,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▌ | 138/1019 [11:08<1:05:19, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:07,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 139/1019 [11:12<1:03:19, 4.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:07,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 139/1019 [11:12<1:03:19, 4.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:11,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 139/1019 [11:12<1:03:19, 4.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:11,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 140/1019 [11:16<1:01:37, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:11,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 140/1019 [11:16<1:01:37, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:15,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|██████████▋ | 140/1019 [11:16<1:01:37, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:15,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 141/1019 [11:19<59:09, 4.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:15,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ 
| 141/1019 [11:19<59:09, 4.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:19,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 141/1019 [11:19<59:09, 4.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:19,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 142/1019 [11:23<56:11, 3.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:19,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 142/1019 [11:23<56:11, 3.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:19,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 143/1019 [11:26<52:57, 3.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:22,242 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 143/1019 [11:26<52:57, 3.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:25,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 143/1019 [11:26<52:57, 3.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:25,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 144/1019 [11:29<49:21, 3.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:27,901 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 144/1019 [11:29<49:21, 3.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:27,901 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 145/1019 [11:31<46:00, 3.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:27,901 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 145/1019 [11:31<46:00, 3.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:27,901 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 146/1019 [11:34<42:10, 2.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:30,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 146/1019 [11:34<42:10, 2.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:30,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▌ | 147/1019 [11:36<38:30, 2.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:32,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▌ | 147/1019 [11:36<38:30, 2.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:32,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 148/1019 [11:38<34:50, 2.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:36,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 148/1019 [11:38<34:50, 2.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:36,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 149/1019 [11:39<31:18, 2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:37,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 149/1019 [11:39<31:18, 2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:37,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
15%|███████████▊ | 150/1019 [11:41<30:03, 2.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:37,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 150/1019 [11:41<30:03, 2.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:42,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 150/1019 [11:41<30:03, 2.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:42,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 151/1019 [11:47<48:33, 3.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:42,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 151/1019 [11:47<48:33, 3.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:48,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 151/1019 [11:47<48:33, 3.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:48,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 152/1019 [11:53<59:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:48,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 152/1019 [11:53<59:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 152/1019 [11:53<59:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 152/1019 [11:53<59:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 
06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3893, 'learning_rate': 3e-05, 'epoch': 0.15} 15%|███████████▉ | 152/1019 [11:53<59:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 154/1019 [12:05<1:11:55, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 154/1019 [12:05<1:11:55, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2352, 'learning_rate': 3.02e-05, 'epoch': 0.15} 15%|███████████▊ | 155/1019 [12:11<1:14:47, 5.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 155/1019 [12:11<1:14:47, 5.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1155, 'learning_rate': 3.04e-05, 'epoch': 0.15} 15%|███████████▊ | 155/1019 [12:11<1:14:47, 5.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 156/1019 [12:16<1:16:46, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 156/1019 [12:16<1:16:46, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
15%|███████████▉ | 156/1019 [12:16<1:16:46, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 157/1019 [12:22<1:18:18, 5.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 157/1019 [12:22<1:18:18, 5.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████ | 158/1019 [12:28<1:19:21, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████ | 158/1019 [12:28<1:19:21, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4512, 'learning_rate': 3.1e-05, 'epoch': 0.15} 16%|████████████▏ | 159/1019 [12:33<1:19:50, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▏ | 159/1019 [12:33<1:19:50, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3256, 'learning_rate': 3.12e-05, 'epoch': 0.16} 16%|████████████▏ | 160/1019 [12:39<1:20:05, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▏ | 160/1019 [12:39<1:20:05, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed {'loss': 4.199, 'learning_rate': 3.1400000000000004e-05, 'epoch': 0.16} 16%|████████████▎ | 161/1019 [12:45<1:19:56, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▎ | 161/1019 [12:45<1:19:56, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2251, 'learning_rate': 3.16e-05, 'epoch': 0.16} 16%|████████████▍ | 162/1019 [12:50<1:19:23, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 162/1019 [12:50<1:19:23, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3597, 'learning_rate': 3.18e-05, 'epoch': 0.16} 16%|████████████▍ | 163/1019 [12:56<1:19:10, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 163/1019 [12:56<1:19:10, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2757, 'learning_rate': 3.2000000000000005e-05, 'epoch': 0.16} 16%|████████████▍ | 163/1019 [12:56<1:19:10, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 163/1019 [12:56<1:19:10, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed {'loss': 4.2402, 'learning_rate': 3.2200000000000003e-05, 'epoch': 0.16} 16%|████████████▍ | 163/1019 [12:56<1:19:10, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 165/1019 [13:07<1:18:17, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 165/1019 [13:07<1:18:17, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2667, 'learning_rate': 3.24e-05, 'epoch': 0.16} 16%|████████████▋ | 165/1019 [13:07<1:18:17, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 166/1019 [13:12<1:17:56, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 166/1019 [13:12<1:17:56, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▊ | 167/1019 [13:17<1:17:28, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▊ | 167/1019 [13:17<1:17:28, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2134, 'learning_rate': 3.2800000000000004e-05, 'epoch': 0.16} 16%|████████████▊ | 168/1019 [13:23<1:16:56, 
5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▊ | 168/1019 [13:23<1:16:56, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0583, 'learning_rate': 3.3e-05, 'epoch': 0.16} 16%|████████████▊ | 168/1019 [13:23<1:16:56, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|████████████▉ | 169/1019 [13:28<1:16:22, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|████████████▉ | 169/1019 [13:28<1:16:22, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|████████████▉ | 169/1019 [13:28<1:16:22, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████ | 170/1019 [13:33<1:15:37, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████ | 170/1019 [13:33<1:15:37, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████ | 171/1019 [13:38<1:14:32, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████ | 171/1019 [13:38<1:14:32, 
5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3876, 'learning_rate': 3.3600000000000004e-05, 'epoch': 0.17} 17%|█████████████ | 171/1019 [13:38<1:14:32, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 172/1019 [13:44<1:13:59, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 172/1019 [13:44<1:13:59, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 173/1019 [13:49<1:13:04, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 173/1019 [13:49<1:13:04, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2226, 'learning_rate': 3.4000000000000007e-05, 'epoch': 0.17} 17%|█████████████▏ | 173/1019 [13:49<1:13:04, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 174/1019 [13:54<1:12:11, 5.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 174/1019 [13:54<1:12:11, 5.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 17%|█████████████▎ | 174/1019 [13:54<1:12:11, 5.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▍ | 175/1019 [13:59<1:11:23, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▍ | 175/1019 [13:59<1:11:23, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▍ | 175/1019 [13:59<1:11:23, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▍ | 176/1019 [14:04<1:10:54, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▍ | 176/1019 [14:04<1:10:54, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 177/1019 [14:09<1:10:32, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 177/1019 [14:09<1:10:32, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1949, 'learning_rate': 3.48e-05, 'epoch': 0.17} 17%|█████████████▋ | 178/1019 [14:13<1:09:56, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 17%|█████████████▋ | 178/1019 [14:13<1:09:56, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2429, 'learning_rate': 3.5e-05, 'epoch': 0.17} 17%|█████████████▋ | 178/1019 [14:13<1:09:56, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▋ | 179/1019 [14:18<1:09:14, 4.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▋ | 179/1019 [14:18<1:09:14, 4.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 180/1019 [14:23<1:08:35, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 180/1019 [14:23<1:08:35, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3763, 'learning_rate': 3.54e-05, 'epoch': 0.18} 18%|█████████████▊ | 181/1019 [14:28<1:07:56, 4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 181/1019 [14:28<1:07:56, 4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4287, 'learning_rate': 3.56e-05, 'epoch': 0.18} 18%|█████████████▊ | 181/1019 [14:28<1:07:56, 
4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 182/1019 [14:33<1:07:23, 4.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 182/1019 [14:33<1:07:23, 4.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████ | 183/1019 [14:37<1:06:18, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████ | 183/1019 [14:37<1:06:18, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2801, 'learning_rate': 3.6e-05, 'epoch': 0.18} 18%|██████████████ | 184/1019 [14:42<1:05:30, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████ | 184/1019 [14:42<1:05:30, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2117, 'learning_rate': 3.62e-05, 'epoch': 0.18} 18%|██████████████▏ | 185/1019 [14:46<1:04:37, 4.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 185/1019 [14:46<1:04:37, 4.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed {'loss': 4.1775, 'learning_rate': 3.6400000000000004e-05, 'epoch': 0.18} 18%|██████████████▏ | 185/1019 [14:46<1:04:37, 4.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 186/1019 [14:51<1:03:44, 4.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 186/1019 [14:51<1:03:44, 4.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 186/1019 [14:51<1:03:44, 4.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:07:54,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 187/1019 [14:55<1:02:31, 4.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 188/1019 [14:59<1:01:21, 4.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 188/1019 [14:59<1:01:21, 4.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 188/1019 [14:59<1:01:21, 4.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 189/1019 [15:03<59:51, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 19%|██████████████▊ | 189/1019 [15:03<59:51, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.183, 'learning_rate': 3.72e-05, 'epoch': 0.19} 19%|██████████████▉ | 190/1019 [15:07<58:02, 4.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 190/1019 [15:07<58:02, 4.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3126, 'learning_rate': 3.74e-05, 'epoch': 0.19} 19%|██████████████▉ | 191/1019 [15:11<55:47, 4.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 191/1019 [15:11<55:47, 4.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3151, 'learning_rate': 3.76e-05, 'epoch': 0.19} 19%|███████████████ | 192/1019 [15:15<53:33, 3.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████ | 192/1019 [15:15<53:33, 3.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2967, 'learning_rate': 3.7800000000000004e-05, 'epoch': 0.19} 19%|███████████████ | 192/1019 [15:15<53:33, 3.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:10:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
19%|███████████████▏ | 193/1019 [15:18<50:51, 3.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:17,178 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▏ | 194/1019 [15:21<48:05, 3.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:17,178 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▏ | 194/1019 [15:21<48:05, 3.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:17,178 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:11:21,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:11:21,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3432, 'learning_rate': 3.8400000000000005e-05, 'epoch': 0.19} [WARNING|modeling_utils.py:388] 2022-03-02 06:11:21,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▍ | 196/1019 [15:26<41:51, 3.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:25,104 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▍ | 197/1019 [15:28<38:49, 2.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:27,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
19%|███████████████▍ | 197/1019 [15:28<38:49, 2.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:27,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 198/1019 [15:30<35:28, 2.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:29,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 198/1019 [15:30<35:28, 2.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:29,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4893, 'learning_rate': 3.9000000000000006e-05, 'epoch': 0.19} 20%|███████████████▌ | 199/1019 [15:32<32:01, 2.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 199/1019 [15:32<32:01, 2.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 200/1019 [15:34<30:20, 2.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 200/1019 [15:34<30:20, 2.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 200/1019 [15:34<30:20, 2.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 201/1019 [15:40<46:22, 3.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 20%|███████████████▊ | 201/1019 [15:40<46:22, 3.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 201/1019 [15:40<46:22, 3.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 201/1019 [15:40<46:22, 3.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2302, 'learning_rate': 3.9800000000000005e-05, 'epoch': 0.2} 20%|███████████████▊ | 201/1019 [15:40<46:22, 3.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 203/1019 [15:52<1:03:30, 4.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 203/1019 [15:52<1:03:30, 4.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2423, 'learning_rate': 4e-05, 'epoch': 0.2} 20%|███████████████▌ | 204/1019 [15:58<1:08:04, 5.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 204/1019 [15:58<1:08:04, 5.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1356, 'learning_rate': 4.02e-05, 'epoch': 0.2} 20%|███████████████▋ | 205/1019 [16:04<1:10:37, 
5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 205/1019 [16:04<1:10:37, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0912, 'learning_rate': 4.0400000000000006e-05, 'epoch': 0.2} 20%|███████████████▋ | 205/1019 [16:04<1:10:37, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 205/1019 [16:04<1:10:37, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1332, 'learning_rate': 4.0600000000000004e-05, 'epoch': 0.2} 20%|███████████████▋ | 205/1019 [16:04<1:10:37, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 207/1019 [16:15<1:13:29, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 207/1019 [16:15<1:13:29, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1937, 'learning_rate': 4.08e-05, 'epoch': 0.2} 20%|███████████████▉ | 208/1019 [16:20<1:14:08, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▉ | 208/1019 [16:20<1:14:08, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 
06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2914, 'learning_rate': 4.1e-05, 'epoch': 0.2} 21%|███████████████▉ | 209/1019 [16:26<1:14:09, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|███████████████▉ | 209/1019 [16:26<1:14:09, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4561, 'learning_rate': 4.12e-05, 'epoch': 0.21} 21%|████████████████ | 210/1019 [16:32<1:14:28, 5.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████ | 210/1019 [16:32<1:14:28, 5.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2576, 'learning_rate': 4.14e-05, 'epoch': 0.21} 21%|████████████████ | 210/1019 [16:32<1:14:28, 5.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 211/1019 [16:37<1:14:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 211/1019 [16:37<1:14:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 211/1019 [16:37<1:14:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1316, 'learning_rate': 4.2e-05, 'epoch': 0.21} 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 212/1019 [16:43<1:14:33, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 214/1019 [16:54<1:13:40, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 214/1019 [16:54<1:13:40, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 214/1019 [16:54<1:13:40, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 21%|████████████████▍ | 215/1019 [16:59<1:13:01, 5.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 215/1019 [16:59<1:13:01, 5.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 216/1019 [17:04<1:12:24, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 216/1019 [17:04<1:12:24, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3526, 'learning_rate': 4.26e-05, 'epoch': 0.21} 21%|████████████████▌ | 216/1019 [17:04<1:12:24, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 217/1019 [17:10<1:11:53, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 217/1019 [17:10<1:11:53, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▋ | 218/1019 [17:15<1:11:49, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▋ | 218/1019 [17:15<1:11:49, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed {'loss': 4.1111, 'learning_rate': 4.3e-05, 'epoch': 0.21} 21%|████████████████▋ | 218/1019 [17:15<1:11:49, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▊ | 219/1019 [17:20<1:11:35, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▊ | 219/1019 [17:20<1:11:35, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1343, 'learning_rate': 4.3400000000000005e-05, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 221/1019 [17:31<1:09:43, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 221/1019 [17:31<1:09:43, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 222/1019 [17:36<1:08:54, 
5.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 222/1019 [17:36<1:08:54, 5.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4093, 'learning_rate': 4.38e-05, 'epoch': 0.22} 22%|█████████████████ | 223/1019 [17:41<1:08:31, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████ | 223/1019 [17:41<1:08:31, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1783, 'learning_rate': 4.4000000000000006e-05, 'epoch': 0.22} 22%|█████████████████▏ | 224/1019 [17:46<1:07:49, 5.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▏ | 224/1019 [17:46<1:07:49, 5.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3176, 'learning_rate': 4.4200000000000004e-05, 'epoch': 0.22} 22%|█████████████████▏ | 224/1019 [17:46<1:07:49, 5.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▏ | 225/1019 [17:51<1:07:07, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▏ | 225/1019 [17:51<1:07:07, 5.07s/it][WARNING|modeling_utils.py:388] 
2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▎ | 226/1019 [17:56<1:06:47, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▎ | 226/1019 [17:56<1:06:47, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1444, 'learning_rate': 4.46e-05, 'epoch': 0.22} 22%|█████████████████▎ | 226/1019 [17:56<1:06:47, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▎ | 226/1019 [17:56<1:06:47, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1897, 'learning_rate': 4.4800000000000005e-05, 'epoch': 0.22} 22%|█████████████████▎ | 226/1019 [17:56<1:06:47, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▍ | 228/1019 [18:06<1:05:30, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▍ | 228/1019 [18:06<1:05:30, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.095, 'learning_rate': 4.5e-05, 'epoch': 0.22} 22%|█████████████████▌ | 229/1019 [18:10<1:04:31, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 22%|█████████████████▌ | 229/1019 [18:10<1:04:31, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1946, 'learning_rate': 4.52e-05, 'epoch': 0.22} 23%|█████████████████▌ | 230/1019 [18:15<1:03:51, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▌ | 230/1019 [18:15<1:03:51, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2182, 'learning_rate': 4.5400000000000006e-05, 'epoch': 0.23} 23%|█████████████████▌ | 230/1019 [18:15<1:03:51, 4.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▋ | 231/1019 [18:20<1:03:02, 4.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▋ | 231/1019 [18:20<1:03:02, 4.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 232/1019 [18:24<1:02:33, 4.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 232/1019 [18:24<1:02:33, 4.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1917, 'learning_rate': 4.58e-05, 
'epoch': 0.23} 23%|█████████████████▊ | 233/1019 [18:29<1:01:32, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 233/1019 [18:29<1:01:32, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3202, 'learning_rate': 4.600000000000001e-05, 'epoch': 0.23} 23%|█████████████████▉ | 234/1019 [18:33<1:00:33, 4.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▉ | 234/1019 [18:33<1:00:33, 4.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2299, 'learning_rate': 4.6200000000000005e-05, 'epoch': 0.23} 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3656, 'learning_rate': 4.64e-05, 'epoch': 0.23} 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1586, 
'learning_rate': 4.660000000000001e-05, 'epoch': 0.23} 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▍ | 235/1019 [18:38<59:51, 4.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▌ | 237/1019 [18:46<57:34, 4.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▌ | 237/1019 [18:46<57:34, 4.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▌ | 237/1019 [18:46<57:34, 4.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▋ | 238/1019 [18:51<56:28, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▋ | 238/1019 [18:51<56:28, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▋ | 238/1019 [18:51<56:28, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▊ | 239/1019 [18:55<55:12, 4.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 23%|██████████████████▊ | 239/1019 [18:55<55:12, 4.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▊ | 239/1019 [18:55<55:12, 4.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▊ | 240/1019 [18:58<53:39, 4.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:14:59,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:14:59,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2641, 'learning_rate': 4.76e-05, 'epoch': 0.24} [WARNING|modeling_utils.py:388] 2022-03-02 06:14:59,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▉ | 242/1019 [19:06<49:16, 3.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▉ | 242/1019 [19:06<49:16, 3.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not
be computed 24%|██████████████████▉ | 242/1019 [19:06<49:16, 3.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:11:30,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████ | 243/1019 [19:09<46:25, 3.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████ | 243/1019 [19:09<46:25, 3.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▏ | 244/1019 [19:11<43:11, 3.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▏ | 244/1019 [19:11<43:11, 3.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:11,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:11,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:14,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:14,068
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:16,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:16,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:17,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:17,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:19,413 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:19,413 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:21,308 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:15:21,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 251/1019 [19:30<42:23, 3.31s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 251/1019 [19:30<42:23, 3.31s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0882, 'learning_rate': 4.96e-05, 'epoch': 0.25} 25%|███████████████████▊ | 252/1019 [19:36<52:14, 4.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 252/1019 [19:36<52:14, 4.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3265, 'learning_rate': 4.9800000000000004e-05, 'epoch': 0.25} 25%|███████████████████▊ | 252/1019 [19:36<52:14, 4.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 253/1019 [19:41<58:48, 4.61s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 253/1019 [19:41<58:48, 4.61s/it]g-point operations will 
not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 253/1019 [19:41<58:48, 4.61s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 253/1019 [19:41<58:48, 4.61s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9841, 'learning_rate': 5.02e-05, 'epoch': 0.25} 25%|███████████████████▊ | 253/1019 [19:41<58:48, 4.61s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▌ | 255/1019 [19:53<1:06:26, 5.22s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▌ | 255/1019 [19:53<1:06:26, 5.22s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1311, 'learning_rate': 5.0400000000000005e-05, 'epoch': 0.25} 25%|███████████████████▌ | 255/1019 [19:53<1:06:26, 5.22s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▌ | 256/1019 [19:59<1:08:23, 5.38s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▌ | 256/1019 [19:59<1:08:23, 5.38s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 25%|███████████████████▌ | 256/1019 [19:59<1:08:23, 5.38s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 257/1019 [20:04<1:09:04, 5.44s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 257/1019 [20:04<1:09:04, 5.44s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 258/1019 [20:10<1:09:42, 5.50s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▋ | 258/1019 [20:10<1:09:42, 5.50s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9816, 'learning_rate': 5.1000000000000006e-05, 'epoch': 0.25} 25%|███████████████████▋ | 258/1019 [20:10<1:09:42, 5.50s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 259/1019 [20:16<1:10:15, 5.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 259/1019 [20:16<1:10:15, 5.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 259/1019 [20:16<1:10:15, 5.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 260/1019 [20:21<1:10:14, 5.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 260/1019 [20:21<1:10:14, 5.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 261/1019 [20:27<1:09:34, 5.51s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 261/1019 [20:27<1:09:34, 5.51s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1301, 'learning_rate': 5.16e-05, 'epoch': 0.26} 26%|████████████████████ | 262/1019 [20:32<1:09:06, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████ | 262/1019 [20:32<1:09:06, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.173, 'learning_rate': 5.1800000000000005e-05, 'epoch': 0.26} 26%|████████████████████ | 262/1019 [20:32<1:09:06, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▏ | 263/1019 [20:38<1:09:05, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
26%|████████████████████▏ | 263/1019 [20:38<1:09:05, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▏ | 263/1019 [20:38<1:09:05, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▏ | 263/1019 [20:38<1:09:05, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2179, 'learning_rate': 5.22e-05, 'epoch': 0.26} 26%|████████████████████▏ | 263/1019 [20:38<1:09:05, 5.48s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▎ | 265/1019 [20:48<1:08:27, 5.45s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▎ | 265/1019 [20:48<1:08:27, 5.45s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0886, 'learning_rate': 5.2400000000000007e-05, 'epoch': 0.26} 26%|████████████████████▎ | 265/1019 [20:48<1:08:27, 5.45s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▎ | 266/1019 [20:54<1:08:18, 5.44s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▎ | 266/1019 [20:54<1:08:18, 5.44s/it]g-point operations will not 
be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▍ | 267/1019 [20:59<1:07:58, 5.42s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▍ | 267/1019 [20:59<1:07:58, 5.42s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3285, 'learning_rate': 5.28e-05, 'epoch': 0.26} 26%|████████████████████▍ | 267/1019 [20:59<1:07:58, 5.42s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 268/1019 [21:05<1:07:29, 5.39s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 268/1019 [21:05<1:07:29, 5.39s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 268/1019 [21:05<1:07:29, 5.39s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 269/1019 [21:10<1:07:11, 5.38s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 269/1019 [21:10<1:07:11, 5.38s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▋ | 
270/1019 [21:15<1:06:45, 5.35s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▋ | 270/1019 [21:15<1:06:45, 5.35s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.375, 'learning_rate': 5.3400000000000004e-05, 'epoch': 0.26} 26%|████████████████████▋ | 270/1019 [21:15<1:06:45, 5.35s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▋ | 271/1019 [21:20<1:06:06, 5.30s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▋ | 271/1019 [21:20<1:06:06, 5.30s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▊ | 272/1019 [21:26<1:05:22, 5.25s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▊ | 272/1019 [21:26<1:05:22, 5.25s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1679, 'learning_rate': 5.380000000000001e-05, 'epoch': 0.27} 27%|████████████████████▊ | 272/1019 [21:26<1:05:22, 5.25s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▊ | 272/1019 [21:26<1:05:22, 5.25s/it]g-point operations will not be computed-02 
06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2512, 'learning_rate': 5.4000000000000005e-05, 'epoch': 0.27} 27%|████████████████████▊ | 272/1019 [21:26<1:05:22, 5.25s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 274/1019 [21:36<1:04:06, 5.16s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 274/1019 [21:36<1:04:06, 5.16s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.432, 'learning_rate': 5.420000000000001e-05, 'epoch': 0.27} 27%|█████████████████████ | 275/1019 [21:41<1:03:28, 5.12s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████ | 275/1019 [21:41<1:03:28, 5.12s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2654, 'learning_rate': 5.440000000000001e-05, 'epoch': 0.27} 27%|█████████████████████▏ | 276/1019 [21:46<1:03:14, 5.11s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 276/1019 [21:46<1:03:14, 5.11s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5613, 'learning_rate': 5.4600000000000006e-05, 'epoch': 0.27} 27%|█████████████████████▏ | 277/1019 
[21:51<1:02:53, 5.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 277/1019 [21:51<1:02:53, 5.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5257, 'learning_rate': 5.4800000000000004e-05, 'epoch': 0.27} 27%|█████████████████████▏ | 277/1019 [21:51<1:02:53, 5.09s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 278/1019 [21:56<1:02:27, 5.06s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 278/1019 [21:56<1:02:27, 5.06s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 279/1019 [22:01<1:02:00, 5.03s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 279/1019 [22:01<1:02:00, 5.03s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3021, 'learning_rate': 5.520000000000001e-05, 'epoch': 0.27} 27%|█████████████████████▎ | 279/1019 [22:01<1:02:00, 5.03s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▍ | 280/1019 [22:06<1:01:21, 4.98s/it]g-point operations will not be computed-02 
06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▍ | 280/1019 [22:06<1:01:21, 4.98s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▌ | 281/1019 [22:10<1:00:27, 4.92s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▌ | 281/1019 [22:10<1:00:27, 4.92s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.439, 'learning_rate': 5.560000000000001e-05, 'epoch': 0.28} 28%|██████████████████████▏ | 282/1019 [22:15<59:52, 4.87s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▏ | 282/1019 [22:15<59:52, 4.87s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2421, 'learning_rate': 5.580000000000001e-05, 'epoch': 0.28} 28%|██████████████████████▏ | 282/1019 [22:15<59:52, 4.87s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▏ | 283/1019 [22:20<58:57, 4.81s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▏ | 283/1019 [22:20<58:57, 4.81s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 28%|██████████████████████▏ | 283/1019 [22:20<58:57, 4.81s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▎ | 284/1019 [22:24<58:15, 4.76s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▎ | 284/1019 [22:24<58:15, 4.76s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▎ | 285/1019 [22:29<57:28, 4.70s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▎ | 285/1019 [22:29<57:28, 4.70s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5336, 'learning_rate': 5.6399999999999995e-05, 'epoch': 0.28} 28%|██████████████████████▍ | 286/1019 [22:34<56:42, 4.64s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▍ | 286/1019 [22:34<56:42, 4.64s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2397, 'learning_rate': 5.66e-05, 'epoch': 0.28} 28%|██████████████████████▌ | 287/1019 [22:38<55:29, 4.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 287/1019 
[22:38<55:29, 4.55s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2868, 'learning_rate': 5.68e-05, 'epoch': 0.28} 28%|██████████████████████▌ | 288/1019 [22:42<54:18, 4.46s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 288/1019 [22:42<54:18, 4.46s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5147, 'learning_rate': 5.6999999999999996e-05, 'epoch': 0.28} 28%|██████████████████████▋ | 289/1019 [22:46<52:44, 4.34s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▋ | 289/1019 [22:46<52:44, 4.34s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3183, 'learning_rate': 5.72e-05, 'epoch': 0.28} 28%|██████████████████████▊ | 290/1019 [22:50<50:56, 4.19s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▊ | 290/1019 [22:50<50:56, 4.19s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3677, 'learning_rate': 5.74e-05, 'epoch': 0.28} 29%|██████████████████████▊ | 291/1019 [22:54<48:53, 4.03s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
29%|██████████████████████▊ | 291/1019 [22:54<48:53, 4.03s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3637, 'learning_rate': 5.76e-05, 'epoch': 0.29} 29%|██████████████████████▉ | 292/1019 [22:57<46:47, 3.86s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|██████████████████████▉ | 292/1019 [22:57<46:47, 3.86s/it]g-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:18:58,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:18:58,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:15:07,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4503, 'learning_rate': 5.8e-05, 'epoch': 0.29} 29%|███████████████████████ | 294/1019 [23:03<41:38, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|███████████████████████ | 294/1019 [23:03<41:38, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|███████████████████████▏ | 295/1019 [23:06<39:02, 3.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 29%|███████████████████████▏ | 295/1019 [23:06<39:02, 3.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:06,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:06,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:08,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:08,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:10,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:10,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:11,766 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:11,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7895, 'learning_rate': 5.92e-05, 'epoch': 0.29} [WARNING|modeling_utils.py:388] 2022-03-02 06:19:13,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:19:13,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|███████████████████████▋ | 301/1019 [23:22<39:12, 3.28s/it]g-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|███████████████████████▋ | 301/1019 [23:22<39:12, 3.28s/it]g-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5139, 'learning_rate': 5.96e-05, 'epoch': 0.3} 30%|███████████████████████▋ | 302/1019 [23:28<48:36, 4.07s/it]g-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|███████████████████████▋ | 302/1019 [23:28<48:36, 4.07s/it]g-point operations will not be computed-02 06:19:02,515 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed
{'loss': 4.2047, 'learning_rate': 5.9800000000000003e-05, 'epoch': 0.3}
30%|███████████████████████▋ | 302/1019 [23:28<48:36, 4.07s/it]
30%|███████████████████████▊ | 303/1019 [23:34<54:49, 4.59s/it]
30%|███████████████████████▊ | 304/1019 [23:39<58:59, 4.95s/it]
30%|███████████████████████▎ | 305/1019 [23:45<1:01:41, 5.18s/it]
30%|███████████████████████▍ | 306/1019 [23:51<1:03:32, 5.35s/it]
30%|███████████████████████▍ | 307/1019 [23:57<1:04:40, 5.45s/it]
{'loss': 4.1365, 'learning_rate': 6.1e-05, 'epoch': 0.3}
30%|███████████████████████▋ | 309/1019 [24:08<1:05:48, 5.56s/it]
{'loss': 4.3245, 'learning_rate': 6.12e-05, 'epoch': 0.3}
30%|███████████████████████▋ | 310/1019 [24:13<1:05:48, 5.57s/it]
31%|███████████████████████▊ | 311/1019 [24:19<1:05:34, 5.56s/it]
31%|███████████████████████▉ | 312/1019 [24:24<1:05:02, 5.52s/it]
{'loss': 4.2266, 'learning_rate': 6.18e-05, 'epoch': 0.31}
31%|███████████████████████▉ | 313/1019 [24:30<1:04:23, 5.47s/it]
31%|████████████████████████ | 314/1019 [24:35<1:03:57, 5.44s/it]
31%|████████████████████████ | 315/1019 [24:40<1:03:33, 5.42s/it]
{'loss': 4.2351, 'learning_rate': 6.24e-05, 'epoch': 0.31}
31%|████████████████████████▏ | 316/1019 [24:46<1:03:18, 5.40s/it]
{'loss': 4.3449, 'learning_rate': 6.280000000000001e-05, 'epoch': 0.31}
31%|████████████████████████▎ | 318/1019 [24:57<1:02:40, 5.36s/it]
31%|████████████████████████▍ | 319/1019 [25:02<1:02:02, 5.32s/it]
31%|████████████████████████▍ | 320/1019 [25:07<1:01:30, 5.28s/it]
32%|████████████████████████▌ | 321/1019 [25:12<1:01:05, 5.25s/it]
32%|████████████████████████▋ | 322/1019 [25:17<1:00:32, 5.21s/it]
32%|█████████████████████████▎ | 323/1019 [25:22<59:52, 5.16s/it]
32%|█████████████████████████▍ | 324/1019 [25:27<59:30, 5.14s/it]
{'loss': 4.342, 'learning_rate': 6.440000000000001e-05, 'epoch': 0.32}
32%|█████████████████████████▌ | 326/1019 [25:37<57:55, 5.02s/it]
32%|█████████████████████████▋ | 327/1019 [25:42<57:35, 4.99s/it]
{'loss': 4.1516, 'learning_rate': 6.48e-05, 'epoch': 0.32}
32%|█████████████████████████▊ | 328/1019 [25:47<57:14, 4.97s/it]
32%|█████████████████████████▊ | 329/1019 [25:52<56:25, 4.91s/it]
32%|█████████████████████████▉ | 330/1019 [25:57<55:58, 4.87s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:21:59,104 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2191, 'learning_rate': 6.560000000000001e-05, 'epoch': 0.32}
33%|██████████████████████████ | 332/1019 [26:06<54:43, 4.78s/it]
33%|██████████████████████████▏ | 333/1019 [26:10<53:47, 4.70s/it]
{'loss': 4.0214, 'learning_rate': 6.6e-05, 'epoch': 0.33}
33%|██████████████████████████▏ | 334/1019 [26:15<53:04, 4.65s/it]
{'loss': 4.3892, 'learning_rate': 6.620000000000001e-05, 'epoch': 0.33}
33%|██████████████████████████▎ | 335/1019 [26:19<52:16, 4.59s/it]
{'loss': 4.3803, 'learning_rate': 6.64e-05, 'epoch': 0.33}
33%|██████████████████████████▍ | 336/1019 [26:24<51:18, 4.51s/it]
{'loss': 4.2512, 'learning_rate': 6.66e-05, 'epoch': 0.33}
33%|██████████████████████████▍ | 337/1019 [26:28<50:31, 4.44s/it]
{'loss': 4.4332, 'learning_rate': 6.680000000000001e-05, 'epoch': 0.33}
33%|██████████████████████████▌ | 338/1019 [26:32<49:18, 4.34s/it]
{'loss': 4.3038, 'learning_rate': 6.7e-05, 'epoch': 0.33}
33%|██████████████████████████▌ | 339/1019 [26:36<47:58, 4.23s/it]
{'loss': 4.4374, 'learning_rate': 6.720000000000001e-05, 'epoch': 0.33}
33%|██████████████████████████▋ | 340/1019 [26:40<46:16, 4.09s/it]
{'loss': 4.4142, 'learning_rate': 6.740000000000001e-05, 'epoch': 0.33}
33%|██████████████████████████▊ | 341/1019 [26:43<44:25, 3.93s/it]
{'loss': 4.3873, 'learning_rate': 6.76e-05, 'epoch': 0.33}
34%|██████████████████████████▊ | 342/1019 [26:47<42:29, 3.77s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:22:46,298 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
34%|██████████████████████████▉ | 343/1019 [26:50<40:19, 3.58s/it]
{'loss': 4.8563, 'learning_rate': 6.800000000000001e-05, 'epoch': 0.34}
34%|███████████████████████████ | 344/1019 [26:53<37:58, 3.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
34%|███████████████████████████ | 345/1019 [26:56<35:35, 3.17s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:22:55,746 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 06:22:57,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 06:22:59,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 06:23:01,268 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 06:23:03,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6739, 'learning_rate': 6.939999999999999e-05, 'epoch': 0.34}
34%|███████████████████████████▌ | 351/1019 [27:12<36:58, 3.32s/it]
35%|███████████████████████████▋ | 352/1019 [27:17<45:27, 4.09s/it]
35%|███████████████████████████▋ | 353/1019 [27:23<51:07, 4.61s/it]
{'loss': 4.2387, 'learning_rate': 7e-05, 'epoch': 0.35}
35%|███████████████████████████▊ | 354/1019 [27:29<55:05, 4.97s/it]
35%|███████████████████████████▊ | 355/1019 [27:35<57:38, 5.21s/it]
35%|███████████████████████████▉ | 356/1019 [27:40<58:56, 5.33s/it]
{'loss': 4.3082, 'learning_rate': 7.08e-05, 'epoch': 0.35}
35%|███████████████████████████▍ | 358/1019 [27:52<1:00:44, 5.51s/it]
{'loss': 4.6187, 'learning_rate': 7.1e-05, 'epoch': 0.35}
35%|███████████████████████████▍ | 359/1019 [27:58<1:01:17, 5.57s/it]
35%|███████████████████████████▌ | 360/1019 [28:03<1:01:31, 5.60s/it]
35%|███████████████████████████▋ | 361/1019 [28:09<1:01:14, 5.58s/it]
{'loss': 4.3837, 'learning_rate': 7.16e-05, 'epoch': 0.35}
36%|███████████████████████████▋ | 362/1019 [28:14<1:00:41, 5.54s/it]
{'loss': 4.5376, 'learning_rate': 7.18e-05, 'epoch': 0.36}
36%|███████████████████████████▊ | 363/1019 [28:20<1:00:15, 5.51s/it]
36%|████████████████████████████▌ | 364/1019 [28:25<59:47, 5.48s/it]
{'loss': 4.3393, 'learning_rate': 7.22e-05, 'epoch': 0.36}
36%|████████████████████████████▋ | 365/1019 [28:30<59:14, 5.43s/it]
{'loss': 4.3848, 'learning_rate': 7.24e-05, 'epoch': 0.36}
36%|████████████████████████████▋ | 366/1019 [28:36<59:00, 5.42s/it]
36%|████████████████████████████▊ | 367/1019 [28:41<58:37, 5.39s/it]
{'loss': 4.1314, 'learning_rate': 7.280000000000001e-05, 'epoch': 0.36}
36%|████████████████████████████▉ | 368/1019 [28:46<58:18, 5.37s/it]
36%|████████████████████████████▉ | 
368/1019 [28:46<58:18, 5.37s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 368/1019 [28:46<58:18, 5.37s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3967, 'learning_rate': 7.32e-05, 'epoch': 0.36} 36%|████████████████████████████▉ | 368/1019 [28:46<58:18, 5.37s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 368/1019 [28:46<58:18, 5.37s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████ | 370/1019 [28:57<57:32, 5.32s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████ | 370/1019 [28:57<57:32, 5.32s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████ | 370/1019 [28:57<57:32, 5.32s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████▏ | 371/1019 [29:02<57:06, 5.29s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████▏ | 371/1019 [29:02<57:06, 5.29s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 372/1019 [29:07<56:33, 5.24s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 372/1019 [29:07<56:33, 5.24s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2514, 'learning_rate': 7.38e-05, 'epoch': 0.36} 37%|█████████████████████████████▏ | 372/1019 [29:07<56:33, 5.24s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 373/1019 [29:12<56:07, 5.21s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 373/1019 [29:12<56:07, 5.21s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 374/1019 [29:18<55:41, 5.18s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 374/1019 [29:18<55:41, 5.18s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1936, 'learning_rate': 7.42e-05, 'epoch': 0.37} 37%|█████████████████████████████▍ | 375/1019 [29:23<54:54, 5.12s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 37%|█████████████████████████████▍ | 375/1019 [29:23<54:54, 5.12s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.721, 'learning_rate': 7.44e-05, 'epoch': 0.37} 37%|█████████████████████████████▌ | 376/1019 [29:28<54:29, 5.08s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 376/1019 [29:28<54:29, 5.08s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4243, 'learning_rate': 7.46e-05, 'epoch': 0.37} 37%|█████████████████████████████▌ | 377/1019 [29:33<54:08, 5.06s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 377/1019 [29:33<54:08, 5.06s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2335, 'learning_rate': 7.48e-05, 'epoch': 0.37} 37%|█████████████████████████████▌ | 377/1019 [29:33<54:08, 5.06s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▋ | 378/1019 [29:37<53:40, 5.02s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▋ | 378/1019 [29:37<53:40, 5.02s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 37%|█████████████████████████████▊ | 379/1019 [29:42<53:08, 4.98s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▊ | 379/1019 [29:42<53:08, 4.98s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5591, 'learning_rate': 7.52e-05, 'epoch': 0.37} 37%|█████████████████████████████▊ | 380/1019 [29:47<52:31, 4.93s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▊ | 380/1019 [29:47<52:31, 4.93s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6034, 'learning_rate': 7.54e-05, 'epoch': 0.37} 37%|█████████████████████████████▉ | 381/1019 [29:52<51:54, 4.88s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▉ | 381/1019 [29:52<51:54, 4.88s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.193, 'learning_rate': 7.560000000000001e-05, 'epoch': 0.37} 37%|█████████████████████████████▉ | 382/1019 [29:57<51:08, 4.82s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▉ | 382/1019 [29:57<51:08, 4.82s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed {'loss': 4.2547, 'learning_rate': 7.58e-05, 'epoch': 0.37} 38%|██████████████████████████████ | 383/1019 [30:01<50:32, 4.77s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████ | 383/1019 [30:01<50:32, 4.77s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.316, 'learning_rate': 7.6e-05, 'epoch': 0.38} 38%|██████████████████████████████ | 383/1019 [30:01<50:32, 4.77s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████ | 383/1019 [30:01<50:32, 4.77s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████ | 383/1019 [30:01<50:32, 4.77s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3824, 'learning_rate': 7.620000000000001e-05, 'epoch': 0.38} 38%|██████████████████████████████▏ | 385/1019 [30:10<49:11, 4.66s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▏ | 385/1019 [30:10<49:11, 4.66s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4127, 'learning_rate': 7.64e-05, 'epoch': 0.38} 38%|██████████████████████████████▎ | 386/1019 [30:15<48:26, 4.59s/it]g-point 
operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▎ | 386/1019 [30:15<48:26, 4.59s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3601, 'learning_rate': 7.66e-05, 'epoch': 0.38} 38%|██████████████████████████████▍ | 387/1019 [30:19<47:24, 4.50s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 387/1019 [30:19<47:24, 4.50s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1869, 'learning_rate': 7.680000000000001e-05, 'epoch': 0.38} 38%|██████████████████████████████▍ | 388/1019 [30:23<46:24, 4.41s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 388/1019 [30:23<46:24, 4.41s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3994, 'learning_rate': 7.7e-05, 'epoch': 0.38} 38%|██████████████████████████████▌ | 389/1019 [30:27<45:12, 4.30s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▌ | 389/1019 [30:27<45:12, 4.30s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2377, 'learning_rate': 7.72e-05, 
'epoch': 0.38} 38%|██████████████████████████████▌ | 390/1019 [30:31<43:31, 4.15s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▌ | 390/1019 [30:31<43:31, 4.15s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3469, 'learning_rate': 7.740000000000001e-05, 'epoch': 0.38} 38%|██████████████████████████████▋ | 391/1019 [30:35<41:28, 3.96s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 391/1019 [30:35<41:28, 3.96s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:35,910 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:35,910 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3896, 'learning_rate': 7.780000000000001e-05, 'epoch': 0.38} 39%|██████████████████████████████▊ | 393/1019 [30:41<37:52, 3.63s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|██████████████████████████████▊ | 393/1019 [30:41<37:52, 3.63s/it]g-point operations will not be computed-02 06:22:52,102 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.316, 'learning_rate': 7.800000000000001e-05, 'epoch': 0.39} 39%|██████████████████████████████▊ | 393/1019 [30:41<37:52, 3.63s/it]g-point operations will not be computed-02 06:22:52,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|██████████████████████████████▉ | 394/1019 [30:44<35:52, 3.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:26:43,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 395/1019 [30:47<33:18, 3.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 395/1019 [30:47<33:18, 3.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 396/1019 [30:49<30:36, 2.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 396/1019 [30:49<30:36, 2.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:49,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:49,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 
06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:51,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:51,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:52,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:52,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:54,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:26:54,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2974, 'learning_rate': 7.94e-05, 'epoch': 0.39} 39%|███████████████████████████████▍ | 401/1019 [31:03<34:08, 3.31s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 401/1019 [31:03<34:08, 3.31s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3962, 'learning_rate': 7.960000000000001e-05, 'epoch': 0.39} 39%|███████████████████████████████▍ | 401/1019 [31:03<34:08, 3.31s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 401/1019 [31:03<34:08, 3.31s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3178, 'learning_rate': 7.98e-05, 'epoch': 0.39} 39%|███████████████████████████████▍ | 401/1019 [31:03<34:08, 3.31s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▋ | 403/1019 [31:15<47:12, 4.60s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▋ | 403/1019 [31:15<47:12, 4.60s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3112, 'learning_rate': 8e-05, 'epoch': 0.4} 40%|███████████████████████████████▋ | 404/1019 [31:20<50:38, 4.94s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▋ | 404/1019 [31:20<50:38, 4.94s/it]g-point operations will not be 
computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2446, 'learning_rate': 8.020000000000001e-05, 'epoch': 0.4} 40%|███████████████████████████████▋ | 404/1019 [31:20<50:38, 4.94s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 405/1019 [31:26<53:10, 5.20s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 405/1019 [31:26<53:10, 5.20s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7824, 'learning_rate': 8.060000000000001e-05, 'epoch': 0.4} 40%|███████████████████████████████▊ | 406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.457, 'learning_rate': 8.080000000000001e-05, 'epoch': 0.4} 40%|███████████████████████████████▊ | 
406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▊ | 406/1019 [31:32<54:36, 5.34s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 408/1019 [31:43<55:50, 5.48s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 408/1019 [31:43<55:50, 5.48s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 408/1019 [31:43<55:50, 5.48s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 409/1019 [31:49<56:14, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 409/1019 [31:49<56:14, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 409/1019 [31:49<56:14, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▏ | 410/1019 [31:54<56:09, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 40%|████████████████████████████████▏ | 410/1019 [31:54<56:09, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▏ | 410/1019 [31:54<56:09, 5.53s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 411/1019 [32:00<55:54, 5.52s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 411/1019 [32:00<55:54, 5.52s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 411/1019 [32:00<55:54, 5.52s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 412/1019 [32:05<55:36, 5.50s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 412/1019 [32:05<55:36, 5.50s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 412/1019 [32:05<55:36, 5.50s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 412/1019 [32:05<55:36, 5.50s/it]g-point operations will 
not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1417, 'learning_rate': 8.2e-05, 'epoch': 0.41} 40%|████████████████████████████████▎ | 412/1019 [32:05<55:36, 5.50s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▌ | 414/1019 [32:16<55:00, 5.46s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▌ | 414/1019 [32:16<55:00, 5.46s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3982, 'learning_rate': 8.22e-05, 'epoch': 0.41} 41%|████████████████████████████████▌ | 414/1019 [32:16<55:00, 5.46s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▌ | 415/1019 [32:21<54:31, 5.42s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▌ | 415/1019 [32:21<54:31, 5.42s/it]g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 06:26:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3939, 'learning_rate': 
8.26e-05, 'epoch': 0.41}
41%|████████████████████████████████▋ | 417/1019 [32:32<53:47, 5.36s/it]
{'loss': 4.4452, 'learning_rate': 8.3e-05, 'epoch': 0.41}
41%|████████████████████████████████▉ | 419/1019 [32:42<53:04, 5.31s/it]
{'loss': 4.3762, 'learning_rate': 8.32e-05, 'epoch': 0.41}
41%|████████████████████████████████▉ | 420/1019 [32:48<52:38, 5.27s/it]
41%|█████████████████████████████████ | 421/1019 [32:53<52:30, 5.27s/it]
{'loss': 4.2238, 'learning_rate': 8.36e-05, 'epoch': 0.41}
41%|█████████████████████████████████▏ | 422/1019 [32:58<52:11, 5.24s/it]
42%|█████████████████████████████████▏ | 423/1019 [33:03<51:39, 5.20s/it]
{'loss': 4.2047, 'learning_rate': 8.4e-05, 'epoch': 0.41}
42%|█████████████████████████████████▎ | 424/1019 [33:08<50:56, 5.14s/it]
{'loss': 4.6421, 'learning_rate': 8.42e-05, 'epoch': 0.42}
42%|█████████████████████████████████▎ | 425/1019 [33:13<50:31, 5.10s/it]
{'loss': 4.2442, 'learning_rate': 8.44e-05, 'epoch': 0.42}
42%|█████████████████████████████████▍ | 426/1019 [33:18<50:11, 5.08s/it]
{'loss': 4.2809, 'learning_rate': 8.46e-05, 'epoch': 0.42}
42%|█████████████████████████████████▌ | 427/1019 [33:23<49:34, 5.02s/it]
42%|█████████████████████████████████▌ | 428/1019 [33:28<49:17, 5.00s/it]
{'loss': 4.4218, 'learning_rate': 8.5e-05, 'epoch': 0.42}
42%|█████████████████████████████████▋ | 429/1019 [33:33<48:44, 4.96s/it]
42%|█████████████████████████████████▊ | 430/1019 [33:38<48:12, 4.91s/it]
{'loss': 4.3741, 'learning_rate': 8.54e-05, 'epoch': 0.42}
42%|█████████████████████████████████▊ | 431/1019 [33:43<47:38, 4.86s/it]
{'loss': 4.3195, 'learning_rate': 8.560000000000001e-05, 'epoch': 0.42}
42%|█████████████████████████████████▉ | 432/1019 [33:47<46:51, 4.79s/it]
{'loss': 4.4016, 'learning_rate': 8.58e-05, 'epoch': 0.42}
{'loss': 4.6563, 'learning_rate': 8.6e-05, 'epoch': 0.42}
43%|██████████████████████████████████ | 434/1019 [33:56<45:44, 4.69s/it]
{'loss': 4.2991, 'learning_rate': 8.620000000000001e-05, 'epoch': 0.43}
43%|██████████████████████████████████▏ | 435/1019 [34:01<45:14, 4.65s/it]
{'loss': 4.2143, 'learning_rate': 8.64e-05, 'epoch': 0.43}
43%|██████████████████████████████████▏ | 436/1019 [34:05<44:17, 4.56s/it]
{'loss': 4.4766, 'learning_rate': 8.66e-05, 'epoch': 0.43}
43%|██████████████████████████████████▎ | 437/1019 [34:10<43:23, 4.47s/it]
{'loss': 4.2582, 'learning_rate': 8.680000000000001e-05, 'epoch': 0.43}
43%|██████████████████████████████████▍ | 438/1019 [34:14<42:15, 4.36s/it]
{'loss': 4.3976, 'learning_rate': 8.7e-05, 'epoch': 0.43}
43%|██████████████████████████████████▍ | 439/1019 [34:18<41:13, 4.26s/it]
{'loss': 4.2676, 'learning_rate': 8.72e-05, 'epoch': 0.43}
43%|██████████████████████████████████▌ | 440/1019 [34:22<39:56, 4.14s/it]
{'loss': 4.3342, 'learning_rate': 8.740000000000001e-05, 'epoch': 0.43}
43%|██████████████████████████████████▌ | 441/1019 [34:25<38:37, 4.01s/it]
{'loss': 4.6316, 'learning_rate': 8.76e-05, 'epoch': 0.43}
43%|██████████████████████████████████▋ | 442/1019 [34:29<37:18, 3.88s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:29,910 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3716, 'learning_rate': 8.800000000000001e-05, 'epoch': 0.43}
44%|██████████████████████████████████▊ | 444/1019 [34:35<33:49, 3.53s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:35,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.874, 'learning_rate': 8.840000000000001e-05, 'epoch': 0.44}
44%|███████████████████████████████████ | 446/1019 [34:41<29:22, 3.08s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:39,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
44%|███████████████████████████████████ | 447/1019 [34:43<26:46, 2.81s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:41,569 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
44%|███████████████████████████████████▎ | 449/1019 [34:46<21:20, 2.25s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:43,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5231, 'learning_rate': 8.900000000000001e-05, 'epoch': 0.44}
44%|███████████████████████████████████▎ | 450/1019 [34:48<20:26, 2.16s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:44,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 06:30:49,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
44%|███████████████████████████████████▍ | 451/1019 [34:54<32:17, 3.41s/it]
{'loss': 5.0006, 'learning_rate': 8.960000000000001e-05, 'epoch': 0.44}
44%|███████████████████████████████████▍ | 452/1019 [35:00<39:07, 4.14s/it]
{'loss': 4.5564, 'learning_rate': 8.98e-05, 'epoch': 0.44}
44%|███████████████████████████████████▌ | 453/1019 [35:06<43:48, 4.64s/it]
{'loss': 4.5007, 'learning_rate': 9e-05, 'epoch': 0.44}
45%|███████████████████████████████████▋ | 454/1019 [35:12<47:13, 5.02s/it]
{'loss': 4.6749, 'learning_rate': 9.020000000000001e-05, 'epoch': 0.45}
45%|███████████████████████████████████▋ | 455/1019 [35:18<48:59, 5.21s/it]
{'loss': 4.4925, 'learning_rate': 9.04e-05, 'epoch': 0.45}
45%|███████████████████████████████████▊ | 456/1019 [35:23<50:17, 5.36s/it]
45%|███████████████████████████████████▉ | 457/1019 [35:29<51:11, 5.47s/it]
{'loss': 4.1533, 'learning_rate': 9.080000000000001e-05, 'epoch': 0.45}
45%|███████████████████████████████████▉ | 458/1019 [35:35<51:42, 5.53s/it]
{'loss': 4.3659, 'learning_rate': 9.1e-05, 'epoch': 0.45}
45%|████████████████████████████████████ | 459/1019 [35:40<51:45, 5.55s/it]
45%|████████████████████████████████████ | 460/1019 [35:46<51:34, 5.54s/it]
{'loss': 4.5776, 'learning_rate': 9.140000000000001e-05, 'epoch': 0.45}
45%|████████████████████████████████████▏ | 461/1019 [35:51<51:21, 5.52s/it]
45%|████████████████████████████████████▎ | 462/1019 [35:57<51:04, 5.50s/it]
45%|████████████████████████████████████▎ | 463/1019 [36:02<50:40, 5.47s/it]
46%|████████████████████████████████████▍ | 464/1019 [36:08<50:31, 5.46s/it]
46%|████████████████████████████████████▌ | 465/1019 [36:13<50:01, 5.42s/it]
46%|████████████████████████████████████▌ | 466/1019 [36:18<49:55, 5.42s/it]
46%|████████████████████████████████████▋ | 467/1019 [36:24<49:38, 5.40s/it]
{'loss': 4.2403, 'learning_rate': 9.300000000000001e-05, 'epoch': 0.46}
46%|████████████████████████████████████▊ | 469/1019 [36:34<48:54, 5.33s/it]
46%|████████████████████████████████████▉ | 470/1019 [36:40<48:31, 5.30s/it]
{'loss': 4.4285, 'learning_rate': 9.340000000000001e-05, 'epoch': 0.46}
46%|████████████████████████████████████▉ | 471/1019 [36:45<48:07, 5.27s/it]
46%|█████████████████████████████████████ | 472/1019 [36:50<47:30, 5.21s/it]
{'loss': 4.992, 'learning_rate': 9.38e-05, 'epoch': 0.46}
46%|█████████████████████████████████████▏ | 473/1019 [36:55<47:23, 5.21s/it]
47%|█████████████████████████████████████▏ | 474/1019 [37:00<47:12, 5.20s/it]
{'loss': 4.5496, 'learning_rate': 9.42e-05, 'epoch': 0.46}
47%|█████████████████████████████████████▎ | 475/1019 [37:05<46:47, 5.16s/it]
47%|█████████████████████████████████████▎ | 476/1019 [37:10<46:06, 5.09s/it]
{'loss': 4.4337, 'learning_rate': 9.46e-05, 'epoch': 0.47}
47%|█████████████████████████████████████▍ | 477/1019 [37:15<45:47, 5.07s/it]
47%|█████████████████████████████████████▌ | 478/1019 [37:20<45:20, 5.03s/it]
{'loss': 4.593, 'learning_rate': 9.5e-05, 'epoch': 0.47}
47%|█████████████████████████████████████▌ | 479/1019 [37:25<45:07, 5.01s/it]
47%|█████████████████████████████████████▋ | 480/1019 [37:30<44:45, 4.98s/it]
47%|█████████████████████████████████████▊ | 481/1019 [37:35<44:12, 4.93s/it]
{'loss': 4.4579, 'learning_rate': 9.56e-05, 'epoch': 0.47}
47%|█████████████████████████████████████▊ | 482/1019 [37:40<43:34, 4.87s/it]
47%|█████████████████████████████████████▉ | 483/1019 [37:44<43:08, 4.83s/it]
47%|█████████████████████████████████████▉ | 484/1019 [37:49<42:44, 4.79s/it]
{'loss': 4.5826, 'learning_rate': 9.620000000000001e-05, 'epoch': 0.47}
48%|██████████████████████████████████████ | 485/1019 [37:54<41:56, 4.71s/it]
{'loss': 4.0891, 'learning_rate': 9.64e-05, 'epoch': 0.48}
{'loss': 4.6883, 'learning_rate': 9.66e-05, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-03-02 06:34:00,205 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4641, 'learning_rate': 9.680000000000001e-05, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-03-02 06:34:00,205
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:30:49,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▎ | 488/1019 [38:07<39:38, 4.48s/it]g-point operations will not be computed-02 06:30:49,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▎ | 488/1019 [38:07<39:38, 4.48s/it]g-point operations will not be computed-02 06:30:49,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▎ | 488/1019 [38:07<39:38, 4.48s/it]g-point operations will not be computed-02 06:30:49,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▍ | 489/1019 [38:11<38:45, 4.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▍ | 490/1019 [38:15<37:41, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▍ | 490/1019 [38:15<37:41, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2576, 'learning_rate': 9.74e-05, 'epoch': 0.48} 48%|██████████████████████████████████████▌ | 491/1019 [38:19<36:17, 4.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▌ | 
491/1019 [38:19<36:17, 4.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4475, 'learning_rate': 9.76e-05, 'epoch': 0.48} 48%|██████████████████████████████████████▋ | 492/1019 [38:22<34:38, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 492/1019 [38:22<34:38, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:23,226 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:23,226 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:23,226 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▊ | 494/1019 [38:28<30:53, 3.53s/it]g-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▊ | 494/1019 [38:28<30:53, 3.53s/it]g-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:29,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:29,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:29,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:10,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▉ | 496/1019 [38:34<26:50, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▉ | 496/1019 [38:34<26:50, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████ | 497/1019 [38:36<24:47, 2.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████ | 497/1019 [38:36<24:47, 2.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:35,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:35,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:37,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 06:34:37,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|trainer.py:2369] 2022-03-02 06:34:39,691 >> Batch size = 14luation *****e number of tokens of the input, floating-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|trainer.py:2369] 2022-03-02 06:34:39,691 >> Batch size = 14luation *****e number of tokens of the input, floating-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%| | 0/189 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 2/189 [00:02<04:02, 1.30s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 3/189 [00:06<06:50, 2.21s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 2%|█▊ | 4/189 [00:09<07:41, 2.50s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 5/189 [00:12<08:53, 2.90s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 6/189 [00:16<10:06, 3.32s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███ | 7/189 [00:20<09:59, 3.30s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▌ | 8/189 [00:23<09:53, 3.28s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 9/189 [00:28<11:11, 3.73s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 10/189 [00:32<11:29, 3.85s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 11/189 [00:35<10:56, 3.69s/it]g-point operations will not be computed-02 06:34:32,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed RuntimeError: CUDA out of memory. Tried to allocate 1.69 GiB (GPU 0; 15.78 GiB total capacity; 9.19 GiB already allocated; 1.65 GiB free; 12.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. 
See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
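
The figures in the out-of-memory message above can be read off programmatically to check whether the allocator's hint about fragmentation applies. The following is a minimal sketch, not part of the original run: the helper name `parse_oom` and the regular expressions are assumptions made for illustration, and only the message text is taken from the log.

```python
import re

# OOM message copied verbatim from the log above.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1.69 GiB "
    "(GPU 0; 15.78 GiB total capacity; 9.19 GiB already allocated; "
    "1.65 GiB free; 12.44 GiB reserved in total by PyTorch)"
)


def parse_oom(message):
    """Extract the GiB figures from a PyTorch CUDA OOM message (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        name: float(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


stats = parse_oom(OOM_MESSAGE)

# Memory PyTorch has reserved from the driver but is not currently using for tensors.
cached_unused = round(stats["reserved"] - stats["allocated"], 2)

print(f"requested:     {stats['requested']} GiB")
print(f"cached unused: {cached_unused} GiB")
```

Here the reserved-but-unallocated pool is larger than the failed request, which is consistent with the fragmentation case the message describes. The message's own suggestion is to set `max_split_size_mb` through the `PYTORCH_CUDA_ALLOC_CONF` environment variable before the process initialises CUDA; a suitable value is workload-dependent and is not given in the log. Since the failure occurs during evaluation with a batch size of 14, a smaller evaluation batch size may also be worth trying.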