The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
/opt/conda/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/opt/conda/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-10-25 01:18:03.423955: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-25 01:18:03.424084: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-25 01:18:03.562275: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
/opt/conda/lib/python3.10/site-packages/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Generating train split: 92867 examples [00:05, 17814.29 examples/s]
Generating validation split: 1722 examples [00:00, 17667.23 examples/s]
Running tokenizer on train dataset: 0%| | 0/92867 [00:00>
Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead.
This warning will be raised to an exception in v4.41.
Non-default generation parameters: {'max_length': 200, 'early_stopping': True, 'num_beams': 5, 'forced_eos_token_id': 2}
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
6%|▋ | 3870/61904 [2:01:52<9730:10:48, 603.59s/it]
6%|▋ | 3880/61904 [2:02:07<297:12:41, 18.44s/it] {'loss': 3.0219, 'learning_rate': 1.9403604304421105e-07, 'epoch': 1.0}
6%|▋ | 3900/61904 [2:02:35<21:50:24, 1.36s/it] {'loss': 3.0378, 'learning_rate': 1.9400363023466873e-07, 'epoch': 1.01}
6%|▋ | 3920/61904 [2:03:02<22:05:47, 1.37s/it] {'loss': 3.058, 'learning_rate': 1.939712174251264e-07, 'epoch': 1.01}
6%|▋ | 3940/61904 [2:03:31<22:58:28, 1.43s/it] {'loss': 3.0479, 'learning_rate': 1.9393880461558406e-07, 'epoch': 1.02}
6%|▋ | 3960/61904 [2:03:59<22:28:38, 1.40s/it] {'loss': 3.0733, 'learning_rate': 1.9390639180604175e-07, 'epoch': 1.02}
6%|▋ | 3980/61904 [2:04:27<21:42:28, 1.35s/it] {'loss': 3.0543, 'learning_rate': 1.938739789964994e-07, 'epoch': 1.03}
6%|▋ | 4000/61904 [2:04:54<20:55:21, 1.30s/it] {'loss': 2.9607, 'learning_rate': 1.9384156618695707e-07, 'epoch': 1.03}
6%|▋ | 4020/61904 [2:05:21<21:47:18, 1.36s/it] {'loss': 2.9848, 'learning_rate': 1.9380915337741476e-07, 'epoch': 1.04}
7%|▋ | 4040/61904 [2:05:48<21:41:40, 1.35s/it] {'loss': 3.0765, 'learning_rate': 1.937767405678724e-07, 'epoch': 1.04}
7%|▋ | 4060/61904 [2:06:16<22:41:30, 1.41s/it] {'loss': 2.9827, 'learning_rate': 1.9374432775833008e-07, 'epoch': 1.05}
7%|▋ | 4080/61904 [2:06:43<21:51:36, 1.36s/it] {'loss': 3.0336, 'learning_rate': 1.9371191494878777e-07, 'epoch': 1.05}
7%|▋ | 4100/61904 [2:07:10<21:55:34, 1.37s/it] {'loss': 3.0382, 'learning_rate': 1.936795021392454e-07, 'epoch': 1.06}
7%|▋ | 4120/61904 [2:07:37<21:01:30, 1.31s/it] {'loss': 2.9706, 'learning_rate': 1.936470893297031e-07, 'epoch': 1.06}
7%|▋ | 4140/61904 [2:08:05<21:53:59, 1.36s/it] {'loss': 2.9261, 'learning_rate': 1.9361467652016076e-07, 'epoch': 1.07}
7%|▋ | 4160/61904 [2:08:32<21:46:07, 1.36s/it] {'loss': 3.0358, 'learning_rate': 1.9358226371061842e-07, 'epoch': 1.08}
7%|▋ | 4180/61904 [2:08:59<20:23:55, 1.27s/it] {'loss': 3.0442, 'learning_rate': 1.935498509010761e-07, 'epoch': 1.08}
7%|▋ | 4200/61904 [2:09:27<22:17:34, 1.39s/it] {'loss': 2.9493, 'learning_rate': 1.9351743809153377e-07, 'epoch': 1.09}
7%|▋ | 4220/61904 [2:09:54<22:17:52, 1.39s/it] {'loss': 2.9963, 'learning_rate': 1.9348502528199143e-07, 'epoch': 1.09}
7%|▋ | 4240/61904 [2:10:23<21:54:32, 1.37s/it] {'loss': 3.0102, 'learning_rate': 1.9345261247244912e-07, 'epoch': 1.1}
7%|▋ | 4260/61904 [2:10:50<21:14:50, 1.33s/it] {'loss': 2.9861, 'learning_rate': 1.9342019966290675e-07, 'epoch': 1.1}
7%|▋ | 4280/61904 [2:11:17<22:32:06, 1.41s/it] {'loss': 3.0209, 'learning_rate': 1.9338778685336444e-07, 'epoch': 1.11}
7%|▋ | 4300/61904 [2:11:45<21:42:07, 1.36s/it] {'loss': 2.9864, 'learning_rate': 1.933553740438221e-07, 'epoch': 1.11}
7%|▋ | 4320/61904 [2:12:12<21:35:32, 1.35s/it] {'loss': 2.9766, 'learning_rate': 1.9332296123427977e-07, 'epoch': 1.12}
7%|▋ | 4340/61904 [2:12:39<21:56:27, 1.37s/it] {'loss': 3.0403, 'learning_rate': 1.9329054842473745e-07, 'epoch': 1.12}
7%|▋ | 4360/61904 [2:13:06<21:19:11, 1.33s/it] {'loss': 2.9861, 'learning_rate': 1.9325813561519512e-07, 'epoch': 1.13}
7%|▋ | 4380/61904 [2:13:33<21:25:00, 1.34s/it] {'loss': 3.0244, 'learning_rate': 1.9322572280565278e-07, 'epoch': 1.13}
7%|▋ | 4400/61904 [2:14:01<22:22:07, 1.40s/it] {'loss': 2.9827, 'learning_rate': 1.9319330999611047e-07, 'epoch': 1.14}
7%|▋ | 4420/61904 [2:14:28<21:45:12, 1.36s/it] {'loss': 3.0343, 'learning_rate': 1.931608971865681e-07, 'epoch': 1.14}
7%|▋ | 4440/61904 [2:14:56<21:38:46, 1.36s/it] {'loss': 2.9305, 'learning_rate': 1.931284843770258e-07, 'epoch': 1.15}
7%|▋ | 4460/61904 [2:15:23<21:58:13, 1.38s/it] {'loss': 2.9797, 'learning_rate': 1.9309607156748348e-07, 'epoch': 1.15}
7%|▋ | 4462/61904 [2:15:26<21:01:15, 1.32s/it]
7%|▋ | 
4463/61904 [2:15:27<21:09:26, 1.33s/it] 7%|▋ | 4464/61904 [2:15:28<21:16:45, 1.33s/it] 7%|▋ | 4465/61904 [2:15:30<21:42:16, 1.36s/it] 7%|▋ | 4466/61904 [2:15:31<21:30:41, 1.35s/it] 7%|▋ | 4467/61904 [2:15:33<22:13:21, 1.39s/it] 7%|▋ | 4468/61904 [2:15:34<22:05:18, 1.38s/it] 7%|▋ | 4469/61904 [2:15:35<21:48:59, 1.37s/it] 7%|▋ | 4470/61904 [2:15:37<21:56:48, 1.38s/it] 7%|▋ | 4471/61904 [2:15:38<22:04:14, 1.38s/it] 7%|▋ | 4472/61904 [2:15:40<22:33:57, 1.41s/it] 7%|▋ | 4473/61904 [2:15:41<22:56:28, 1.44s/it] 7%|▋ | 4474/61904 [2:15:42<22:55:59, 1.44s/it] 7%|▋ | 4475/61904 [2:15:44<22:48:25, 1.43s/it] 7%|▋ | 4476/61904 [2:15:45<22:19:31, 1.40s/it] 7%|▋ | 4477/61904 [2:15:47<22:01:17, 1.38s/it] 7%|▋ | 4478/61904 [2:15:48<21:57:46, 1.38s/it] 7%|▋ | 4479/61904 [2:15:49<21:24:26, 1.34s/it] 7%|▋ | 4480/61904 [2:15:51<21:41:19, 1.36s/it] {'loss': 2.9846, 'learning_rate': 1.930636587579411e-07, 'epoch': 1.16} 7%|▋ | 4480/61904 [2:15:51<21:41:19, 1.36s/it] 7%|▋ | 4481/61904 [2:15:52<22:26:07, 1.41s/it] 7%|▋ | 4482/61904 [2:15:54<23:08:01, 1.45s/it] 7%|▋ | 4483/61904 [2:15:55<22:39:12, 1.42s/it] 7%|▋ | 4484/61904 [2:15:56<22:25:16, 1.41s/it] 7%|▋ | 4485/61904 [2:15:58<22:15:46, 1.40s/it] 7%|▋ | 4486/61904 [2:15:59<22:23:21, 1.40s/it] 7%|▋ | 4487/61904 [2:16:00<22:11:37, 1.39s/it] 7%|▋ | 4488/61904 [2:16:02<22:11:46, 1.39s/it] 7%|▋ | 4489/61904 [2:16:03<21:47:55, 1.37s/it] 7%|▋ | 4490/61904 [2:16:04<20:59:23, 1.32s/it] 7%|▋ | 4491/61904 [2:16:06<21:03:04, 1.32s/it] 7%|▋ | 4492/61904 [2:16:07<21:09:09, 1.33s/it] 7%|▋ | 4493/61904 [2:16:08<21:17:57, 1.34s/it] 7%|▋ | 4494/61904 [2:16:10<20:58:37, 1.32s/it] 7%|▋ | 4495/61904 [2:16:11<20:42:59, 1.30s/it] 7%|▋ | 4496/61904 [2:16:12<21:02:42, 1.32s/it] 7%|▋ | 4497/61904 [2:16:14<21:13:38, 1.33s/it] 7%|▋ | 4498/61904 [2:16:15<22:18:21, 1.40s/it] 7%|▋ | 4499/61904 [2:16:17<21:53:25, 1.37s/it] 7%|▋ | 4500/61904 [2:16:18<22:28:44, 1.41s/it] {'loss': 3.0176, 'learning_rate': 1.930312459483988e-07, 'epoch': 1.16} 7%|▋ | 4500/61904 
[2:16:18<22:28:44, 1.41s/it] 7%|▋ | 4501/61904 [2:16:19<22:34:53, 1.42s/it] 7%|▋ | 4502/61904 [2:16:21<22:20:21, 1.40s/it] 7%|▋ | 4503/61904 [2:16:22<22:37:50, 1.42s/it] 7%|▋ | 4504/61904 [2:16:24<22:25:50, 1.41s/it] 7%|▋ | 4505/61904 [2:16:25<22:01:01, 1.38s/it] 7%|▋ | 4506/61904 [2:16:27<22:47:46, 1.43s/it] 7%|▋ | 4507/61904 [2:16:28<22:30:14, 1.41s/it] 7%|▋ | 4508/61904 [2:16:29<21:58:34, 1.38s/it] 7%|▋ | 4509/61904 [2:16:31<21:47:02, 1.37s/it] 7%|▋ | 4510/61904 [2:16:32<21:47:14, 1.37s/it] 7%|▋ | 4511/61904 [2:16:33<21:49:55, 1.37s/it] 7%|▋ | 4512/61904 [2:16:35<22:07:38, 1.39s/it] 7%|▋ | 4513/61904 [2:16:36<21:41:32, 1.36s/it] 7%|▋ | 4514/61904 [2:16:37<21:15:17, 1.33s/it] 7%|▋ | 4515/61904 [2:16:39<21:29:30, 1.35s/it] 7%|▋ | 4516/61904 [2:16:40<21:35:31, 1.35s/it] 7%|▋ | 4517/61904 [2:16:41<21:31:14, 1.35s/it] 7%|▋ | 4518/61904 [2:16:43<21:43:46, 1.36s/it] 7%|▋ | 4519/61904 [2:16:44<21:39:19, 1.36s/it] 7%|▋ | 4520/61904 [2:16:45<21:33:02, 1.35s/it] {'loss': 2.9897, 'learning_rate': 1.9299883313885646e-07, 'epoch': 1.17} 7%|▋ | 4520/61904 [2:16:45<21:33:02, 1.35s/it] 7%|▋ | 4521/61904 [2:16:47<21:25:23, 1.34s/it] 7%|▋ | 4522/61904 [2:16:48<21:18:08, 1.34s/it] 7%|▋ | 4523/61904 [2:16:50<22:12:54, 1.39s/it] 7%|▋ | 4524/61904 [2:16:51<21:44:37, 1.36s/it] 7%|▋ | 4525/61904 [2:16:52<21:44:16, 1.36s/it] 7%|▋ | 4526/61904 [2:16:54<21:20:38, 1.34s/it] 7%|▋ | 4527/61904 [2:16:55<21:53:23, 1.37s/it] 7%|▋ | 4528/61904 [2:16:56<22:17:59, 1.40s/it] 7%|▋ | 4529/61904 [2:16:58<22:15:33, 1.40s/it] 7%|▋ | 4530/61904 [2:16:59<21:51:47, 1.37s/it] 7%|▋ | 4531/61904 [2:17:01<22:03:52, 1.38s/it] 7%|▋ | 4532/61904 [2:17:02<21:27:44, 1.35s/it] 7%|▋ | 4533/61904 [2:17:03<22:03:23, 1.38s/it] 7%|▋ | 4534/61904 [2:17:05<21:57:36, 1.38s/it] 7%|▋ | 4535/61904 [2:17:06<21:37:36, 1.36s/it] 7%|▋ | 4536/61904 [2:17:07<21:35:45, 1.36s/it] 7%|▋ | 4537/61904 [2:17:09<21:35:54, 1.36s/it] 7%|▋ | 4538/61904 [2:17:10<21:25:01, 1.34s/it] 7%|▋ | 4539/61904 [2:17:12<23:06:47, 1.45s/it] 7%|▋ | 4540/61904 
[2:17:13<23:17:04, 1.46s/it] {'loss': 3.0404, 'learning_rate': 1.9296642032931413e-07, 'epoch': 1.17} 7%|▋ | 4540/61904 [2:17:13<23:17:04, 1.46s/it] 7%|▋ | 4541/61904 [2:17:15<23:02:02, 1.45s/it] 7%|▋ | 4542/61904 [2:17:16<22:07:39, 1.39s/it] 7%|▋ | 4543/61904 [2:17:17<23:00:07, 1.44s/it] 7%|▋ | 4544/61904 [2:17:19<22:06:45, 1.39s/it] 7%|▋ | 4545/61904 [2:17:20<22:53:21, 1.44s/it] 7%|▋ | 4546/61904 [2:17:22<22:59:12, 1.44s/it] 7%|▋ | 4547/61904 [2:17:23<22:42:44, 1.43s/it] 7%|▋ | 4548/61904 [2:17:24<21:55:47, 1.38s/it] 7%|▋ | 4549/61904 [2:17:26<22:00:26, 1.38s/it] 7%|▋ | 4550/61904 [2:17:27<22:23:50, 1.41s/it] 7%|▋ | 4551/61904 [2:17:29<22:06:58, 1.39s/it] 7%|▋ | 4552/61904 [2:17:30<22:00:03, 1.38s/it] 7%|▋ | 4553/61904 [2:17:31<21:24:46, 1.34s/it] 7%|▋ | 4554/61904 [2:17:33<21:38:10, 1.36s/it] 7%|▋ | 4555/61904 [2:17:34<21:17:13, 1.34s/it] 7%|▋ | 4556/61904 [2:17:35<21:13:20, 1.33s/it] 7%|▋ | 4557/61904 [2:17:37<21:26:11, 1.35s/it] 7%|▋ | 4558/61904 [2:17:38<20:49:07, 1.31s/it] 7%|▋ | 4559/61904 [2:17:39<21:36:35, 1.36s/it] 7%|▋ | 4560/61904 [2:17:41<22:27:16, 1.41s/it] {'loss': 2.9724, 'learning_rate': 1.9293400751977181e-07, 'epoch': 1.18} 7%|▋ | 4560/61904 [2:17:41<22:27:16, 1.41s/it] 7%|▋ | 4561/61904 [2:17:42<21:51:20, 1.37s/it] 7%|▋ | 4562/61904 [2:17:44<22:16:32, 1.40s/it] 7%|▋ | 4563/61904 [2:17:45<22:49:36, 1.43s/it] 7%|▋ | 4564/61904 [2:17:46<22:33:27, 1.42s/it] 7%|▋ | 4565/61904 [2:17:48<21:45:44, 1.37s/it] 7%|▋ | 4566/61904 [2:17:49<22:11:50, 1.39s/it] 7%|▋ | 4567/61904 [2:17:51<22:09:44, 1.39s/it] 7%|▋ | 4568/61904 [2:17:52<21:44:47, 1.37s/it] 7%|▋ | 4569/61904 [2:17:53<21:40:02, 1.36s/it] 7%|▋ | 4570/61904 [2:17:54<21:28:46, 1.35s/it] 7%|▋ | 4571/61904 [2:17:56<21:46:35, 1.37s/it] 7%|▋ | 4572/61904 [2:17:57<22:01:35, 1.38s/it] 7%|▋ | 4573/61904 [2:17:59<22:05:00, 1.39s/it] 7%|▋ | 4574/61904 [2:18:00<21:24:23, 1.34s/it] 7%|▋ | 4575/61904 [2:18:01<21:10:12, 1.33s/it] 7%|▋ | 4576/61904 [2:18:03<22:15:51, 1.40s/it] 7%|▋ | 4577/61904 [2:18:04<22:36:08, 
1.42s/it] 7%|▋ | 4578/61904 [2:18:06<21:59:05, 1.38s/it] 7%|▋ | 4579/61904 [2:18:07<22:03:43, 1.39s/it] 7%|▋ | 4580/61904 [2:18:08<21:55:41, 1.38s/it] {'loss': 3.019, 'learning_rate': 1.9290159471022948e-07, 'epoch': 1.18} 7%|▋ | 4580/61904 [2:18:08<21:55:41, 1.38s/it] 7%|▋ | 4581/61904 [2:18:10<21:24:16, 1.34s/it] 7%|▋ | 4582/61904 [2:18:11<21:36:31, 1.36s/it] 7%|▋ | 4583/61904 [2:18:12<21:48:19, 1.37s/it] 7%|▋ | 4584/61904 [2:18:14<21:43:50, 1.36s/it] 7%|▋ | 4585/61904 [2:18:15<21:35:41, 1.36s/it] 7%|▋ | 4586/61904 [2:18:16<21:37:25, 1.36s/it] 7%|▋ | 4587/61904 [2:18:18<21:18:30, 1.34s/it] 7%|▋ | 4588/61904 [2:18:19<20:56:15, 1.32s/it] 7%|▋ | 4589/61904 [2:18:20<21:01:17, 1.32s/it] 7%|▋ | 4590/61904 [2:18:22<20:38:33, 1.30s/it] 7%|▋ | 4591/61904 [2:18:23<21:12:26, 1.33s/it] 7%|▋ | 4592/61904 [2:18:24<21:50:06, 1.37s/it] 7%|▋ | 4593/61904 [2:18:26<21:58:07, 1.38s/it] 7%|▋ | 4594/61904 [2:18:27<21:27:49, 1.35s/it] 7%|▋ | 4595/61904 [2:18:28<21:30:36, 1.35s/it] 7%|▋ | 4596/61904 [2:18:30<21:49:14, 1.37s/it] 7%|▋ | 4597/61904 [2:18:31<22:25:39, 1.41s/it] 7%|▋ | 4598/61904 [2:18:33<21:46:08, 1.37s/it] 7%|▋ | 4599/61904 [2:18:34<22:03:22, 1.39s/it] 7%|▋ | 4600/61904 [2:18:35<21:56:58, 1.38s/it] {'loss': 2.9637, 'learning_rate': 1.9286918190068714e-07, 'epoch': 1.19} 7%|▋ | 4600/61904 [2:18:35<21:56:58, 1.38s/it] 7%|▋ | 4601/61904 [2:18:37<22:21:37, 1.40s/it] 7%|▋ | 4602/61904 [2:18:38<22:15:30, 1.40s/it] 7%|▋ | 4603/61904 [2:18:40<21:52:38, 1.37s/it] 7%|▋ | 4604/61904 [2:18:41<21:44:57, 1.37s/it] 7%|▋ | 4605/61904 [2:18:42<21:12:49, 1.33s/it] 7%|▋ | 4606/61904 [2:18:43<20:39:02, 1.30s/it] 7%|▋ | 4607/61904 [2:18:45<20:59:22, 1.32s/it] 7%|▋ | 4608/61904 [2:18:46<21:50:36, 1.37s/it] 7%|▋ | 4609/61904 [2:18:48<21:58:31, 1.38s/it] 7%|▋ | 4610/61904 [2:18:49<22:05:17, 1.39s/it] 7%|▋ | 4611/61904 [2:18:50<21:51:56, 1.37s/it] 7%|▋ | 4612/61904 [2:18:52<22:19:43, 1.40s/it] 7%|▋ | 4613/61904 [2:18:53<21:35:44, 1.36s/it] 7%|▋ | 4614/61904 [2:18:54<21:21:37, 1.34s/it] 7%|▋ | 
4615/61904 [2:18:56<21:09:42, 1.33s/it] 7%|▋ | 4616/61904 [2:18:57<21:03:11, 1.32s/it] 7%|▋ | 4617/61904 [2:18:59<21:52:39, 1.37s/it] 7%|▋ | 4618/61904 [2:19:00<22:52:14, 1.44s/it] 7%|▋ | 4619/61904 [2:19:01<22:09:36, 1.39s/it] 7%|▋ | 4620/61904 [2:19:03<21:46:16, 1.37s/it] {'loss': 3.0281, 'learning_rate': 1.9283676909114483e-07, 'epoch': 1.19} 7%|▋ | 4620/61904 [2:19:03<21:46:16, 1.37s/it] 7%|▋ | 4621/61904 [2:19:04<21:54:22, 1.38s/it] 7%|▋ | 4622/61904 [2:19:05<21:21:44, 1.34s/it] 7%|▋ | 4623/61904 [2:19:07<21:29:59, 1.35s/it] 7%|▋ | 4624/61904 [2:19:08<20:45:18, 1.30s/it] 7%|▋ | 4625/61904 [2:19:09<20:46:42, 1.31s/it] 7%|▋ | 4626/61904 [2:19:11<21:12:32, 1.33s/it] 7%|▋ | 4627/61904 [2:19:12<22:15:30, 1.40s/it] 7%|▋ | 4628/61904 [2:19:14<22:13:58, 1.40s/it] 7%|▋ | 4629/61904 [2:19:15<21:32:19, 1.35s/it] 7%|▋ | 4630/61904 [2:19:16<21:33:47, 1.36s/it] 7%|▋ | 4631/61904 [2:19:18<21:10:21, 1.33s/it] 7%|▋ | 4632/61904 [2:19:19<21:26:26, 1.35s/it] 7%|▋ | 4633/61904 [2:19:20<21:51:33, 1.37s/it] 7%|▋ | 4634/61904 [2:19:22<22:21:41, 1.41s/it] 7%|▋ | 4635/61904 [2:19:23<22:23:01, 1.41s/it] 7%|▋ | 4636/61904 [2:19:25<22:01:17, 1.38s/it] 7%|▋ | 4637/61904 [2:19:26<22:21:48, 1.41s/it] 7%|▋ | 4638/61904 [2:19:27<22:14:47, 1.40s/it] 7%|▋ | 4639/61904 [2:19:29<21:37:20, 1.36s/it] 7%|▋ | 4640/61904 [2:19:30<21:17:27, 1.34s/it] {'loss': 2.9964, 'learning_rate': 1.9280435628160246e-07, 'epoch': 1.2} 7%|▋ | 4640/61904 [2:19:30<21:17:27, 1.34s/it] 7%|▋ | 4641/61904 [2:19:31<22:11:01, 1.39s/it] 7%|▋ | 4642/61904 [2:19:33<22:28:01, 1.41s/it] 8%|▊ | 4643/61904 [2:19:34<22:14:21, 1.40s/it] 8%|▊ | 4644/61904 [2:19:36<21:35:39, 1.36s/it] 8%|▊ | 4645/61904 [2:19:37<21:09:47, 1.33s/it] 8%|▊ | 4646/61904 [2:19:38<21:11:20, 1.33s/it] 8%|▊ | 4647/61904 [2:19:39<20:55:46, 1.32s/it] 8%|▊ | 4648/61904 [2:19:41<21:39:46, 1.36s/it] 8%|▊ | 4649/61904 [2:19:42<21:39:01, 1.36s/it] 8%|▊ | 4650/61904 [2:19:44<21:27:03, 1.35s/it] 8%|▊ | 4651/61904 [2:19:45<21:43:27, 1.37s/it] 8%|▊ | 4652/61904 
[2:19:46<21:29:08, 1.35s/it] 8%|▊ | 4653/61904 [2:19:48<21:29:29, 1.35s/it] 8%|▊ | 4654/61904 [2:19:49<22:12:54, 1.40s/it] 8%|▊ | 4655/61904 [2:19:51<22:06:05, 1.39s/it] 8%|▊ | 4656/61904 [2:19:52<21:16:25, 1.34s/it] 8%|▊ | 4657/61904 [2:19:53<21:41:26, 1.36s/it] 8%|▊ | 4658/61904 [2:19:54<20:53:01, 1.31s/it] 8%|▊ | 4659/61904 [2:19:56<20:26:35, 1.29s/it] 8%|▊ | 4660/61904 [2:19:57<21:16:15, 1.34s/it] {'loss': 3.0762, 'learning_rate': 1.9277194347206015e-07, 'epoch': 1.2} 8%|▊ | 4660/61904 [2:19:57<21:16:15, 1.34s/it] 8%|▊ | 4661/61904 [2:19:58<21:42:33, 1.37s/it] 8%|▊ | 4662/61904 [2:20:00<22:04:44, 1.39s/it] 8%|▊ | 4663/61904 [2:20:01<21:27:42, 1.35s/it] 8%|▊ | 4664/61904 [2:20:03<21:15:59, 1.34s/it] 8%|▊ | 4665/61904 [2:20:04<21:05:09, 1.33s/it] 8%|▊ | 4666/61904 [2:20:05<21:40:34, 1.36s/it] 8%|▊ | 4667/61904 [2:20:07<21:47:04, 1.37s/it] 8%|▊ | 4668/61904 [2:20:08<21:45:05, 1.37s/it] 8%|▊ | 4669/61904 [2:20:09<21:15:51, 1.34s/it] 8%|▊ | 4670/61904 [2:20:11<21:12:07, 1.33s/it] 8%|▊ | 4671/61904 [2:20:12<21:27:17, 1.35s/it] 8%|▊ | 4672/61904 [2:20:13<21:28:36, 1.35s/it] 8%|▊ | 4673/61904 [2:20:15<21:44:44, 1.37s/it] 8%|▊ | 4674/61904 [2:20:16<21:54:31, 1.38s/it] 8%|▊ | 4675/61904 [2:20:18<22:29:30, 1.41s/it] 8%|▊ | 4676/61904 [2:20:19<21:56:26, 1.38s/it] 8%|▊ | 4677/61904 [2:20:20<21:53:22, 1.38s/it] 8%|▊ | 4678/61904 [2:20:22<21:27:00, 1.35s/it] 8%|▊ | 4679/61904 [2:20:23<21:42:12, 1.37s/it] 8%|▊ | 4680/61904 [2:20:24<21:42:22, 1.37s/it] {'loss': 3.035, 'learning_rate': 1.9273953066251784e-07, 'epoch': 1.21} 8%|▊ | 4680/61904 [2:20:24<21:42:22, 1.37s/it] 8%|▊ | 4681/61904 [2:20:26<21:23:17, 1.35s/it] 8%|▊ | 4682/61904 [2:20:27<21:15:06, 1.34s/it] 8%|▊ | 4683/61904 [2:20:28<21:47:51, 1.37s/it] 8%|▊ | 4684/61904 [2:20:30<21:35:41, 1.36s/it] 8%|▊ | 4685/61904 [2:20:31<21:05:02, 1.33s/it] 8%|▊ | 4686/61904 [2:20:32<20:54:27, 1.32s/it] 8%|▊ | 4687/61904 [2:20:34<21:35:58, 1.36s/it] 8%|▊ | 4688/61904 [2:20:35<21:44:32, 1.37s/it] 8%|▊ | 4689/61904 [2:20:36<21:29:11, 
1.35s/it] 8%|▊ | 4690/61904 [2:20:38<21:25:36, 1.35s/it] 8%|▊ | 4691/61904 [2:20:39<21:11:22, 1.33s/it] 8%|▊ | 4692/61904 [2:20:41<22:11:02, 1.40s/it] 8%|▊ | 4693/61904 [2:20:42<21:39:49, 1.36s/it] 8%|▊ | 4694/61904 [2:20:43<21:29:14, 1.35s/it] 8%|▊ | 4695/61904 [2:20:45<21:19:35, 1.34s/it] 8%|▊ | 4696/61904 [2:20:46<21:24:04, 1.35s/it] 8%|▊ | 4697/61904 [2:20:47<21:14:19, 1.34s/it] 8%|▊ | 4698/61904 [2:20:49<21:46:15, 1.37s/it] 8%|▊ | 4699/61904 [2:20:50<20:50:12, 1.31s/it] 8%|▊ | 4700/61904 [2:20:51<21:44:16, 1.37s/it] {'loss': 2.9578, 'learning_rate': 1.9270711785297547e-07, 'epoch': 1.21} 8%|▊ | 4700/61904 [2:20:51<21:44:16, 1.37s/it] 8%|▊ | 4701/61904 [2:20:53<22:23:48, 1.41s/it] 8%|▊ | 4702/61904 [2:20:54<22:18:15, 1.40s/it] 8%|▊ | 4703/61904 [2:20:56<22:05:46, 1.39s/it] 8%|▊ | 4704/61904 [2:20:57<21:25:45, 1.35s/it] 8%|▊ | 4705/61904 [2:20:58<21:30:53, 1.35s/it] 8%|▊ | 4706/61904 [2:21:00<21:42:50, 1.37s/it] 8%|▊ | 4707/61904 [2:21:01<22:37:51, 1.42s/it] 8%|▊ | 4708/61904 [2:21:03<22:40:35, 1.43s/it] 8%|▊ | 4709/61904 [2:21:04<21:54:08, 1.38s/it] 8%|▊ | 4710/61904 [2:21:05<21:37:11, 1.36s/it] 8%|▊ | 4711/61904 [2:21:07<21:24:57, 1.35s/it] 8%|▊ | 4712/61904 [2:21:08<21:25:45, 1.35s/it] 8%|▊ | 4713/61904 [2:21:09<21:49:33, 1.37s/it] 8%|▊ | 4714/61904 [2:21:11<21:04:39, 1.33s/it] 8%|▊ | 4715/61904 [2:21:12<20:51:39, 1.31s/it] 8%|▊ | 4716/61904 [2:21:13<20:43:54, 1.31s/it] 8%|▊ | 4717/61904 [2:21:14<20:22:51, 1.28s/it] 8%|▊ | 4718/61904 [2:21:16<21:37:13, 1.36s/it] 8%|▊ | 4719/61904 [2:21:17<22:01:22, 1.39s/it] 8%|▊ | 4720/61904 [2:21:19<21:37:49, 1.36s/it] {'loss': 2.9568, 'learning_rate': 1.9267470504343316e-07, 'epoch': 1.22} 8%|▊ | 4720/61904 [2:21:19<21:37:49, 1.36s/it] 8%|▊ | 4721/61904 [2:21:20<22:01:31, 1.39s/it] 8%|▊ | 4722/61904 [2:21:21<22:05:30, 1.39s/it] 8%|▊ | 4723/61904 [2:21:23<21:53:40, 1.38s/it] 8%|▊ | 4724/61904 [2:21:24<21:50:26, 1.38s/it] 8%|▊ | 4725/61904 [2:21:26<22:07:15, 1.39s/it] 8%|▊ | 4726/61904 [2:21:27<21:28:45, 1.35s/it] 8%|▊ | 
4727/61904 [2:21:28<21:14:30, 1.34s/it] 8%|▊ | 4728/61904 [2:21:30<21:33:21, 1.36s/it] 8%|▊ | 4729/61904 [2:21:31<21:28:31, 1.35s/it] 8%|▊ | 4730/61904 [2:21:32<21:36:15, 1.36s/it] 8%|▊ | 4731/61904 [2:21:34<21:12:02, 1.33s/it] 8%|▊ | 4732/61904 [2:21:35<21:01:41, 1.32s/it] 8%|▊ | 4733/61904 [2:21:36<22:11:27, 1.40s/it] 8%|▊ | 4734/61904 [2:21:38<22:00:21, 1.39s/it] 8%|▊ | 4735/61904 [2:21:39<21:58:17, 1.38s/it] 8%|▊ | 4736/61904 [2:21:41<22:27:02, 1.41s/it] 8%|▊ | 4737/61904 [2:21:42<22:07:53, 1.39s/it] 8%|▊ | 4738/61904 [2:21:43<21:54:42, 1.38s/it] 8%|▊ | 4739/61904 [2:21:45<22:06:37, 1.39s/it] 8%|▊ | 4740/61904 [2:21:46<22:01:29, 1.39s/it] {'loss': 3.0495, 'learning_rate': 1.9264229223389082e-07, 'epoch': 1.22} 8%|▊ | 4740/61904 [2:21:46<22:01:29, 1.39s/it] 8%|▊ | 4741/61904 [2:21:47<21:17:59, 1.34s/it] 8%|▊ | 4742/61904 [2:21:49<21:37:46, 1.36s/it] 8%|▊ | 4743/61904 [2:21:50<21:18:51, 1.34s/it] 8%|▊ | 4744/61904 [2:21:52<21:37:13, 1.36s/it] 8%|▊ | 4745/61904 [2:21:53<21:51:15, 1.38s/it] 8%|▊ | 4746/61904 [2:21:54<22:11:41, 1.40s/it] 8%|▊ | 4747/61904 [2:21:56<22:29:16, 1.42s/it] 8%|▊ | 4748/61904 [2:21:57<22:32:02, 1.42s/it] 8%|▊ | 4749/61904 [2:21:59<21:47:03, 1.37s/it] 8%|▊ | 4750/61904 [2:22:00<21:00:05, 1.32s/it] 8%|▊ | 4751/61904 [2:22:01<20:59:59, 1.32s/it] 8%|▊ | 4752/61904 [2:22:02<21:14:49, 1.34s/it] 8%|▊ | 4753/61904 [2:22:04<21:30:12, 1.35s/it] 8%|▊ | 4754/61904 [2:22:05<22:06:06, 1.39s/it] 8%|▊ | 4755/61904 [2:22:07<21:18:02, 1.34s/it] 8%|▊ | 4756/61904 [2:22:08<21:09:40, 1.33s/it] 8%|▊ | 4757/61904 [2:22:09<21:31:56, 1.36s/it] 8%|▊ | 4758/61904 [2:22:11<22:02:04, 1.39s/it] 8%|▊ | 4759/61904 [2:22:12<22:07:00, 1.39s/it] 8%|▊ | 4760/61904 [2:22:13<21:35:31, 1.36s/it] {'loss': 3.0054, 'learning_rate': 1.9260987942434848e-07, 'epoch': 1.23} 8%|▊ | 4760/61904 [2:22:13<21:35:31, 1.36s/it] 8%|▊ | 4761/61904 [2:22:15<21:31:53, 1.36s/it] 8%|▊ | 4762/61904 [2:22:16<21:29:31, 1.35s/it] 8%|▊ | 4763/61904 [2:22:17<21:43:27, 1.37s/it] 8%|▊ | 4764/61904 
[2:22:19<22:45:37, 1.43s/it] 8%|▊ | 4765/61904 [2:22:20<22:20:41, 1.41s/it] 8%|▊ | 4766/61904 [2:22:22<21:57:45, 1.38s/it] 8%|▊ | 4767/61904 [2:22:23<21:56:58, 1.38s/it] 8%|▊ | 4768/61904 [2:22:25<21:59:53, 1.39s/it] 8%|▊ | 4769/61904 [2:22:26<22:14:39, 1.40s/it] 8%|▊ | 4770/61904 [2:22:27<22:29:35, 1.42s/it] 8%|▊ | 4771/61904 [2:22:29<21:49:51, 1.38s/it] 8%|▊ | 4772/61904 [2:22:30<21:12:53, 1.34s/it] 8%|▊ | 4773/61904 [2:22:31<21:25:23, 1.35s/it] 8%|▊ | 4774/61904 [2:22:33<21:07:49, 1.33s/it] 8%|▊ | 4775/61904 [2:22:34<21:46:59, 1.37s/it] 8%|▊ | 4776/61904 [2:22:35<21:20:53, 1.35s/it] 8%|▊ | 4777/61904 [2:22:37<21:31:39, 1.36s/it] 8%|▊ | 4778/61904 [2:22:38<21:15:08, 1.34s/it] 8%|▊ | 4779/61904 [2:22:39<21:20:34, 1.35s/it] 8%|▊ | 4780/61904 [2:22:41<21:36:58, 1.36s/it] {'loss': 3.0101, 'learning_rate': 1.9257746661480617e-07, 'epoch': 1.24} 8%|▊ | 4780/61904 [2:22:41<21:36:58, 1.36s/it] 8%|▊ | 4781/61904 [2:22:42<21:40:10, 1.37s/it] 8%|▊ | 4782/61904 [2:22:44<21:32:28, 1.36s/it] 8%|▊ | 4783/61904 [2:22:45<21:28:58, 1.35s/it] 8%|▊ | 4784/61904 [2:22:46<21:48:14, 1.37s/it] 8%|▊ | 4785/61904 [2:22:48<22:02:46, 1.39s/it] 8%|▊ | 4786/61904 [2:22:49<21:34:52, 1.36s/it] 8%|▊ | 4787/61904 [2:22:50<21:37:26, 1.36s/it] 8%|▊ | 4788/61904 [2:22:52<21:26:03, 1.35s/it] 8%|▊ | 4789/61904 [2:22:53<21:30:58, 1.36s/it] 8%|▊ | 4790/61904 [2:22:54<21:17:07, 1.34s/it] 8%|▊ | 4791/61904 [2:22:56<20:59:54, 1.32s/it] 8%|▊ | 4792/61904 [2:22:57<22:05:53, 1.39s/it] 8%|▊ | 4793/61904 [2:22:59<22:02:24, 1.39s/it] 8%|▊ | 4794/61904 [2:23:00<22:00:23, 1.39s/it] 8%|▊ | 4795/61904 [2:23:01<22:06:26, 1.39s/it] 8%|▊ | 4796/61904 [2:23:03<22:05:01, 1.39s/it] 8%|▊ | 4797/61904 [2:23:04<21:40:23, 1.37s/it] 8%|▊ | 4798/61904 [2:23:05<21:24:10, 1.35s/it] 8%|▊ | 4799/61904 [2:23:07<22:17:01, 1.40s/it] 8%|▊ | 4800/61904 [2:23:08<22:54:21, 1.44s/it] {'loss': 2.9621, 'learning_rate': 1.9254505380526384e-07, 'epoch': 1.24} 8%|▊ | 4800/61904 [2:23:08<22:54:21, 1.44s/it] 8%|▊ | 4801/61904 [2:23:10<22:02:01, 
1.39s/it] 8%|▊ | 4802/61904 [2:23:11<21:11:24, 1.34s/it] 8%|▊ | 4803/61904 [2:23:12<20:51:17, 1.31s/it] 8%|▊ | 4804/61904 [2:23:13<20:11:26, 1.27s/it] 8%|▊ | 4805/61904 [2:23:15<21:00:56, 1.32s/it] 8%|▊ | 4806/61904 [2:23:16<21:10:07, 1.33s/it] 8%|▊ | 4807/61904 [2:23:17<21:03:23, 1.33s/it] 8%|▊ | 4808/61904 [2:23:19<20:40:15, 1.30s/it] 8%|▊ | 4809/61904 [2:23:20<21:04:34, 1.33s/it] 8%|▊ | 4810/61904 [2:23:21<20:47:07, 1.31s/it] 8%|▊ | 4811/61904 [2:23:23<20:32:08, 1.29s/it] 8%|▊ | 4812/61904 [2:23:24<22:01:01, 1.39s/it] 8%|▊ | 4813/61904 [2:23:26<22:45:26, 1.44s/it] 8%|▊ | 4814/61904 [2:23:27<22:01:21, 1.39s/it] 8%|▊ | 4815/61904 [2:23:28<21:13:55, 1.34s/it] 8%|▊ | 4816/61904 [2:23:30<21:25:13, 1.35s/it] 8%|▊ | 4817/61904 [2:23:31<21:04:12, 1.33s/it] 8%|▊ | 4818/61904 [2:23:32<21:20:18, 1.35s/it] 8%|▊ | 4819/61904 [2:23:34<22:11:33, 1.40s/it] 8%|▊ | 4820/61904 [2:23:35<22:18:50, 1.41s/it] {'loss': 3.0161, 'learning_rate': 1.925126409957215e-07, 'epoch': 1.25} 8%|▊ | 4820/61904 [2:23:35<22:18:50, 1.41s/it] 8%|▊ | 4821/61904 [2:23:37<22:06:47, 1.39s/it] 8%|▊ | 4822/61904 [2:23:38<22:06:39, 1.39s/it] 8%|▊ | 4823/61904 [2:23:40<23:18:43, 1.47s/it] 8%|▊ | 4824/61904 [2:23:41<23:42:21, 1.50s/it] 8%|▊ | 4825/61904 [2:23:43<22:58:50, 1.45s/it] 8%|▊ | 4826/61904 [2:23:44<22:38:02, 1.43s/it] 8%|▊ | 4827/61904 [2:23:45<22:14:50, 1.40s/it] 8%|▊ | 4828/61904 [2:23:47<21:53:18, 1.38s/it] 8%|▊ | 4829/61904 [2:23:48<22:23:43, 1.41s/it] 8%|▊ | 4830/61904 [2:23:50<22:39:28, 1.43s/it] 8%|▊ | 4831/61904 [2:23:51<21:34:54, 1.36s/it] 8%|▊ | 4832/61904 [2:23:52<21:29:25, 1.36s/it] 8%|▊ | 4833/61904 [2:23:53<21:26:58, 1.35s/it] 8%|▊ | 4834/61904 [2:23:55<21:36:42, 1.36s/it] 8%|▊ | 4835/61904 [2:23:56<22:05:27, 1.39s/it] 8%|▊ | 4836/61904 [2:23:58<21:26:56, 1.35s/it] 8%|▊ | 4837/61904 [2:23:59<21:16:14, 1.34s/it] 8%|▊ | 4838/61904 [2:24:01<22:29:34, 1.42s/it] 8%|▊ | 4839/61904 [2:24:02<22:28:40, 1.42s/it] 8%|▊ | 4840/61904 [2:24:03<22:26:06, 1.42s/it] {'loss': 2.9672, 'learning_rate': 
1.9248022818617919e-07, 'epoch': 1.25} 8%|▊ | 4840/61904 [2:24:03<22:26:06, 1.42s/it] 8%|▊ | 4841/61904 [2:24:05<21:35:47, 1.36s/it] 8%|▊ | 4842/61904 [2:24:06<21:34:30, 1.36s/it] 8%|▊ | 4843/61904 [2:24:07<21:14:24, 1.34s/it] 8%|▊ | 4844/61904 [2:24:09<21:52:01, 1.38s/it] 8%|▊ | 4845/61904 [2:24:10<22:16:10, 1.41s/it] 8%|▊ | 4846/61904 [2:24:11<21:30:57, 1.36s/it] 8%|▊ | 4847/61904 [2:24:13<21:52:15, 1.38s/it] 8%|▊ | 4848/61904 [2:24:14<21:18:06, 1.34s/it] 8%|▊ | 4849/61904 [2:24:15<21:14:32, 1.34s/it] 8%|▊ | 4850/61904 [2:24:17<21:44:58, 1.37s/it] 8%|▊ | 4851/61904 [2:24:18<22:21:10, 1.41s/it] 8%|▊ | 4852/61904 [2:24:20<21:12:36, 1.34s/it] 8%|▊ | 4853/61904 [2:24:21<21:03:06, 1.33s/it] 8%|▊ | 4854/61904 [2:24:22<22:15:39, 1.40s/it] 8%|▊ | 4855/61904 [2:24:24<22:43:41, 1.43s/it] 8%|▊ | 4856/61904 [2:24:25<21:57:33, 1.39s/it] 8%|▊ | 4857/61904 [2:24:26<21:20:59, 1.35s/it] 8%|▊ | 4858/61904 [2:24:28<20:56:14, 1.32s/it] 8%|▊ | 4859/61904 [2:24:29<21:18:10, 1.34s/it] 8%|▊ | 4860/61904 [2:24:31<21:25:45, 1.35s/it] {'loss': 2.9749, 'learning_rate': 1.9244781537663682e-07, 'epoch': 1.26} 8%|▊ | 4860/61904 [2:24:31<21:25:45, 1.35s/it] 8%|▊ | 4861/61904 [2:24:32<21:03:30, 1.33s/it] 8%|▊ | 4862/61904 [2:24:33<20:47:43, 1.31s/it] 8%|▊ | 4863/61904 [2:24:34<20:34:48, 1.30s/it] 8%|▊ | 4864/61904 [2:24:36<20:37:18, 1.30s/it] 8%|▊ | 4865/61904 [2:24:37<21:10:36, 1.34s/it] 8%|▊ | 4866/61904 [2:24:39<22:00:43, 1.39s/it] 8%|▊ | 4867/61904 [2:24:40<23:15:19, 1.47s/it] 8%|▊ | 4868/61904 [2:24:41<22:09:32, 1.40s/it] 8%|▊ | 4869/61904 [2:24:43<22:07:03, 1.40s/it] 8%|▊ | 4870/61904 [2:24:44<22:03:47, 1.39s/it] 8%|▊ | 4871/61904 [2:24:46<22:03:07, 1.39s/it] 8%|▊ | 4872/61904 [2:24:47<21:55:08, 1.38s/it] 8%|▊ | 4873/61904 [2:24:48<21:54:33, 1.38s/it] 8%|▊ | 4874/61904 [2:24:50<21:47:11, 1.38s/it] 8%|▊ | 4875/61904 [2:24:51<21:03:28, 1.33s/it] 8%|▊ | 4876/61904 [2:24:52<21:15:13, 1.34s/it] 8%|▊ | 4877/61904 [2:24:54<21:17:17, 1.34s/it] 8%|▊ | 4878/61904 [2:24:55<21:18:52, 1.35s/it] 8%|▊ | 
4879/61904 [2:24:57<22:03:15, 1.39s/it] 8%|▊ | 4880/61904 [2:24:58<21:38:20, 1.37s/it] {'loss': 2.9898, 'learning_rate': 1.924154025670945e-07, 'epoch': 1.26} 8%|▊ | 4880/61904 [2:24:58<21:38:20, 1.37s/it] 8%|▊ | 4881/61904 [2:24:59<21:21:07, 1.35s/it] 8%|▊ | 4882/61904 [2:25:01<21:38:28, 1.37s/it] 8%|▊ | 4883/61904 [2:25:02<21:43:09, 1.37s/it] 8%|▊ | 4884/61904 [2:25:03<22:04:15, 1.39s/it] 8%|▊ | 4885/61904 [2:25:05<21:21:46, 1.35s/it] 8%|▊ | 4886/61904 [2:25:06<22:22:03, 1.41s/it] 8%|▊ | 4887/61904 [2:25:07<21:33:15, 1.36s/it] 8%|▊ | 4888/61904 [2:25:09<21:04:57, 1.33s/it] 8%|▊ | 4889/61904 [2:25:10<21:01:45, 1.33s/it] 8%|▊ | 4890/61904 [2:25:11<21:01:02, 1.33s/it] 8%|▊ | 4891/61904 [2:25:13<21:14:20, 1.34s/it] 8%|▊ | 4892/61904 [2:25:14<21:16:55, 1.34s/it] 8%|▊ | 4893/61904 [2:25:15<20:59:44, 1.33s/it] 8%|▊ | 4894/61904 [2:25:17<21:01:56, 1.33s/it] 8%|▊ | 4895/61904 [2:25:18<21:44:01, 1.37s/it] 8%|▊ | 4896/61904 [2:25:19<21:30:25, 1.36s/it] 8%|▊ | 4897/61904 [2:25:21<21:33:00, 1.36s/it] 8%|▊ | 4898/61904 [2:25:22<21:17:28, 1.34s/it] 8%|▊ | 4899/61904 [2:25:24<21:46:41, 1.38s/it] 8%|▊ | 4900/61904 [2:25:25<21:13:02, 1.34s/it] {'loss': 2.9917, 'learning_rate': 1.9238298975755217e-07, 'epoch': 1.27} 8%|▊ | 4900/61904 [2:25:25<21:13:02, 1.34s/it] 8%|▊ | 4901/61904 [2:25:26<21:21:17, 1.35s/it] 8%|▊ | 4902/61904 [2:25:28<21:16:05, 1.34s/it] 8%|▊ | 4903/61904 [2:25:29<20:46:16, 1.31s/it] 8%|▊ | 4904/61904 [2:25:30<21:14:25, 1.34s/it] 8%|▊ | 4905/61904 [2:25:31<20:22:45, 1.29s/it] 8%|▊ | 4906/61904 [2:25:33<20:39:19, 1.30s/it] 8%|▊ | 4907/61904 [2:25:34<20:04:38, 1.27s/it] 8%|▊ | 4908/61904 [2:25:35<21:14:35, 1.34s/it] 8%|▊ | 4909/61904 [2:25:37<21:37:31, 1.37s/it] 8%|▊ | 4910/61904 [2:25:38<21:41:50, 1.37s/it] 8%|▊ | 4911/61904 [2:25:39<20:37:35, 1.30s/it] 8%|▊ | 4912/61904 [2:25:41<21:09:38, 1.34s/it] 8%|▊ | 4913/61904 [2:25:42<21:52:04, 1.38s/it] 8%|▊ | 4914/61904 [2:25:44<21:58:25, 1.39s/it] 8%|▊ | 4915/61904 [2:25:45<21:34:04, 1.36s/it] 8%|▊ | 4916/61904 
[2:25:46<21:33:40, 1.36s/it] 8%|▊ | 4917/61904 [2:25:48<21:02:58, 1.33s/it] 8%|▊ | 4918/61904 [2:25:49<21:22:22, 1.35s/it] 8%|▊ | 4919/61904 [2:25:50<21:40:43, 1.37s/it] 8%|▊ | 4920/61904 [2:25:52<21:47:58, 1.38s/it] {'loss': 2.9675, 'learning_rate': 1.9235057694800983e-07, 'epoch': 1.27} 8%|▊ | 4920/61904 [2:25:52<21:47:58, 1.38s/it] 8%|▊ | 4921/61904 [2:25:53<22:29:38, 1.42s/it] 8%|▊ | 4922/61904 [2:25:55<22:23:03, 1.41s/it] 8%|▊ | 4923/61904 [2:25:56<21:46:19, 1.38s/it] 8%|▊ | 4924/61904 [2:25:57<21:55:00, 1.38s/it] 8%|▊ | 4925/61904 [2:25:59<21:47:43, 1.38s/it] 8%|▊ | 4926/61904 [2:26:00<21:55:51, 1.39s/it] 8%|▊ | 4927/61904 [2:26:01<21:19:49, 1.35s/it] 8%|▊ | 4928/61904 [2:26:03<21:50:35, 1.38s/it] 8%|▊ | 4929/61904 [2:26:04<21:46:58, 1.38s/it] 8%|▊ | 4930/61904 [2:26:06<22:49:23, 1.44s/it] 8%|▊ | 4931/61904 [2:26:07<22:27:24, 1.42s/it] 8%|▊ | 4932/61904 [2:26:08<21:50:34, 1.38s/it] 8%|▊ | 4933/61904 [2:26:10<21:04:26, 1.33s/it] 8%|▊ | 4934/61904 [2:26:11<20:50:19, 1.32s/it] 8%|▊ | 4935/61904 [2:26:12<20:48:59, 1.32s/it] 8%|▊ | 4936/61904 [2:26:14<20:39:47, 1.31s/it] 8%|▊ | 4937/61904 [2:26:15<22:35:21, 1.43s/it] 8%|▊ | 4938/61904 [2:26:17<22:11:22, 1.40s/it] 8%|▊ | 4939/61904 [2:26:18<21:58:54, 1.39s/it] 8%|▊ | 4940/61904 [2:26:19<21:57:31, 1.39s/it] {'loss': 2.9495, 'learning_rate': 1.9231816413846752e-07, 'epoch': 1.28} 8%|▊ | 4940/61904 [2:26:19<21:57:31, 1.39s/it] 8%|▊ | 4941/61904 [2:26:21<21:41:13, 1.37s/it] 8%|▊ | 4942/61904 [2:26:22<21:29:11, 1.36s/it] 8%|▊ | 4943/61904 [2:26:23<21:37:42, 1.37s/it] 8%|▊ | 4944/61904 [2:26:25<21:03:59, 1.33s/it] 8%|▊ | 4945/61904 [2:26:26<21:02:57, 1.33s/it] 8%|▊ | 4946/61904 [2:26:27<21:19:01, 1.35s/it] 8%|▊ | 4947/61904 [2:26:29<21:57:41, 1.39s/it] 8%|▊ | 4948/61904 [2:26:30<22:15:37, 1.41s/it] 8%|▊ | 4949/61904 [2:26:32<21:47:31, 1.38s/it] 8%|▊ | 4950/61904 [2:26:33<21:46:58, 1.38s/it] 8%|▊ | 4951/61904 [2:26:34<21:28:11, 1.36s/it] 8%|▊ | 4952/61904 [2:26:36<21:05:50, 1.33s/it] 8%|▊ | 4953/61904 [2:26:37<21:27:58, 
1.36s/it] 8%|▊ | 4954/61904 [2:26:38<21:20:10, 1.35s/it] 8%|▊ | 4955/61904 [2:26:40<20:58:30, 1.33s/it] 8%|▊ | 4956/61904 [2:26:41<20:53:45, 1.32s/it] 8%|▊ | 4957/61904 [2:26:42<20:50:07, 1.32s/it] 8%|▊ | 4958/61904 [2:26:44<21:30:23, 1.36s/it] 8%|▊ | 4959/61904 [2:26:45<21:26:18, 1.36s/it] 8%|▊ | 4960/61904 [2:26:47<22:02:03, 1.39s/it] {'loss': 3.0443, 'learning_rate': 1.9228575132892518e-07, 'epoch': 1.28} 8%|▊ | 4960/61904 [2:26:47<22:02:03, 1.39s/it] 8%|▊ | 4961/61904 [2:26:48<21:39:48, 1.37s/it] 8%|▊ | 4962/61904 [2:26:49<21:53:28, 1.38s/it] 8%|▊ | 4963/61904 [2:26:50<20:48:55, 1.32s/it] 8%|▊ | 4964/61904 [2:26:52<20:16:19, 1.28s/it] 8%|▊ | 4965/61904 [2:26:53<21:03:48, 1.33s/it] 8%|▊ | 4966/61904 [2:26:54<20:58:57, 1.33s/it] 8%|▊ | 4967/61904 [2:26:56<20:53:09, 1.32s/it] 8%|▊ | 4968/61904 [2:26:57<21:09:48, 1.34s/it] 8%|▊ | 4969/61904 [2:26:58<20:59:32, 1.33s/it] 8%|▊ | 4970/61904 [2:27:00<20:48:26, 1.32s/it] 8%|▊ | 4971/61904 [2:27:01<21:36:35, 1.37s/it] 8%|▊ | 4972/61904 [2:27:02<21:40:05, 1.37s/it] 8%|▊ | 4973/61904 [2:27:04<21:40:29, 1.37s/it] 8%|▊ | 4974/61904 [2:27:05<21:16:11, 1.35s/it] 8%|▊ | 4975/61904 [2:27:06<20:51:34, 1.32s/it] 8%|▊ | 4976/61904 [2:27:08<20:14:22, 1.28s/it] 8%|▊ | 4977/61904 [2:27:09<21:05:25, 1.33s/it] 8%|▊ | 4978/61904 [2:27:11<21:50:51, 1.38s/it] 8%|▊ | 4979/61904 [2:27:12<21:19:37, 1.35s/it] 8%|▊ | 4980/61904 [2:27:13<21:00:20, 1.33s/it] {'loss': 2.9756, 'learning_rate': 1.9225333851938284e-07, 'epoch': 1.29} 8%|▊ | 4980/61904 [2:27:13<21:00:20, 1.33s/it] 8%|▊ | 4981/61904 [2:27:15<21:26:11, 1.36s/it] 8%|▊ | 4982/61904 [2:27:16<21:25:26, 1.35s/it] 8%|▊ | 4983/61904 [2:27:17<21:06:51, 1.34s/it] 8%|▊ | 4984/61904 [2:27:19<21:22:05, 1.35s/it] 8%|▊ | 4985/61904 [2:27:20<21:02:54, 1.33s/it] 8%|▊ | 4986/61904 [2:27:21<21:10:57, 1.34s/it] 8%|▊ | 4987/61904 [2:27:23<21:34:27, 1.36s/it] 8%|▊ | 4988/61904 [2:27:24<22:18:04, 1.41s/it] 8%|▊ | 4989/61904 [2:27:25<21:44:48, 1.38s/it] 8%|▊ | 4990/61904 [2:27:27<22:19:37, 1.41s/it] 8%|▊ | 
4991/61904 [2:27:28<22:09:40, 1.40s/it] 8%|▊ | 4992/61904 [2:27:30<21:37:36, 1.37s/it] 8%|▊ | 4993/61904 [2:27:31<22:25:12, 1.42s/it] 8%|▊ | 4994/61904 [2:27:33<22:29:45, 1.42s/it] 8%|▊ | 4995/61904 [2:27:34<21:35:18, 1.37s/it] 8%|▊ | 4996/61904 [2:27:35<21:24:36, 1.35s/it] 8%|▊ | 4997/61904 [2:27:37<21:45:39, 1.38s/it] 8%|▊ | 4998/61904 [2:27:38<21:32:01, 1.36s/it] 8%|▊ | 4999/61904 [2:27:39<21:11:15, 1.34s/it] 8%|▊ | 5000/61904 [2:27:41<21:26:18, 1.36s/it] {'loss': 2.9824, 'learning_rate': 1.9222092570984053e-07, 'epoch': 1.29} 8%|▊ | 5000/61904 [2:27:41<21:26:18, 1.36s/it] 8%|▊ | 5001/61904 [2:27:42<21:36:10, 1.37s/it] 8%|▊ | 5002/61904 [2:27:43<21:20:19, 1.35s/it] 8%|▊ | 5003/61904 [2:27:45<21:16:36, 1.35s/it] 8%|▊ | 5004/61904 [2:27:46<21:00:32, 1.33s/it] 8%|▊ | 5005/61904 [2:27:47<20:58:06, 1.33s/it] 8%|▊ | 5006/61904 [2:27:49<21:08:47, 1.34s/it] 8%|▊ | 5007/61904 [2:27:50<21:22:52, 1.35s/it] 8%|▊ | 5008/61904 [2:27:51<21:33:13, 1.36s/it] 8%|▊ | 5009/61904 [2:27:53<21:04:34, 1.33s/it] 8%|▊ | 5010/61904 [2:27:54<20:31:37, 1.30s/it] 8%|▊ | 5011/61904 [2:27:55<21:05:14, 1.33s/it] 8%|▊ | 5012/61904 [2:27:57<21:33:34, 1.36s/it] 8%|▊ | 5013/61904 [2:27:58<20:58:01, 1.33s/it] 8%|▊ | 5014/61904 [2:27:59<21:34:56, 1.37s/it] 8%|▊ | 5015/61904 [2:28:01<21:51:29, 1.38s/it] 8%|▊ | 5016/61904 [2:28:02<21:27:19, 1.36s/it] 8%|▊ | 5017/61904 [2:28:03<20:44:08, 1.31s/it] 8%|▊ | 5018/61904 [2:28:05<21:04:06, 1.33s/it] 8%|▊ | 5019/61904 [2:28:06<21:10:04, 1.34s/it] 8%|▊ | 5020/61904 [2:28:07<21:04:35, 1.33s/it] {'loss': 3.0607, 'learning_rate': 1.9218851290029817e-07, 'epoch': 1.3} 8%|▊ | 5020/61904 [2:28:07<21:04:35, 1.33s/it] 8%|▊ | 5021/61904 [2:28:09<20:44:46, 1.31s/it] 8%|▊ | 5022/61904 [2:28:10<21:06:32, 1.34s/it] 8%|▊ | 5023/61904 [2:28:11<21:07:57, 1.34s/it] 8%|▊ | 5024/61904 [2:28:13<21:56:48, 1.39s/it] 8%|▊ | 5025/61904 [2:28:14<22:21:56, 1.42s/it] 8%|▊ | 5026/61904 [2:28:16<22:11:59, 1.41s/it] 8%|▊ | 5027/61904 [2:28:17<22:07:14, 1.40s/it] 8%|▊ | 5028/61904 
[2:28:18<21:45:00, 1.38s/it] 8%|▊ | 5029/61904 [2:28:20<21:35:56, 1.37s/it] 8%|▊ | 5030/61904 [2:28:21<20:37:19, 1.31s/it] 8%|▊ | 5031/61904 [2:28:22<21:07:45, 1.34s/it] 8%|▊ | 5032/61904 [2:28:24<21:43:49, 1.38s/it] 8%|▊ | 5033/61904 [2:28:25<21:41:46, 1.37s/it] 8%|▊ | 5034/61904 [2:28:27<22:12:23, 1.41s/it] 8%|▊ | 5035/61904 [2:28:28<21:51:31, 1.38s/it] 8%|▊ | 5036/61904 [2:28:29<22:16:34, 1.41s/it] 8%|▊ | 5037/61904 [2:28:31<22:33:57, 1.43s/it] 8%|▊ | 5038/61904 [2:28:32<21:54:38, 1.39s/it] 8%|▊ | 5039/61904 [2:28:34<21:20:31, 1.35s/it] 8%|▊ | 5040/61904 [2:28:35<21:38:06, 1.37s/it] {'loss': 3.0266, 'learning_rate': 1.9215610009075586e-07, 'epoch': 1.3} 8%|▊ | 5040/61904 [2:28:35<21:38:06, 1.37s/it] 8%|▊ | 5041/61904 [2:28:36<21:56:01, 1.39s/it] 8%|▊ | 5042/61904 [2:28:38<22:30:02, 1.42s/it] 8%|▊ | 5043/61904 [2:28:39<22:09:01, 1.40s/it] 8%|▊ | 5044/61904 [2:28:41<22:47:25, 1.44s/it] 8%|▊ | 5045/61904 [2:28:42<22:46:43, 1.44s/it] 8%|▊ | 5046/61904 [2:28:44<23:08:01, 1.46s/it] 8%|▊ | 5047/61904 [2:28:45<22:16:08, 1.41s/it] 8%|▊ | 5048/61904 [2:28:46<21:42:39, 1.37s/it] 8%|▊ | 5049/61904 [2:28:48<21:39:08, 1.37s/it] 8%|▊ | 5050/61904 [2:28:49<21:09:43, 1.34s/it] 8%|▊ | 5051/61904 [2:28:51<23:13:56, 1.47s/it] 8%|▊ | 5052/61904 [2:28:52<22:22:50, 1.42s/it] 8%|▊ | 5053/61904 [2:28:53<21:56:49, 1.39s/it] 8%|▊ | 5054/61904 [2:28:55<21:42:42, 1.37s/it] 8%|▊ | 5055/61904 [2:28:56<21:37:03, 1.37s/it] 8%|▊ | 5056/61904 [2:28:57<21:28:02, 1.36s/it] 8%|▊ | 5057/61904 [2:28:59<21:42:09, 1.37s/it] 8%|▊ | 5058/61904 [2:29:00<21:35:52, 1.37s/it] 8%|▊ | 5059/61904 [2:29:02<22:10:27, 1.40s/it] 8%|▊ | 5060/61904 [2:29:03<22:25:28, 1.42s/it] {'loss': 2.9212, 'learning_rate': 1.9212368728121355e-07, 'epoch': 1.31} 8%|▊ | 5060/61904 [2:29:03<22:25:28, 1.42s/it] 8%|▊ | 5061/61904 [2:29:04<21:54:47, 1.39s/it] 8%|▊ | 5062/61904 [2:29:06<21:06:00, 1.34s/it] 8%|▊ | 5063/61904 [2:29:07<20:43:57, 1.31s/it] 8%|▊ | 5064/61904 [2:29:08<21:36:59, 1.37s/it] 8%|▊ | 5065/61904 [2:29:10<22:51:36, 
1.45s/it] 8%|▊ | 5066/61904 [2:29:11<22:24:20, 1.42s/it] 8%|▊ | 5067/61904 [2:29:13<21:55:31, 1.39s/it] 8%|▊ | 5068/61904 [2:29:14<21:39:49, 1.37s/it] 8%|▊ | 5069/61904 [2:29:15<21:08:58, 1.34s/it] 8%|▊ | 5070/61904 [2:29:17<21:03:42, 1.33s/it] 8%|▊ | 5071/61904 [2:29:18<22:27:50, 1.42s/it] 8%|▊ | 5072/61904 [2:29:20<22:26:13, 1.42s/it] 8%|▊ | 5073/61904 [2:29:21<21:40:06, 1.37s/it] 8%|▊ | 5074/61904 [2:29:22<21:22:22, 1.35s/it] 8%|▊ | 5075/61904 [2:29:24<21:41:36, 1.37s/it] 8%|▊ | 5076/61904 [2:29:25<21:37:07, 1.37s/it] 8%|▊ | 5077/61904 [2:29:26<21:29:45, 1.36s/it] 8%|▊ | 5078/61904 [2:29:28<21:20:39, 1.35s/it] 8%|▊ | 5079/61904 [2:29:29<21:27:45, 1.36s/it] 8%|▊ | 5080/61904 [2:29:30<21:02:03, 1.33s/it] {'loss': 3.0226, 'learning_rate': 1.9209127447167118e-07, 'epoch': 1.31} 8%|▊ | 5080/61904 [2:29:30<21:02:03, 1.33s/it] 8%|▊ | 5081/61904 [2:29:32<20:44:02, 1.31s/it] 8%|▊ | 5082/61904 [2:29:33<20:58:37, 1.33s/it] 8%|▊ | 5083/61904 [2:29:34<21:27:19, 1.36s/it] 8%|▊ | 5084/61904 [2:29:36<21:00:40, 1.33s/it] 8%|▊ | 5085/61904 [2:29:37<21:00:43, 1.33s/it] 8%|▊ | 5086/61904 [2:29:38<21:11:30, 1.34s/it] 8%|▊ | 5087/61904 [2:29:40<21:19:32, 1.35s/it] 8%|▊ | 5088/61904 [2:29:41<21:24:18, 1.36s/it] 8%|▊ | 5089/61904 [2:29:42<20:58:17, 1.33s/it] 8%|▊ | 5090/61904 [2:29:44<22:01:46, 1.40s/it] 8%|▊ | 5091/61904 [2:29:45<22:06:58, 1.40s/it] 8%|▊ | 5092/61904 [2:29:47<21:19:19, 1.35s/it] 8%|▊ | 5093/61904 [2:29:48<22:05:00, 1.40s/it] 8%|▊ | 5094/61904 [2:29:49<21:31:20, 1.36s/it] 8%|▊ | 5095/61904 [2:29:51<21:10:15, 1.34s/it] 8%|▊ | 5096/61904 [2:29:52<20:57:46, 1.33s/it] 8%|▊ | 5097/61904 [2:29:53<20:32:25, 1.30s/it] 8%|▊ | 5098/61904 [2:29:54<20:01:09, 1.27s/it] 8%|▊ | 5099/61904 [2:29:56<21:07:34, 1.34s/it] 8%|▊ | 5100/61904 [2:29:57<21:44:58, 1.38s/it] {'loss': 2.9282, 'learning_rate': 1.9205886166212887e-07, 'epoch': 1.32} 8%|▊ | 5100/61904 [2:29:57<21:44:58, 1.38s/it] 8%|▊ | 5101/61904 [2:29:59<21:48:33, 1.38s/it] 8%|▊ | 5102/61904 [2:30:00<22:23:23, 1.42s/it] 8%|▊ | 
5103/61904 [2:30:02<22:55:53, 1.45s/it] 8%|▊ | 5104/61904 [2:30:03<22:48:39, 1.45s/it] 8%|▊ | 5105/61904 [2:30:05<23:12:57, 1.47s/it] 8%|▊ | 5106/61904 [2:30:06<23:05:31, 1.46s/it] 8%|▊ | 5107/61904 [2:30:07<22:22:06, 1.42s/it] 8%|▊ | 5108/61904 [2:30:09<21:55:56, 1.39s/it] 8%|▊ | 5109/61904 [2:30:10<21:02:47, 1.33s/it] 8%|▊ | 5110/61904 [2:30:11<20:45:37, 1.32s/it] 8%|▊ | 5111/61904 [2:30:12<20:12:52, 1.28s/it] 8%|▊ | 5112/61904 [2:30:14<21:25:31, 1.36s/it] 8%|▊ | 5113/61904 [2:30:15<21:12:39, 1.34s/it] 8%|▊ | 5114/61904 [2:30:17<20:53:10, 1.32s/it] 8%|▊ | 5115/61904 [2:30:18<20:59:45, 1.33s/it] 8%|▊ | 5116/61904 [2:30:19<20:33:06, 1.30s/it] 8%|▊ | 5117/61904 [2:30:20<20:20:30, 1.29s/it] 8%|▊ | 5118/61904 [2:30:22<20:47:59, 1.32s/it] 8%|▊ | 5119/61904 [2:30:23<20:24:00, 1.29s/it] 8%|▊ | 5120/61904 [2:30:24<20:52:25, 1.32s/it] {'loss': 2.9725, 'learning_rate': 1.9202644885258653e-07, 'epoch': 1.32} 8%|▊ | 5120/61904 [2:30:24<20:52:25, 1.32s/it] 8%|▊ | 5121/61904 [2:30:26<20:39:22, 1.31s/it] 8%|▊ | 5122/61904 [2:30:27<20:54:15, 1.33s/it] 8%|▊ | 5123/61904 [2:30:28<20:32:24, 1.30s/it] 8%|▊ | 5124/61904 [2:30:30<20:49:31, 1.32s/it] 8%|▊ | 5125/61904 [2:30:31<21:31:20, 1.36s/it] 8%|▊ | 5126/61904 [2:30:33<22:07:42, 1.40s/it] 8%|▊ | 5127/61904 [2:30:34<21:50:42, 1.39s/it] 8%|▊ | 5128/61904 [2:30:35<21:47:29, 1.38s/it] 8%|▊ | 5129/61904 [2:30:37<21:12:00, 1.34s/it] 8%|▊ | 5130/61904 [2:30:38<20:23:06, 1.29s/it] 8%|▊ | 5131/61904 [2:30:39<20:32:47, 1.30s/it] 8%|▊ | 5132/61904 [2:30:40<20:31:41, 1.30s/it] 8%|▊ | 5133/61904 [2:30:42<20:46:08, 1.32s/it] 8%|▊ | 5134/61904 [2:30:43<21:19:26, 1.35s/it] 8%|▊ | 5135/61904 [2:30:45<22:01:05, 1.40s/it] 8%|▊ | 5136/61904 [2:30:46<21:56:24, 1.39s/it] 8%|▊ | 5137/61904 [2:30:48<22:11:10, 1.41s/it] 8%|▊ | 5138/61904 [2:30:49<21:19:19, 1.35s/it] 8%|▊ | 5139/61904 [2:30:50<21:21:20, 1.35s/it] 8%|▊ | 5140/61904 [2:30:51<21:09:16, 1.34s/it] {'loss': 2.9777, 'learning_rate': 1.919940360430442e-07, 'epoch': 1.33} 8%|▊ | 5140/61904 
[2:30:51<21:09:16, 1.34s/it] 8%|▊ | 5141/61904 [2:30:53<20:55:11, 1.33s/it] 8%|▊ | 5142/61904 [2:30:54<21:36:05, 1.37s/it] 8%|▊ | 5143/61904 [2:30:56<21:37:23, 1.37s/it] 8%|▊ | 5144/61904 [2:30:57<21:49:29, 1.38s/it] 8%|▊ | 5145/61904 [2:30:58<21:19:18, 1.35s/it] 8%|▊ | 5146/61904 [2:31:00<21:15:48, 1.35s/it] 8%|▊ | 5147/61904 [2:31:01<21:12:13, 1.34s/it] 8%|▊ | 5148/61904 [2:31:02<21:09:10, 1.34s/it] 8%|▊ | 5149/61904 [2:31:04<21:31:53, 1.37s/it] 8%|▊ | 5150/61904 [2:31:05<22:01:45, 1.40s/it] 8%|▊ | 5151/61904 [2:31:07<22:03:19, 1.40s/it] 8%|▊ | 5152/61904 [2:31:08<21:30:05, 1.36s/it] 8%|▊ | 5153/61904 [2:31:09<21:52:49, 1.39s/it] 8%|▊ | 5154/61904 [2:31:11<21:57:27, 1.39s/it] 8%|▊ | 5155/61904 [2:31:12<21:06:32, 1.34s/it] 8%|▊ | 5156/61904 [2:31:13<20:59:00, 1.33s/it] 8%|▊ | 5157/61904 [2:31:15<20:59:06, 1.33s/it] 8%|▊ | 5158/61904 [2:31:16<21:52:57, 1.39s/it] 8%|▊ | 5159/61904 [2:31:17<21:50:59, 1.39s/it] 8%|▊ | 5160/61904 [2:31:19<21:31:24, 1.37s/it] {'loss': 2.9426, 'learning_rate': 1.9196162323350188e-07, 'epoch': 1.33} 8%|▊ | 5160/61904 [2:31:19<21:31:24, 1.37s/it] 8%|▊ | 5161/61904 [2:31:20<21:20:04, 1.35s/it] 8%|▊ | 5162/61904 [2:31:21<20:50:23, 1.32s/it] 8%|▊ | 5163/61904 [2:31:23<21:38:43, 1.37s/it] 8%|▊ | 5164/61904 [2:31:24<21:49:37, 1.38s/it] 8%|▊ | 5165/61904 [2:31:26<21:29:05, 1.36s/it] 8%|▊ | 5166/61904 [2:31:27<22:06:14, 1.40s/it] 8%|▊ | 5167/61904 [2:31:28<21:38:12, 1.37s/it] 8%|▊ | 5168/61904 [2:31:30<21:18:57, 1.35s/it] 8%|▊ | 5169/61904 [2:31:31<20:46:26, 1.32s/it] 8%|▊ | 5170/61904 [2:31:32<20:30:15, 1.30s/it] 8%|▊ | 5171/61904 [2:31:33<20:39:36, 1.31s/it] 8%|▊ | 5172/61904 [2:31:35<20:32:05, 1.30s/it] 8%|▊ | 5173/61904 [2:31:36<20:38:59, 1.31s/it] 8%|▊ | 5174/61904 [2:31:38<21:11:05, 1.34s/it] 8%|▊ | 5175/61904 [2:31:39<20:58:28, 1.33s/it] 8%|▊ | 5176/61904 [2:31:40<21:35:49, 1.37s/it] 8%|▊ | 5177/61904 [2:31:42<21:33:55, 1.37s/it] 8%|▊ | 5178/61904 [2:31:43<21:47:24, 1.38s/it] 8%|▊ | 5179/61904 [2:31:44<21:53:52, 1.39s/it] 8%|▊ | 5180/61904 
[2:31:46<22:53:38, 1.45s/it] {'loss': 2.9584, 'learning_rate': 1.9192921042395954e-07, 'epoch': 1.34} 8%|▊ | 5180/61904 [2:31:46<22:53:38, 1.45s/it] 8%|▊ | 5181/61904 [2:31:47<22:11:58, 1.41s/it] 8%|▊ | 5182/61904 [2:31:49<21:29:39, 1.36s/it] 8%|▊ | 5183/61904 [2:31:50<21:39:14, 1.37s/it] 8%|▊ | 5184/61904 [2:31:51<21:51:19, 1.39s/it] 8%|▊ | 5185/61904 [2:31:53<21:14:16, 1.35s/it] 8%|▊ | 5186/61904 [2:31:54<22:09:46, 1.41s/it] 8%|▊ | 5187/61904 [2:31:56<21:47:25, 1.38s/it] 8%|▊ | 5188/61904 [2:31:57<22:22:38, 1.42s/it] 8%|▊ | 5189/61904 [2:31:58<21:43:12, 1.38s/it] 8%|▊ | 5190/61904 [2:32:00<21:22:23, 1.36s/it] 8%|▊ | 5191/61904 [2:32:01<21:23:27, 1.36s/it] 8%|▊ | 5192/61904 [2:32:02<21:10:30, 1.34s/it] 8%|▊ | 5193/61904 [2:32:04<21:16:53, 1.35s/it] 8%|▊ | 5194/61904 [2:32:05<21:54:07, 1.39s/it] 8%|▊ | 5195/61904 [2:32:07<21:53:19, 1.39s/it] 8%|▊ | 5196/61904 [2:32:08<21:39:40, 1.38s/it] 8%|▊ | 5197/61904 [2:32:09<21:28:01, 1.36s/it] 8%|▊ | 5198/61904 [2:32:11<21:38:24, 1.37s/it] 8%|▊ | 5199/61904 [2:32:12<21:54:22, 1.39s/it] 8%|▊ | 5200/61904 [2:32:14<22:17:52, 1.42s/it] {'loss': 2.9244, 'learning_rate': 1.918967976144172e-07, 'epoch': 1.34} 8%|▊ | 5200/61904 [2:32:14<22:17:52, 1.42s/it] 8%|▊ | 5201/61904 [2:32:15<21:36:50, 1.37s/it] 8%|▊ | 5202/61904 [2:32:16<21:08:41, 1.34s/it] 8%|▊ | 5203/61904 [2:32:18<22:16:41, 1.41s/it] 8%|▊ | 5204/61904 [2:32:19<21:48:31, 1.38s/it] 8%|▊ | 5205/61904 [2:32:20<21:12:46, 1.35s/it] 8%|▊ | 5206/61904 [2:32:22<20:51:30, 1.32s/it] 8%|▊ | 5207/61904 [2:32:23<20:42:15, 1.31s/it] 8%|▊ | 5208/61904 [2:32:24<21:03:59, 1.34s/it] 8%|▊ | 5209/61904 [2:32:26<21:12:57, 1.35s/it] 8%|▊ | 5210/61904 [2:32:27<21:33:38, 1.37s/it] 8%|▊ | 5211/61904 [2:32:28<21:22:56, 1.36s/it] 8%|▊ | 5212/61904 [2:32:30<21:26:37, 1.36s/it] 8%|▊ | 5213/61904 [2:32:31<21:13:52, 1.35s/it] 8%|▊ | 5214/61904 [2:32:32<20:41:25, 1.31s/it] 8%|▊ | 5215/61904 [2:32:34<20:43:17, 1.32s/it] 8%|▊ | 5216/61904 [2:32:35<20:57:29, 1.33s/it] 8%|▊ | 5217/61904 [2:32:36<21:31:28, 
1.37s/it] 8%|▊ | 5218/61904 [2:32:38<21:26:18, 1.36s/it] 8%|▊ | 5219/61904 [2:32:39<21:56:27, 1.39s/it] 8%|▊ | 5220/61904 [2:32:41<21:33:57, 1.37s/it] {'loss': 3.0344, 'learning_rate': 1.918643848048749e-07, 'epoch': 1.35} 8%|▊ | 5220/61904 [2:32:41<21:33:57, 1.37s/it] 8%|▊ | 5221/61904 [2:32:42<20:56:44, 1.33s/it] 8%|▊ | 5222/61904 [2:32:43<21:32:33, 1.37s/it] 8%|▊ | 5223/61904 [2:32:45<21:50:40, 1.39s/it] 8%|▊ | 5224/61904 [2:32:46<22:36:54, 1.44s/it] 8%|▊ | 5225/61904 [2:32:48<22:38:42, 1.44s/it] 8%|▊ | 5226/61904 [2:32:49<22:04:33, 1.40s/it] 8%|▊ | 5227/61904 [2:32:50<21:21:26, 1.36s/it] 8%|▊ | 5228/61904 [2:32:52<21:17:09, 1.35s/it] 8%|▊ | 5229/61904 [2:32:53<21:04:14, 1.34s/it] 8%|▊ | 5230/61904 [2:32:54<21:07:13, 1.34s/it] 8%|▊ | 5231/61904 [2:32:56<21:28:45, 1.36s/it] 8%|▊ | 5232/61904 [2:32:57<22:13:41, 1.41s/it] 8%|▊ | 5233/61904 [2:32:59<22:29:31, 1.43s/it] 8%|▊ | 5234/61904 [2:33:00<22:48:40, 1.45s/it] 8%|▊ | 5235/61904 [2:33:02<22:32:18, 1.43s/it] 8%|▊ | 5236/61904 [2:33:03<21:47:42, 1.38s/it] 8%|▊ | 5237/61904 [2:33:04<21:51:10, 1.39s/it] 8%|▊ | 5238/61904 [2:33:06<22:03:47, 1.40s/it] 8%|▊ | 5239/61904 [2:33:07<22:52:37, 1.45s/it] 8%|▊ | 5240/61904 [2:33:08<21:41:49, 1.38s/it] {'loss': 2.9877, 'learning_rate': 1.9183197199533253e-07, 'epoch': 1.35} 8%|▊ | 5240/61904 [2:33:08<21:41:49, 1.38s/it] 8%|▊ | 5241/61904 [2:33:10<22:36:10, 1.44s/it] 8%|▊ | 5242/61904 [2:33:11<22:08:03, 1.41s/it] 8%|▊ | 5243/61904 [2:33:13<22:06:04, 1.40s/it] 8%|▊ | 5244/61904 [2:33:14<22:10:41, 1.41s/it] 8%|▊ | 5245/61904 [2:33:15<21:38:51, 1.38s/it] 8%|▊ | 5246/61904 [2:33:17<22:26:07, 1.43s/it] 8%|▊ | 5247/61904 [2:33:18<21:48:09, 1.39s/it] 8%|▊ | 5248/61904 [2:33:20<21:10:39, 1.35s/it] 8%|▊ | 5249/61904 [2:33:21<21:27:58, 1.36s/it] 8%|▊ | 5250/61904 [2:33:22<21:36:28, 1.37s/it] 8%|▊ | 5251/61904 [2:33:24<21:05:20, 1.34s/it] 8%|▊ | 5252/61904 [2:33:25<20:43:45, 1.32s/it] 8%|▊ | 5253/61904 [2:33:26<21:22:01, 1.36s/it] 8%|▊ | 5254/61904 [2:33:28<20:54:43, 1.33s/it] 8%|▊ | 
5255/61904 [2:33:29<20:46:49, 1.32s/it] 8%|▊ | 5256/61904 [2:33:30<21:07:40, 1.34s/it] 8%|▊ | 5257/61904 [2:33:31<20:39:32, 1.31s/it] 8%|▊ | 5258/61904 [2:33:33<21:16:21, 1.35s/it] 8%|▊ | 5259/61904 [2:33:34<21:08:42, 1.34s/it] 8%|▊ | 5260/61904 [2:33:36<20:43:16, 1.32s/it] {'loss': 2.9763, 'learning_rate': 1.9179955918579022e-07, 'epoch': 1.36} 8%|▊ | 5260/61904 [2:33:36<20:43:16, 1.32s/it] 8%|▊ | 5261/61904 [2:33:37<20:24:27, 1.30s/it] 9%|▊ | 5262/61904 [2:33:38<20:46:21, 1.32s/it] 9%|▊ | 5263/61904 [2:33:39<20:30:55, 1.30s/it] 9%|▊ | 5264/61904 [2:33:41<21:44:41, 1.38s/it] 9%|▊ | 5265/61904 [2:33:42<21:11:47, 1.35s/it] 9%|▊ | 5266/61904 [2:33:44<22:09:38, 1.41s/it] 9%|▊ | 5267/61904 [2:33:45<21:24:59, 1.36s/it] 9%|▊ | 5268/61904 [2:33:46<21:14:56, 1.35s/it] 9%|▊ | 5269/61904 [2:33:48<20:37:12, 1.31s/it] 9%|▊ | 5270/61904 [2:33:49<20:49:21, 1.32s/it] 9%|▊ | 5271/61904 [2:33:50<21:00:55, 1.34s/it] 9%|▊ | 5272/61904 [2:33:52<20:51:03, 1.33s/it] 9%|▊ | 5273/61904 [2:33:53<23:20:13, 1.48s/it] 9%|▊ | 5274/61904 [2:33:55<23:44:27, 1.51s/it] 9%|▊ | 5275/61904 [2:33:56<22:51:18, 1.45s/it] 9%|▊ | 5276/61904 [2:33:58<22:09:13, 1.41s/it] 9%|▊ | 5277/61904 [2:33:59<21:16:14, 1.35s/it] 9%|▊ | 5278/61904 [2:34:00<20:47:06, 1.32s/it] 9%|▊ | 5279/61904 [2:34:01<20:45:50, 1.32s/it] 9%|▊ | 5280/61904 [2:34:03<21:12:44, 1.35s/it] {'loss': 2.9684, 'learning_rate': 1.917671463762479e-07, 'epoch': 1.36} 9%|▊ | 5280/61904 [2:34:03<21:12:44, 1.35s/it] 9%|▊ | 5281/61904 [2:34:04<21:30:45, 1.37s/it] 9%|▊ | 5282/61904 [2:34:06<21:29:07, 1.37s/it] 9%|▊ | 5283/61904 [2:34:07<21:20:55, 1.36s/it] 9%|▊ | 5284/61904 [2:34:08<21:08:04, 1.34s/it] 9%|▊ | 5285/61904 [2:34:10<21:00:45, 1.34s/it] 9%|▊ | 5286/61904 [2:34:11<20:50:59, 1.33s/it] 9%|▊ | 5287/61904 [2:34:12<21:31:40, 1.37s/it] 9%|▊ | 5288/61904 [2:34:14<21:22:40, 1.36s/it] 9%|▊ | 5289/61904 [2:34:15<22:41:00, 1.44s/it] 9%|▊ | 5290/61904 [2:34:17<22:00:31, 1.40s/it] 9%|▊ | 5291/61904 [2:34:18<21:19:57, 1.36s/it] 9%|▊ | 5292/61904 
[2:34:19<21:33:30, 1.37s/it] 9%|▊ | 5293/61904 [2:34:21<20:53:57, 1.33s/it] 9%|▊ | 5294/61904 [2:34:22<20:33:32, 1.31s/it] 9%|▊ | 5295/61904 [2:34:23<20:51:15, 1.33s/it] 9%|▊ | 5296/61904 [2:34:24<20:44:00, 1.32s/it] 9%|▊ | 5297/61904 [2:34:26<20:59:12, 1.33s/it] 9%|▊ | 5298/61904 [2:34:27<21:25:42, 1.36s/it] 9%|▊ | 5299/61904 [2:34:29<21:39:17, 1.38s/it] 9%|▊ | 5300/61904 [2:34:30<21:54:20, 1.39s/it] {'loss': 3.0331, 'learning_rate': 1.9173473356670554e-07, 'epoch': 1.37} 9%|▊ | 5300/61904 [2:34:30<21:54:20, 1.39s/it] 9%|▊ | 5301/61904 [2:34:31<21:49:14, 1.39s/it] 9%|▊ | 5302/61904 [2:34:33<21:43:36, 1.38s/it] 9%|▊ | 5303/61904 [2:34:34<21:32:50, 1.37s/it] 9%|▊ | 5304/61904 [2:34:35<20:53:59, 1.33s/it] 9%|▊ | 5305/61904 [2:34:37<20:54:52, 1.33s/it] 9%|▊ | 5306/61904 [2:34:38<21:06:16, 1.34s/it] 9%|▊ | 5307/61904 [2:34:40<21:33:52, 1.37s/it] 9%|▊ | 5308/61904 [2:34:41<21:32:50, 1.37s/it] 9%|▊ | 5309/61904 [2:34:42<21:36:49, 1.37s/it] 9%|▊ | 5310/61904 [2:34:44<21:26:52, 1.36s/it] 9%|▊ | 5311/61904 [2:34:45<21:20:02, 1.36s/it] 9%|▊ | 5312/61904 [2:34:46<21:47:48, 1.39s/it] 9%|▊ | 5313/61904 [2:34:48<21:06:37, 1.34s/it] 9%|▊ | 5314/61904 [2:34:49<20:52:38, 1.33s/it] 9%|▊ | 5315/61904 [2:34:50<20:44:29, 1.32s/it] 9%|▊ | 5316/61904 [2:34:52<20:20:27, 1.29s/it] 9%|▊ | 5317/61904 [2:34:53<20:24:47, 1.30s/it] 9%|▊ | 5318/61904 [2:34:54<20:18:04, 1.29s/it] 9%|▊ | 5319/61904 [2:34:56<20:58:25, 1.33s/it] 9%|▊ | 5320/61904 [2:34:57<20:27:25, 1.30s/it] {'loss': 3.0052, 'learning_rate': 1.9170232075716323e-07, 'epoch': 1.37} 9%|▊ | 5320/61904 [2:34:57<20:27:25, 1.30s/it] 9%|▊ | 5321/61904 [2:34:58<20:10:17, 1.28s/it] 9%|▊ | 5322/61904 [2:34:59<20:15:13, 1.29s/it] 9%|▊ | 5323/61904 [2:35:01<20:08:47, 1.28s/it] 9%|▊ | 5324/61904 [2:35:02<20:05:44, 1.28s/it] 9%|▊ | 5325/61904 [2:35:03<20:20:07, 1.29s/it] 9%|▊ | 5326/61904 [2:35:04<20:27:40, 1.30s/it] 9%|▊ | 5327/61904 [2:35:06<20:44:51, 1.32s/it] 9%|▊ | 5328/61904 [2:35:07<20:46:35, 1.32s/it] 9%|▊ | 5329/61904 [2:35:09<21:17:40, 
1.36s/it] 9%|▊ | 5330/61904 [2:35:10<21:16:01, 1.35s/it] 9%|▊ | 5331/61904 [2:35:11<21:36:56, 1.38s/it] 9%|▊ | 5332/61904 [2:35:13<21:02:44, 1.34s/it] 9%|▊ | 5333/61904 [2:35:14<21:37:16, 1.38s/it] 9%|▊ | 5334/61904 [2:35:15<21:21:26, 1.36s/it] 9%|▊ | 5335/61904 [2:35:17<21:04:45, 1.34s/it] 9%|▊ | 5336/61904 [2:35:18<21:49:11, 1.39s/it] 9%|▊ | 5337/61904 [2:35:20<21:51:21, 1.39s/it] 9%|▊ | 5338/61904 [2:35:21<21:37:57, 1.38s/it] 9%|▊ | 5339/61904 [2:35:22<21:49:52, 1.39s/it] 9%|▊ | 5340/61904 [2:35:24<22:29:59, 1.43s/it] {'loss': 2.9511, 'learning_rate': 1.916699079476209e-07, 'epoch': 1.38} 9%|▊ | 5340/61904 [2:35:24<22:29:59, 1.43s/it] 9%|▊ | 5341/61904 [2:35:25<22:34:58, 1.44s/it] 9%|▊ | 5342/61904 [2:35:27<22:43:17, 1.45s/it] 9%|▊ | 5343/61904 [2:35:28<22:24:40, 1.43s/it] 9%|▊ | 5344/61904 [2:35:29<21:26:50, 1.37s/it] 9%|▊ | 5345/61904 [2:35:31<21:36:40, 1.38s/it] 9%|▊ | 5346/61904 [2:35:32<20:53:47, 1.33s/it] 9%|▊ | 5347/61904 [2:35:33<20:55:23, 1.33s/it] 9%|▊ | 5348/61904 [2:35:35<21:36:47, 1.38s/it] 9%|▊ | 5349/61904 [2:35:36<21:44:27, 1.38s/it] 9%|▊ | 5350/61904 [2:35:38<21:38:19, 1.38s/it] 9%|▊ | 5351/61904 [2:35:39<21:38:03, 1.38s/it] 9%|▊ | 5352/61904 [2:35:40<21:59:28, 1.40s/it] 9%|▊ | 5353/61904 [2:35:42<21:08:21, 1.35s/it] 9%|▊ | 5354/61904 [2:35:43<21:09:27, 1.35s/it] 9%|▊ | 5355/61904 [2:35:44<21:33:45, 1.37s/it] 9%|▊ | 5356/61904 [2:35:46<22:15:06, 1.42s/it] 9%|▊ | 5357/61904 [2:35:47<21:46:18, 1.39s/it] 9%|▊ | 5358/61904 [2:35:49<21:25:59, 1.36s/it] 9%|▊ | 5359/61904 [2:35:50<21:19:48, 1.36s/it] 9%|▊ | 5360/61904 [2:35:51<20:56:21, 1.33s/it] {'loss': 2.9444, 'learning_rate': 1.9163749513807855e-07, 'epoch': 1.39} 9%|▊ | 5360/61904 [2:35:51<20:56:21, 1.33s/it] 9%|▊ | 5361/61904 [2:35:53<21:02:56, 1.34s/it] 9%|▊ | 5362/61904 [2:35:54<21:09:26, 1.35s/it] 9%|▊ | 5363/61904 [2:35:56<22:29:11, 1.43s/it] 9%|▊ | 5364/61904 [2:35:57<22:27:36, 1.43s/it] 9%|▊ | 5365/61904 [2:35:58<21:41:11, 1.38s/it] 9%|▊ | 5366/61904 [2:36:00<21:31:51, 1.37s/it] 9%|▊ | 
5367/61904 [2:36:01<21:08:48, 1.35s/it] 9%|▊ | 5368/61904 [2:36:02<20:52:57, 1.33s/it] 9%|▊ | 5369/61904 [2:36:04<21:06:29, 1.34s/it] 9%|▊ | 5370/61904 [2:36:05<21:06:16, 1.34s/it] 9%|▊ | 5371/61904 [2:36:06<21:02:53, 1.34s/it] 9%|▊ | 5372/61904 [2:36:08<20:42:44, 1.32s/it] 9%|▊ | 5373/61904 [2:36:09<20:46:24, 1.32s/it] 9%|▊ | 5374/61904 [2:36:10<21:22:03, 1.36s/it] 9%|▊ | 5375/61904 [2:36:12<21:00:27, 1.34s/it] 9%|▊ | 5376/61904 [2:36:13<21:33:54, 1.37s/it] 9%|▊ | 5377/61904 [2:36:14<21:39:29, 1.38s/it] 9%|▊ | 5378/61904 [2:36:16<21:36:45, 1.38s/it] 9%|▊ | 5379/61904 [2:36:17<20:57:48, 1.34s/it] 9%|▊ | 5380/61904 [2:36:18<21:16:52, 1.36s/it] {'loss': 2.9757, 'learning_rate': 1.9160508232853624e-07, 'epoch': 1.39} 9%|▊ | 5380/61904 [2:36:18<21:16:52, 1.36s/it] 9%|▊ | 5381/61904 [2:36:20<21:25:32, 1.36s/it] 9%|▊ | 5382/61904 [2:36:21<21:38:45, 1.38s/it] 9%|▊ | 5383/61904 [2:36:22<20:47:35, 1.32s/it] 9%|▊ | 5384/61904 [2:36:24<20:34:35, 1.31s/it] 9%|▊ | 5385/61904 [2:36:25<21:09:58, 1.35s/it] 9%|▊ | 5386/61904 [2:36:26<20:38:07, 1.31s/it] 9%|▊ | 5387/61904 [2:36:28<20:13:13, 1.29s/it] 9%|▊ | 5388/61904 [2:36:29<20:29:08, 1.30s/it] 9%|▊ | 5389/61904 [2:36:30<21:04:01, 1.34s/it] 9%|▊ | 5390/61904 [2:36:32<21:58:36, 1.40s/it] 9%|▊ | 5391/61904 [2:36:33<21:24:50, 1.36s/it] 9%|▊ | 5392/61904 [2:36:35<21:53:18, 1.39s/it] 9%|▊ | 5393/61904 [2:36:36<22:27:05, 1.43s/it] 9%|▊ | 5394/61904 [2:36:37<21:17:19, 1.36s/it] 9%|▊ | 5395/61904 [2:36:39<20:47:06, 1.32s/it] 9%|▊ | 5396/61904 [2:36:40<21:01:02, 1.34s/it] 9%|▊ | 5397/61904 [2:36:41<20:21:18, 1.30s/it] 9%|▊ | 5398/61904 [2:36:42<20:22:41, 1.30s/it] 9%|▊ | 5399/61904 [2:36:44<20:54:24, 1.33s/it] 9%|▊ | 5400/61904 [2:36:45<20:57:44, 1.34s/it] {'loss': 3.0247, 'learning_rate': 1.915726695189939e-07, 'epoch': 1.4} 9%|▊ | 5400/61904 [2:36:45<20:57:44, 1.34s/it] 9%|▊ | 5401/61904 [2:36:47<21:07:10, 1.35s/it] 9%|▊ | 5402/61904 [2:36:48<20:57:27, 1.34s/it] 9%|▊ | 5403/61904 [2:36:49<21:41:05, 1.38s/it] 9%|▊ | 5404/61904 
[2:36:51<21:04:22, 1.34s/it] 9%|▊ | 5405/61904 [2:36:52<20:30:50, 1.31s/it] 9%|▊ | 5406/61904 [2:36:53<20:40:32, 1.32s/it] 9%|▊ | 5407/61904 [2:36:55<21:09:22, 1.35s/it] 9%|▊ | 5408/61904 [2:36:56<21:43:29, 1.38s/it] 9%|▊ | 5409/61904 [2:36:57<21:36:20, 1.38s/it] 9%|▊ | 5410/61904 [2:36:59<21:36:56, 1.38s/it] 9%|▊ | 5411/61904 [2:37:00<21:13:25, 1.35s/it] 9%|▊ | 5412/61904 [2:37:01<21:06:56, 1.35s/it] 9%|▊ | 5413/61904 [2:37:03<20:49:16, 1.33s/it] 9%|▊ | 5414/61904 [2:37:04<21:26:44, 1.37s/it] 9%|▊ | 5415/61904 [2:37:05<20:50:13, 1.33s/it] 9%|▊ | 5416/61904 [2:37:07<22:46:48, 1.45s/it] 9%|▉ | 5417/61904 [2:37:09<22:19:06, 1.42s/it] 9%|▉ | 5418/61904 [2:37:10<21:45:29, 1.39s/it] 9%|▉ | 5419/61904 [2:37:11<21:34:15, 1.37s/it] 9%|▉ | 5420/61904 [2:37:13<21:49:52, 1.39s/it] {'loss': 3.0173, 'learning_rate': 1.9154025670945156e-07, 'epoch': 1.4} 9%|▉ | 5420/61904 [2:37:13<21:49:52, 1.39s/it] 9%|▉ | 5421/61904 [2:37:14<22:10:06, 1.41s/it] 9%|▉ | 5422/61904 [2:37:15<21:46:15, 1.39s/it] 9%|▉ | 5423/61904 [2:37:17<21:44:27, 1.39s/it] 9%|▉ | 5424/61904 [2:37:18<21:33:06, 1.37s/it] 9%|▉ | 5425/61904 [2:37:20<22:14:55, 1.42s/it] 9%|▉ | 5426/61904 [2:37:21<22:19:41, 1.42s/it] 9%|▉ | 5427/61904 [2:37:23<22:46:34, 1.45s/it] 9%|▉ | 5428/61904 [2:37:24<23:16:30, 1.48s/it] 9%|▉ | 5429/61904 [2:37:26<23:05:23, 1.47s/it] 9%|▉ | 5430/61904 [2:37:27<22:37:02, 1.44s/it] 9%|▉ | 5431/61904 [2:37:28<22:26:08, 1.43s/it] 9%|▉ | 5432/61904 [2:37:30<22:45:25, 1.45s/it] 9%|▉ | 5433/61904 [2:37:31<22:01:38, 1.40s/it] 9%|▉ | 5434/61904 [2:37:33<21:47:19, 1.39s/it] 9%|▉ | 5435/61904 [2:37:34<21:29:25, 1.37s/it] 9%|▉ | 5436/61904 [2:37:35<21:00:36, 1.34s/it] 9%|▉ | 5437/61904 [2:37:36<20:54:47, 1.33s/it] 9%|▉ | 5438/61904 [2:37:38<21:21:12, 1.36s/it] 9%|▉ | 5439/61904 [2:37:39<21:16:13, 1.36s/it] 9%|▉ | 5440/61904 [2:37:41<21:31:29, 1.37s/it] {'loss': 2.9848, 'learning_rate': 1.9150784389990925e-07, 'epoch': 1.41} 9%|▉ | 5440/61904 [2:37:41<21:31:29, 1.37s/it] 9%|▉ | 5441/61904 [2:37:42<21:00:14, 
1.34s/it] 9%|▉ | 5442/61904 [2:37:43<21:12:06, 1.35s/it] 9%|▉ | 5443/61904 [2:37:45<20:44:45, 1.32s/it] 9%|▉ | 5444/61904 [2:37:46<20:53:07, 1.33s/it] 9%|▉ | 5445/61904 [2:37:47<20:54:21, 1.33s/it] 9%|▉ | 5446/61904 [2:37:49<21:00:50, 1.34s/it] 9%|▉ | 5447/61904 [2:37:50<21:12:11, 1.35s/it] 9%|▉ | 5448/61904 [2:37:51<21:31:45, 1.37s/it] 9%|▉ | 5449/61904 [2:37:53<21:39:37, 1.38s/it] 9%|▉ | 5450/61904 [2:37:54<21:25:40, 1.37s/it] 9%|▉ | 5451/61904 [2:37:56<21:52:11, 1.39s/it] 9%|▉ | 5452/61904 [2:37:57<21:42:33, 1.38s/it] 9%|▉ | 5453/61904 [2:37:59<23:23:23, 1.49s/it] 9%|▉ | 5454/61904 [2:38:00<22:25:02, 1.43s/it] 9%|▉ | 5455/61904 [2:38:01<22:11:28, 1.42s/it] 9%|▉ | 5456/61904 [2:38:03<22:50:04, 1.46s/it] 9%|▉ | 5457/61904 [2:38:04<21:44:28, 1.39s/it] 9%|▉ | 5458/61904 [2:38:06<21:59:32, 1.40s/it] 9%|▉ | 5459/61904 [2:38:07<22:16:20, 1.42s/it] 9%|▉ | 5460/61904 [2:38:08<21:29:17, 1.37s/it] {'loss': 2.9544, 'learning_rate': 1.914754310903669e-07, 'epoch': 1.41} 9%|▉ | 5460/61904 [2:38:08<21:29:17, 1.37s/it] 9%|▉ | 5461/61904 [2:38:10<21:24:49, 1.37s/it] 9%|▉ | 5462/61904 [2:38:11<21:52:50, 1.40s/it] 9%|▉ | 5463/61904 [2:38:12<21:37:34, 1.38s/it] 9%|▉ | 5464/61904 [2:38:14<21:06:45, 1.35s/it] 9%|▉ | 5465/61904 [2:38:15<20:38:43, 1.32s/it] 9%|▉ | 5466/61904 [2:38:16<21:17:34, 1.36s/it] 9%|▉ | 5467/61904 [2:38:18<20:53:28, 1.33s/it] 9%|▉ | 5468/61904 [2:38:19<20:37:37, 1.32s/it] 9%|▉ | 5469/61904 [2:38:20<20:32:32, 1.31s/it] 9%|▉ | 5470/61904 [2:38:22<21:50:07, 1.39s/it] 9%|▉ | 5471/61904 [2:38:23<21:18:29, 1.36s/it] 9%|▉ | 5472/61904 [2:38:24<20:54:36, 1.33s/it] 9%|▉ | 5473/61904 [2:38:26<21:02:25, 1.34s/it] 9%|▉ | 5474/61904 [2:38:27<20:19:28, 1.30s/it] 9%|▉ | 5475/61904 [2:38:28<20:45:59, 1.32s/it] 9%|▉ | 5476/61904 [2:38:30<20:23:44, 1.30s/it] 9%|▉ | 5477/61904 [2:38:31<20:55:08, 1.33s/it] 9%|▉ | 5478/61904 [2:38:32<21:25:22, 1.37s/it] 9%|▉ | 5479/61904 [2:38:34<21:04:20, 1.34s/it] 9%|▉ | 5480/61904 [2:38:35<21:03:05, 1.34s/it] {'loss': 2.9586, 'learning_rate': 
1.9144301828082458e-07, 'epoch': 1.42} 9%|▉ | 5480/61904 [2:38:35<21:03:05, 1.34s/it] 9%|▉ | 5481/61904 [2:38:37<21:23:41, 1.37s/it] 9%|▉ | 5482/61904 [2:38:38<20:58:08, 1.34s/it] 9%|▉ | 5483/61904 [2:38:39<21:18:59, 1.36s/it] 9%|▉ | 5484/61904 [2:38:41<21:46:55, 1.39s/it] 9%|▉ | 5485/61904 [2:38:42<21:31:30, 1.37s/it] 9%|▉ | 5486/61904 [2:38:43<21:32:43, 1.37s/it] 9%|▉ | 5487/61904 [2:38:45<21:09:54, 1.35s/it] 9%|▉ | 5488/61904 [2:38:46<20:47:30, 1.33s/it] 9%|▉ | 5489/61904 [2:38:47<21:04:59, 1.35s/it] 9%|▉ | 5490/61904 [2:38:49<20:58:18, 1.34s/it] 9%|▉ | 5491/61904 [2:38:50<20:50:39, 1.33s/it] 9%|▉ | 5492/61904 [2:38:51<20:43:43, 1.32s/it] 9%|▉ | 5493/61904 [2:38:53<21:19:33, 1.36s/it] 9%|▉ | 5494/61904 [2:38:54<20:52:23, 1.33s/it] 9%|▉ | 5495/61904 [2:38:55<20:27:56, 1.31s/it] 9%|▉ | 5496/61904 [2:38:57<20:28:25, 1.31s/it] 9%|▉ | 5497/61904 [2:38:58<21:21:44, 1.36s/it] 9%|▉ | 5498/61904 [2:38:59<21:43:37, 1.39s/it] 9%|▉ | 5499/61904 [2:39:01<21:39:20, 1.38s/it] 9%|▉ | 5500/61904 [2:39:02<21:26:42, 1.37s/it] {'loss': 2.9663, 'learning_rate': 1.9141060547128224e-07, 'epoch': 1.42} 9%|▉ | 5500/61904 [2:39:02<21:26:42, 1.37s/it] 9%|▉ | 5501/61904 [2:39:04<21:25:17, 1.37s/it] 9%|▉ | 5502/61904 [2:39:05<21:20:08, 1.36s/it] 9%|▉ | 5503/61904 [2:39:06<21:28:06, 1.37s/it] 9%|▉ | 5504/61904 [2:39:08<21:05:10, 1.35s/it] 9%|▉ | 5505/61904 [2:39:09<20:57:46, 1.34s/it] 9%|▉ | 5506/61904 [2:39:10<22:06:12, 1.41s/it] 9%|▉ | 5507/61904 [2:39:12<22:03:06, 1.41s/it] 9%|▉ | 5508/61904 [2:39:13<22:12:51, 1.42s/it] 9%|▉ | 5509/61904 [2:39:15<22:23:49, 1.43s/it] 9%|▉ | 5510/61904 [2:39:16<22:19:43, 1.43s/it] 9%|▉ | 5511/61904 [2:39:18<21:50:48, 1.39s/it] 9%|▉ | 5512/61904 [2:39:19<22:59:25, 1.47s/it] 9%|▉ | 5513/61904 [2:39:21<22:51:37, 1.46s/it] 9%|▉ | 5514/61904 [2:39:22<23:13:49, 1.48s/it] 9%|▉ | 5515/61904 [2:39:23<22:41:33, 1.45s/it] 9%|▉ | 5516/61904 [2:39:25<22:12:11, 1.42s/it] 9%|▉ | 5517/61904 [2:39:26<22:11:16, 1.42s/it] 9%|▉ | 5518/61904 [2:39:28<21:55:02, 1.40s/it] 9%|▉ | 
5519/61904 [2:39:29<22:00:44, 1.41s/it] 9%|▉ | 5520/61904 [2:39:30<21:53:36, 1.40s/it] {'loss': 2.9736, 'learning_rate': 1.913781926617399e-07, 'epoch': 1.43} 9%|▉ | 5520/61904 [2:39:30<21:53:36, 1.40s/it] 9%|▉ | 5521/61904 [2:39:32<21:32:35, 1.38s/it] 9%|▉ | 5522/61904 [2:39:33<22:06:56, 1.41s/it] 9%|▉ | 5523/61904 [2:39:35<22:21:33, 1.43s/it] 9%|▉ | 5524/61904 [2:39:36<23:07:41, 1.48s/it] 9%|▉ | 5525/61904 [2:39:38<22:25:35, 1.43s/it] 9%|▉ | 5526/61904 [2:39:39<23:18:05, 1.49s/it] 9%|▉ | 5527/61904 [2:39:41<22:54:25, 1.46s/it] 9%|▉ | 5528/61904 [2:39:42<23:39:59, 1.51s/it] 9%|▉ | 5529/61904 [2:39:44<23:18:45, 1.49s/it] 9%|▉ | 5530/61904 [2:39:45<22:42:55, 1.45s/it] 9%|▉ | 5531/61904 [2:39:47<23:24:10, 1.49s/it] 9%|▉ | 5532/61904 [2:39:48<22:26:04, 1.43s/it] 9%|▉ | 5533/61904 [2:39:49<21:50:23, 1.39s/it] 9%|▉ | 5534/61904 [2:39:51<23:21:25, 1.49s/it] 9%|▉ | 5535/61904 [2:39:52<22:28:16, 1.44s/it] 9%|▉ | 5536/61904 [2:39:54<22:48:09, 1.46s/it] 9%|▉ | 5537/61904 [2:39:55<22:28:39, 1.44s/it] 9%|▉ | 5538/61904 [2:39:57<22:32:21, 1.44s/it] 9%|▉ | 5539/61904 [2:39:58<22:15:10, 1.42s/it] 9%|▉ | 5540/61904 [2:39:59<21:16:20, 1.36s/it] {'loss': 2.9392, 'learning_rate': 1.913457798521976e-07, 'epoch': 1.43} 9%|▉ | 5540/61904 [2:39:59<21:16:20, 1.36s/it] 9%|▉ | 5541/61904 [2:40:00<20:52:07, 1.33s/it] 9%|▉ | 5542/61904 [2:40:02<21:19:31, 1.36s/it] 9%|▉ | 5543/61904 [2:40:03<20:56:46, 1.34s/it] 9%|▉ | 5544/61904 [2:40:05<20:56:07, 1.34s/it] 9%|▉ | 5545/61904 [2:40:06<20:56:28, 1.34s/it] 9%|▉ | 5546/61904 [2:40:07<21:13:39, 1.36s/it] 9%|▉ | 5547/61904 [2:40:09<21:04:53, 1.35s/it] 9%|▉ | 5548/61904 [2:40:10<21:31:35, 1.38s/it] 9%|▉ | 5549/61904 [2:40:11<21:14:32, 1.36s/it] 9%|▉ | 5550/61904 [2:40:13<21:14:25, 1.36s/it] 9%|▉ | 5551/61904 [2:40:14<20:52:01, 1.33s/it] 9%|▉ | 5552/61904 [2:40:15<21:19:24, 1.36s/it] 9%|▉ | 5553/61904 [2:40:17<20:47:15, 1.33s/it] 9%|▉ | 5554/61904 [2:40:18<20:53:25, 1.33s/it] 9%|▉ | 5555/61904 [2:40:20<22:15:43, 1.42s/it] 9%|▉ | 5556/61904 
[2:40:21<21:52:54, 1.40s/it] 9%|▉ | 5557/61904 [2:40:22<21:30:07, 1.37s/it] 9%|▉ | 5558/61904 [2:40:24<21:51:07, 1.40s/it] 9%|▉ | 5559/61904 [2:40:25<21:27:10, 1.37s/it] 9%|▉ | 5560/61904 [2:40:26<21:28:02, 1.37s/it] {'loss': 2.9724, 'learning_rate': 1.9131336704265525e-07, 'epoch': 1.44} 9%|▉ | 5560/61904 [2:40:26<21:28:02, 1.37s/it] 9%|▉ | 5561/61904 [2:40:28<21:14:31, 1.36s/it] 9%|▉ | 5562/61904 [2:40:29<20:51:53, 1.33s/it] 9%|▉ | 5563/61904 [2:40:30<20:19:57, 1.30s/it] 9%|▉ | 5564/61904 [2:40:32<21:35:31, 1.38s/it] 9%|▉ | 5565/61904 [2:40:33<21:54:31, 1.40s/it] 9%|▉ | 5566/61904 [2:40:35<21:47:19, 1.39s/it] 9%|▉ | 5567/61904 [2:40:36<22:14:07, 1.42s/it] 9%|▉ | 5568/61904 [2:40:38<22:04:46, 1.41s/it] 9%|▉ | 5569/61904 [2:40:39<21:37:10, 1.38s/it] 9%|▉ | 5570/61904 [2:40:40<21:21:00, 1.36s/it] 9%|▉ | 5571/61904 [2:40:42<21:27:10, 1.37s/it] 9%|▉ | 5572/61904 [2:40:43<21:08:21, 1.35s/it] 9%|▉ | 5573/61904 [2:40:44<21:35:34, 1.38s/it] 9%|▉ | 5574/61904 [2:40:46<21:12:30, 1.36s/it] 9%|▉ | 5575/61904 [2:40:47<20:54:43, 1.34s/it] 9%|▉ | 5576/61904 [2:40:48<21:14:36, 1.36s/it] 9%|▉ | 5577/61904 [2:40:50<21:08:00, 1.35s/it] 9%|▉ | 5578/61904 [2:40:51<22:10:55, 1.42s/it] 9%|▉ | 5579/61904 [2:40:52<21:28:12, 1.37s/it] 9%|▉ | 5580/61904 [2:40:54<21:12:04, 1.36s/it] {'loss': 2.9819, 'learning_rate': 1.912809542331129e-07, 'epoch': 1.44} 9%|▉ | 5580/61904 [2:40:54<21:12:04, 1.36s/it] 9%|▉ | 5581/61904 [2:40:55<21:28:55, 1.37s/it] 9%|▉ | 5582/61904 [2:40:57<21:23:36, 1.37s/it] 9%|▉ | 5583/61904 [2:40:58<21:00:08, 1.34s/it] 9%|▉ | 5584/61904 [2:40:59<21:06:17, 1.35s/it] 9%|▉ | 5585/61904 [2:41:01<21:00:29, 1.34s/it] 9%|▉ | 5586/61904 [2:41:02<21:28:38, 1.37s/it] 9%|▉ | 5587/61904 [2:41:03<21:16:54, 1.36s/it] 9%|▉ | 5588/61904 [2:41:05<21:27:46, 1.37s/it] 9%|▉ | 5589/61904 [2:41:06<21:31:45, 1.38s/it] 9%|▉ | 5590/61904 [2:41:07<21:30:56, 1.38s/it] 9%|▉ | 5591/61904 [2:41:09<21:35:14, 1.38s/it] 9%|▉ | 5592/61904 [2:41:10<21:52:49, 1.40s/it] 9%|▉ | 5593/61904 [2:41:12<21:43:38, 
1.39s/it]
9%|▉ | 5600/61904 [2:41:21<21:58:58, 1.41s/it] {'loss': 2.8861, 'learning_rate': 1.912485414235706e-07, 'epoch': 1.45}
9%|▉ | 5620/61904 [2:41:49<22:00:24, 1.41s/it] {'loss': 2.9408, 'learning_rate': 1.9121612861402824e-07, 'epoch': 1.45}
9%|▉ | 5640/61904 [2:42:16<21:42:57, 1.39s/it] {'loss': 3.0236, 'learning_rate': 1.9118371580448592e-07, 'epoch': 1.46}
9%|▉ | 5660/61904 [2:42:43<20:36:32, 1.32s/it] {'loss': 2.9481, 'learning_rate': 1.911513029949436e-07, 'epoch': 1.46}
9%|▉ | 5680/61904 [2:43:10<20:58:39, 1.34s/it] {'loss': 2.9892, 'learning_rate': 1.9111889018540125e-07, 'epoch': 1.47}
9%|▉ | 5700/61904 [2:43:37<21:11:25, 1.36s/it] {'loss': 2.9723, 'learning_rate': 1.9108647737585894e-07, 'epoch': 1.47}
9%|▉ | 5720/61904 [2:44:05<21:59:44, 1.41s/it] {'loss': 2.8791, 'learning_rate': 1.910540645663166e-07, 'epoch': 1.48}
9%|▉ | 5740/61904 [2:44:32<20:31:42, 1.32s/it] {'loss': 2.9381, 'learning_rate': 1.9102165175677426e-07, 'epoch': 1.48}
9%|▉ | 5760/61904 [2:45:00<21:19:46, 1.37s/it] {'loss': 2.8873, 'learning_rate': 1.9098923894723195e-07, 'epoch': 1.49}
9%|▉ | 5780/61904 [2:45:27<21:10:52, 1.36s/it] {'loss': 2.8586, 'learning_rate': 1.909568261376896e-07, 'epoch': 1.49}
9%|▉ | 5800/61904 [2:45:54<20:20:32, 1.31s/it] {'loss': 2.9308, 'learning_rate': 1.9092441332814727e-07, 'epoch': 1.5}
9%|▉ | 5820/61904 [2:46:22<21:31:22, 1.38s/it] {'loss': 2.9222, 'learning_rate': 1.9089200051860496e-07, 'epoch': 1.5}
9%|▉ | 5840/61904 [2:46:49<21:19:56, 1.37s/it] {'loss': 2.9787, 'learning_rate': 1.908595877090626e-07, 'epoch': 1.51}
9%|▉ | 5860/61904 [2:47:16<20:35:21, 1.32s/it] {'loss': 2.9207, 'learning_rate': 1.9082717489952028e-07, 'epoch': 1.51}
9%|▉ | 5880/61904 [2:47:43<21:50:28, 1.40s/it] {'loss': 2.9304, 'learning_rate': 1.9079476208997797e-07, 'epoch': 1.52}
10%|▉ | 5900/61904 [2:48:11<21:26:43, 1.38s/it] {'loss': 2.942, 'learning_rate': 1.907623492804356e-07, 'epoch': 1.52}
10%|▉ | 5920/61904 [2:48:38<20:38:07, 1.33s/it] {'loss': 2.9672, 'learning_rate': 1.907299364708933e-07, 'epoch': 1.53}
10%|▉ | 5940/61904 [2:49:05<21:03:44, 1.35s/it] {'loss': 2.9461, 'learning_rate': 1.9069752366135096e-07, 'epoch': 1.54}
10%|▉ | 5960/61904 [2:49:32<21:37:58, 1.39s/it] {'loss': 2.9206, 'learning_rate': 1.9066511085180862e-07, 'epoch': 1.54}
10%|▉ | 5980/61904 [2:50:00<21:09:25, 1.36s/it] {'loss': 3.0094, 'learning_rate': 1.906326980422663e-07, 'epoch': 1.55}
10%|▉ | 6000/61904 [2:50:27<22:06:56, 1.42s/it] {'loss': 2.9333, 'learning_rate': 1.9060028523272397e-07, 'epoch': 1.55}
10%|▉ | 6020/61904 [2:50:54<21:22:30, 1.38s/it] {'loss': 2.8769, 'learning_rate': 1.9056787242318163e-07, 'epoch': 1.56}
10%|▉ | 6040/61904 [2:51:22<21:34:37, 1.39s/it] {'loss': 2.8884, 'learning_rate': 1.9053545961363932e-07, 'epoch': 1.56}
10%|▉ | 6060/61904 [2:51:50<21:19:20, 1.37s/it] {'loss': 2.9742, 'learning_rate': 1.9050304680409696e-07, 'epoch': 1.57}
10%|▉ | 6080/61904 [2:52:17<20:39:58, 1.33s/it] {'loss': 3.0092, 'learning_rate': 1.9047063399455464e-07, 'epoch': 1.57}
10%|▉ | 6100/61904 [2:52:44<19:35:37, 1.26s/it] {'loss': 2.9553, 'learning_rate': 1.904382211850123e-07, 'epoch': 1.58}
10%|▉ | 6120/61904 [2:53:12<21:31:15, 1.39s/it] {'loss': 2.8705, 'learning_rate': 1.9040580837546997e-07, 'epoch': 1.58}
10%|▉ | 6140/61904 [2:53:39<21:38:34, 1.40s/it] {'loss': 2.9877, 'learning_rate': 1.9037339556592766e-07, 'epoch': 1.59}
10%|▉ | 6160/61904 [2:54:06<20:23:13, 1.32s/it] {'loss': 2.966, 'learning_rate': 1.9034098275638532e-07, 'epoch': 1.59}
10%|▉ | 6180/61904 [2:54:33<19:48:35, 1.28s/it] {'loss': 2.9414, 'learning_rate': 1.9030856994684298e-07, 'epoch': 1.6}
10%|█ | 6200/61904 [2:55:00<20:33:11, 1.33s/it] {'loss': 2.9618, 'learning_rate': 1.9027615713730067e-07, 'epoch': 1.6}
10%|█ | 6220/61904 [2:55:27<20:54:05, 1.35s/it] {'loss': 2.9328, 'learning_rate': 1.902437443277583e-07, 'epoch': 1.61}
10%|█ | 6240/61904 [2:55:55<22:06:13, 1.43s/it] {'loss': 2.8855, 'learning_rate': 1.90211331518216e-07, 'epoch': 1.61}
10%|█ | 6260/61904 [2:56:22<21:25:28, 1.39s/it] {'loss': 2.9168, 'learning_rate': 1.9017891870867368e-07, 'epoch': 1.62}
10%|█ | 6280/61904 [2:56:50<20:52:42, 1.35s/it] {'loss': 2.9443, 'learning_rate': 1.9014650589913132e-07, 'epoch': 1.62}
10%|█ | 6300/61904 [2:57:18<20:56:29, 1.36s/it] {'loss': 2.9139, 'learning_rate': 1.90114093089589e-07, 'epoch': 1.63}
10%|█ | 6320/61904 [2:57:45<21:25:49, 1.39s/it] {'loss': 2.9343, 'learning_rate': 1.9008168028004667e-07, 'epoch': 1.63}
10%|█ | 6337/61904 [2:58:08<20:37:16, 1.34s/it]
10%|█ | 6338/61904 [2:58:09<20:39:36, 1.34s/it] 10%|█ | 6339/61904 [2:58:10<21:27:38, 1.39s/it] 10%|█ | 6340/61904 [2:58:12<21:05:05, 1.37s/it] {'loss': 2.9279, 'learning_rate': 1.9004926747050433e-07, 'epoch': 1.64} 10%|█ | 6340/61904 [2:58:12<21:05:05, 1.37s/it] 10%|█ | 6341/61904 [2:58:13<21:18:03, 1.38s/it] 10%|█ | 6342/61904 [2:58:15<21:40:15, 1.40s/it] 10%|█ | 6343/61904 [2:58:16<21:30:43, 1.39s/it] 10%|█ | 6344/61904 [2:58:17<21:26:41, 1.39s/it] 10%|█ | 6345/61904 [2:58:19<21:21:01, 1.38s/it] 10%|█ | 6346/61904 [2:58:20<21:08:33, 1.37s/it] 10%|█ | 6347/61904 [2:58:21<20:44:34, 1.34s/it] 10%|█ | 6348/61904 [2:58:23<20:31:43, 1.33s/it] 10%|█ | 6349/61904 [2:58:24<20:59:32, 1.36s/it] 10%|█ | 6350/61904 [2:58:25<20:58:16, 1.36s/it] 10%|█ | 6351/61904 [2:58:27<20:57:12, 1.36s/it] 10%|█ | 6352/61904 [2:58:28<21:06:45, 1.37s/it] 10%|█ | 6353/61904 [2:58:30<21:34:38, 1.40s/it] 10%|█ | 6354/61904 [2:58:31<21:39:53, 1.40s/it] 10%|█ | 6355/61904 [2:58:32<21:14:09, 1.38s/it] 10%|█ | 6356/61904 [2:58:34<21:19:37, 1.38s/it] 10%|█ | 6357/61904 [2:58:35<20:52:50, 1.35s/it] 10%|█ | 6358/61904 [2:58:36<21:01:38, 1.36s/it] 10%|█ | 6359/61904 [2:58:38<20:46:41, 1.35s/it] 10%|█ | 6360/61904 [2:58:39<21:48:31, 1.41s/it] {'loss': 2.8902, 'learning_rate': 1.9001685466096202e-07, 'epoch': 1.64} 10%|█ | 6360/61904 [2:58:39<21:48:31, 1.41s/it] 10%|█ | 6361/61904 [2:58:41<21:45:57, 1.41s/it] 10%|█ | 6362/61904 [2:58:42<21:19:06, 1.38s/it] 10%|█ | 6363/61904 [2:58:43<21:08:53, 1.37s/it] 10%|█ | 6364/61904 [2:58:45<20:40:23, 1.34s/it] 10%|█ | 6365/61904 [2:58:46<20:30:21, 1.33s/it] 10%|█ | 6366/61904 [2:58:47<20:35:32, 1.33s/it] 10%|█ | 6367/61904 [2:58:49<20:55:04, 1.36s/it] 10%|█ | 6368/61904 [2:58:50<21:37:51, 1.40s/it] 10%|█ | 6369/61904 [2:58:52<21:49:28, 1.41s/it] 10%|█ | 6370/61904 [2:58:53<21:31:39, 1.40s/it] 10%|█ | 6371/61904 [2:58:54<21:04:06, 1.37s/it] 10%|█ | 6372/61904 [2:58:56<21:14:46, 1.38s/it] 10%|█ | 6373/61904 [2:58:57<21:38:13, 1.40s/it] 10%|█ | 6374/61904 
[2:58:58<21:18:26, 1.38s/it] 10%|█ | 6375/61904 [2:59:00<21:40:44, 1.41s/it] 10%|█ | 6376/61904 [2:59:02<22:22:30, 1.45s/it] 10%|█ | 6377/61904 [2:59:03<21:50:17, 1.42s/it] 10%|█ | 6378/61904 [2:59:04<22:22:00, 1.45s/it] 10%|█ | 6379/61904 [2:59:06<21:55:53, 1.42s/it] 10%|█ | 6380/61904 [2:59:07<21:53:37, 1.42s/it] {'loss': 2.9188, 'learning_rate': 1.8998444185141968e-07, 'epoch': 1.65} 10%|█ | 6380/61904 [2:59:07<21:53:37, 1.42s/it] 10%|█ | 6381/61904 [2:59:09<21:38:50, 1.40s/it] 10%|█ | 6382/61904 [2:59:10<22:17:42, 1.45s/it] 10%|█ | 6383/61904 [2:59:11<22:09:22, 1.44s/it] 10%|█ | 6384/61904 [2:59:13<21:33:52, 1.40s/it] 10%|█ | 6385/61904 [2:59:14<21:23:45, 1.39s/it] 10%|█ | 6386/61904 [2:59:15<21:14:09, 1.38s/it] 10%|█ | 6387/61904 [2:59:17<20:47:23, 1.35s/it] 10%|█ | 6388/61904 [2:59:18<21:35:24, 1.40s/it] 10%|█ | 6389/61904 [2:59:20<21:49:52, 1.42s/it] 10%|█ | 6390/61904 [2:59:21<21:09:06, 1.37s/it] 10%|█ | 6391/61904 [2:59:22<21:32:09, 1.40s/it] 10%|█ | 6392/61904 [2:59:24<21:25:56, 1.39s/it] 10%|█ | 6393/61904 [2:59:25<20:46:48, 1.35s/it] 10%|█ | 6394/61904 [2:59:27<21:05:22, 1.37s/it] 10%|█ | 6395/61904 [2:59:28<20:47:13, 1.35s/it] 10%|█ | 6396/61904 [2:59:29<21:14:32, 1.38s/it] 10%|█ | 6397/61904 [2:59:31<21:47:56, 1.41s/it] 10%|█ | 6398/61904 [2:59:32<22:33:09, 1.46s/it] 10%|█ | 6399/61904 [2:59:34<23:14:31, 1.51s/it] 10%|█ | 6400/61904 [2:59:35<21:59:50, 1.43s/it] {'loss': 2.9314, 'learning_rate': 1.8995202904187734e-07, 'epoch': 1.65} 10%|█ | 6400/61904 [2:59:35<21:59:50, 1.43s/it] 10%|█ | 6401/61904 [2:59:37<23:25:50, 1.52s/it] 10%|█ | 6402/61904 [2:59:38<22:31:47, 1.46s/it] 10%|█ | 6403/61904 [2:59:40<22:37:04, 1.47s/it] 10%|█ | 6404/61904 [2:59:41<22:25:25, 1.45s/it] 10%|█ | 6405/61904 [2:59:43<22:03:06, 1.43s/it] 10%|█ | 6406/61904 [2:59:44<21:55:02, 1.42s/it] 10%|█ | 6407/61904 [2:59:45<21:44:45, 1.41s/it] 10%|█ | 6408/61904 [2:59:47<21:35:56, 1.40s/it] 10%|█ | 6409/61904 [2:59:48<21:31:58, 1.40s/it] 10%|█ | 6410/61904 [2:59:49<21:25:55, 1.39s/it] 
10%|█ | 6411/61904 [2:59:51<22:16:21, 1.44s/it] 10%|█ | 6412/61904 [2:59:52<21:23:05, 1.39s/it] 10%|█ | 6413/61904 [2:59:54<21:13:43, 1.38s/it] 10%|█ | 6414/61904 [2:59:55<21:52:36, 1.42s/it] 10%|█ | 6415/61904 [2:59:56<21:06:46, 1.37s/it] 10%|█ | 6416/61904 [2:59:58<21:09:24, 1.37s/it] 10%|█ | 6417/61904 [2:59:59<21:08:43, 1.37s/it] 10%|█ | 6418/61904 [3:00:01<21:18:52, 1.38s/it] 10%|█ | 6419/61904 [3:00:02<21:21:27, 1.39s/it] 10%|█ | 6420/61904 [3:00:03<20:41:14, 1.34s/it] {'loss': 2.9729, 'learning_rate': 1.8991961623233503e-07, 'epoch': 1.66} 10%|█ | 6420/61904 [3:00:03<20:41:14, 1.34s/it] 10%|█ | 6421/61904 [3:00:05<20:38:18, 1.34s/it] 10%|█ | 6422/61904 [3:00:06<22:02:41, 1.43s/it] 10%|█ | 6423/61904 [3:00:08<21:57:42, 1.43s/it] 10%|█ | 6424/61904 [3:00:09<21:38:33, 1.40s/it] 10%|█ | 6425/61904 [3:00:10<20:48:41, 1.35s/it] 10%|█ | 6426/61904 [3:00:12<20:54:15, 1.36s/it] 10%|█ | 6427/61904 [3:00:13<21:25:03, 1.39s/it] 10%|█ | 6428/61904 [3:00:14<21:21:43, 1.39s/it] 10%|█ | 6429/61904 [3:00:16<22:21:09, 1.45s/it] 10%|█ | 6430/61904 [3:00:17<21:42:02, 1.41s/it] 10%|█ | 6431/61904 [3:00:19<21:50:06, 1.42s/it] 10%|█ | 6432/61904 [3:00:20<22:10:18, 1.44s/it] 10%|█ | 6433/61904 [3:00:22<22:22:34, 1.45s/it] 10%|█ | 6434/61904 [3:00:23<22:04:31, 1.43s/it] 10%|█ | 6435/61904 [3:00:25<22:11:22, 1.44s/it] 10%|█ | 6436/61904 [3:00:26<21:30:52, 1.40s/it] 10%|█ | 6437/61904 [3:00:27<21:35:43, 1.40s/it] 10%|█ | 6438/61904 [3:00:29<21:50:07, 1.42s/it] 10%|█ | 6439/61904 [3:00:30<21:25:01, 1.39s/it] 10%|█ | 6440/61904 [3:00:31<21:35:14, 1.40s/it] {'loss': 2.9336, 'learning_rate': 1.8988720342279266e-07, 'epoch': 1.66} 10%|█ | 6440/61904 [3:00:31<21:35:14, 1.40s/it] 10%|█ | 6441/61904 [3:00:33<21:43:44, 1.41s/it] 10%|█ | 6442/61904 [3:00:34<21:20:29, 1.39s/it] 10%|█ | 6443/61904 [3:00:36<21:45:50, 1.41s/it] 10%|█ | 6444/61904 [3:00:37<21:45:29, 1.41s/it] 10%|█ | 6445/61904 [3:00:38<21:23:55, 1.39s/it] 10%|█ | 6446/61904 [3:00:40<21:53:33, 1.42s/it] 10%|█ | 6447/61904 
[3:00:41<21:31:55, 1.40s/it] 10%|█ | 6448/61904 [3:00:43<21:00:01, 1.36s/it] 10%|█ | 6449/61904 [3:00:44<21:03:11, 1.37s/it] 10%|█ | 6450/61904 [3:00:45<21:26:23, 1.39s/it] 10%|█ | 6451/61904 [3:00:47<21:29:53, 1.40s/it] 10%|█ | 6452/61904 [3:00:48<21:30:38, 1.40s/it] 10%|█ | 6453/61904 [3:00:50<22:28:26, 1.46s/it] 10%|█ | 6454/61904 [3:00:51<21:57:14, 1.43s/it] 10%|█ | 6455/61904 [3:00:52<21:25:30, 1.39s/it] 10%|█ | 6456/61904 [3:00:54<21:32:14, 1.40s/it] 10%|█ | 6457/61904 [3:00:55<22:02:46, 1.43s/it] 10%|█ | 6458/61904 [3:00:57<21:50:10, 1.42s/it] 10%|█ | 6459/61904 [3:00:58<21:54:09, 1.42s/it] 10%|█ | 6460/61904 [3:01:00<21:39:27, 1.41s/it] {'loss': 2.9242, 'learning_rate': 1.8985479061325035e-07, 'epoch': 1.67} 10%|█ | 6460/61904 [3:01:00<21:39:27, 1.41s/it] 10%|█ | 6461/61904 [3:01:01<22:44:08, 1.48s/it] 10%|█ | 6462/61904 [3:01:03<21:59:07, 1.43s/it] 10%|█ | 6463/61904 [3:01:04<22:08:47, 1.44s/it] 10%|█ | 6464/61904 [3:01:05<21:35:41, 1.40s/it] 10%|█ | 6465/61904 [3:01:07<21:26:32, 1.39s/it] 10%|█ | 6466/61904 [3:01:08<21:13:07, 1.38s/it] 10%|█ | 6467/61904 [3:01:09<20:54:00, 1.36s/it] 10%|█ | 6468/61904 [3:01:11<21:07:13, 1.37s/it] 10%|█ | 6469/61904 [3:01:12<21:36:09, 1.40s/it] 10%|█ | 6470/61904 [3:01:14<22:01:50, 1.43s/it] 10%|█ | 6471/61904 [3:01:15<22:06:36, 1.44s/it] 10%|█ | 6472/61904 [3:01:16<21:33:42, 1.40s/it] 10%|█ | 6473/61904 [3:01:18<21:35:05, 1.40s/it] 10%|█ | 6474/61904 [3:01:19<21:29:23, 1.40s/it] 10%|█ | 6475/61904 [3:01:21<21:19:37, 1.39s/it] 10%|█ | 6476/61904 [3:01:22<21:10:33, 1.38s/it] 10%|█ | 6477/61904 [3:01:23<21:03:04, 1.37s/it] 10%|█ | 6478/61904 [3:01:25<21:24:52, 1.39s/it] 10%|█ | 6479/61904 [3:01:26<21:33:02, 1.40s/it] 10%|█ | 6480/61904 [3:01:28<21:14:08, 1.38s/it] {'loss': 3.0066, 'learning_rate': 1.8982237780370804e-07, 'epoch': 1.67} 10%|█ | 6480/61904 [3:01:28<21:14:08, 1.38s/it] 10%|█ | 6481/61904 [3:01:29<21:24:18, 1.39s/it] 10%|█ | 6482/61904 [3:01:30<21:11:12, 1.38s/it] 10%|█ | 6483/61904 [3:01:31<20:29:29, 1.33s/it] 
10%|█ | 6484/61904 [3:01:33<20:11:10, 1.31s/it] 10%|█ | 6485/61904 [3:01:34<20:39:19, 1.34s/it] 10%|█ | 6486/61904 [3:01:36<20:55:46, 1.36s/it] 10%|█ | 6487/61904 [3:01:37<21:21:37, 1.39s/it] 10%|█ | 6488/61904 [3:01:38<21:44:11, 1.41s/it] 10%|█ | 6489/61904 [3:01:40<21:36:14, 1.40s/it] 10%|█ | 6490/61904 [3:01:41<21:12:16, 1.38s/it] 10%|█ | 6491/61904 [3:01:43<21:18:05, 1.38s/it] 10%|█ | 6492/61904 [3:01:44<21:26:34, 1.39s/it] 10%|█ | 6493/61904 [3:01:45<21:19:38, 1.39s/it] 10%|█ | 6494/61904 [3:01:47<21:42:10, 1.41s/it] 10%|█ | 6495/61904 [3:01:48<21:19:13, 1.39s/it] 10%|█ | 6496/61904 [3:01:50<21:13:34, 1.38s/it] 10%|█ | 6497/61904 [3:01:51<21:23:34, 1.39s/it] 10%|█ | 6498/61904 [3:01:52<20:59:12, 1.36s/it] 10%|█ | 6499/61904 [3:01:54<20:39:46, 1.34s/it] 11%|█ | 6500/61904 [3:01:55<20:22:53, 1.32s/it] {'loss': 2.9272, 'learning_rate': 1.8978996499416568e-07, 'epoch': 1.68} 11%|█ | 6500/61904 [3:01:55<20:22:53, 1.32s/it] 11%|█ | 6501/61904 [3:01:56<21:35:38, 1.40s/it] 11%|█ | 6502/61904 [3:01:58<22:04:59, 1.43s/it] 11%|█ | 6503/61904 [3:01:59<21:59:08, 1.43s/it] 11%|█ | 6504/61904 [3:02:01<21:24:07, 1.39s/it] 11%|█ | 6505/61904 [3:02:02<21:31:15, 1.40s/it] 11%|█ | 6506/61904 [3:02:03<21:10:50, 1.38s/it] 11%|█ | 6507/61904 [3:02:05<21:33:53, 1.40s/it] 11%|█ | 6508/61904 [3:02:06<20:54:40, 1.36s/it] 11%|█ | 6509/61904 [3:02:07<20:48:22, 1.35s/it] 11%|█ | 6510/61904 [3:02:09<21:02:43, 1.37s/it] 11%|█ | 6511/61904 [3:02:10<21:15:49, 1.38s/it] 11%|█ | 6512/61904 [3:02:12<21:10:31, 1.38s/it] 11%|█ | 6513/61904 [3:02:13<21:11:12, 1.38s/it] 11%|█ | 6514/61904 [3:02:15<21:53:03, 1.42s/it] 11%|█ | 6515/61904 [3:02:16<21:39:27, 1.41s/it] 11%|█ | 6516/61904 [3:02:17<21:15:50, 1.38s/it] 11%|█ | 6517/61904 [3:02:19<21:28:50, 1.40s/it] 11%|█ | 6518/61904 [3:02:20<20:42:49, 1.35s/it] 11%|█ | 6519/61904 [3:02:21<21:19:27, 1.39s/it] 11%|█ | 6520/61904 [3:02:23<21:24:39, 1.39s/it] {'loss': 2.8841, 'learning_rate': 1.8975755218462336e-07, 'epoch': 1.68} 11%|█ | 6520/61904 
[3:02:23<21:24:39, 1.39s/it] 11%|█ | 6521/61904 [3:02:24<21:33:09, 1.40s/it] 11%|█ | 6522/61904 [3:02:26<21:38:36, 1.41s/it] 11%|█ | 6523/61904 [3:02:27<21:46:11, 1.42s/it] 11%|█ | 6524/61904 [3:02:28<21:43:55, 1.41s/it] 11%|█ | 6525/61904 [3:02:30<20:56:10, 1.36s/it] 11%|█ | 6526/61904 [3:02:31<21:39:37, 1.41s/it] 11%|█ | 6527/61904 [3:02:33<22:02:13, 1.43s/it] 11%|█ | 6528/61904 [3:02:34<21:41:57, 1.41s/it] 11%|█ | 6529/61904 [3:02:35<21:23:04, 1.39s/it] 11%|█ | 6530/61904 [3:02:37<21:34:00, 1.40s/it] 11%|█ | 6531/61904 [3:02:38<21:30:12, 1.40s/it] 11%|█ | 6532/61904 [3:02:40<22:02:39, 1.43s/it] 11%|█ | 6533/61904 [3:02:41<21:35:00, 1.40s/it] 11%|█ | 6534/61904 [3:02:43<21:49:34, 1.42s/it] 11%|█ | 6535/61904 [3:02:44<21:36:14, 1.40s/it] 11%|█ | 6536/61904 [3:02:45<21:08:11, 1.37s/it] 11%|█ | 6537/61904 [3:02:46<20:43:17, 1.35s/it] 11%|█ | 6538/61904 [3:02:48<21:05:55, 1.37s/it] 11%|█ | 6539/61904 [3:02:49<20:39:12, 1.34s/it] 11%|█ | 6540/61904 [3:02:51<20:41:07, 1.35s/it] {'loss': 2.9365, 'learning_rate': 1.8972513937508103e-07, 'epoch': 1.69} 11%|█ | 6540/61904 [3:02:51<20:41:07, 1.35s/it] 11%|█ | 6541/61904 [3:02:52<21:24:55, 1.39s/it] 11%|█ | 6542/61904 [3:02:54<22:00:30, 1.43s/it] 11%|█ | 6543/61904 [3:02:55<22:09:29, 1.44s/it] 11%|█ | 6544/61904 [3:02:56<21:58:23, 1.43s/it] 11%|█ | 6545/61904 [3:02:58<21:55:42, 1.43s/it] 11%|█ | 6546/61904 [3:02:59<21:29:12, 1.40s/it] 11%|█ | 6547/61904 [3:03:01<21:27:09, 1.40s/it] 11%|█ | 6548/61904 [3:03:02<21:12:01, 1.38s/it] 11%|█ | 6549/61904 [3:03:03<20:30:46, 1.33s/it] 11%|█ | 6550/61904 [3:03:05<20:40:39, 1.34s/it] 11%|█ | 6551/61904 [3:03:06<21:24:47, 1.39s/it] 11%|█ | 6552/61904 [3:03:07<21:11:47, 1.38s/it] 11%|█ | 6553/61904 [3:03:09<21:41:25, 1.41s/it] 11%|█ | 6554/61904 [3:03:10<20:59:14, 1.37s/it] 11%|█ | 6555/61904 [3:03:11<20:33:06, 1.34s/it] 11%|█ | 6556/61904 [3:03:13<20:33:51, 1.34s/it] 11%|█ | 6557/61904 [3:03:14<20:33:22, 1.34s/it] 11%|█ | 6558/61904 [3:03:15<20:37:10, 1.34s/it] 11%|█ | 6559/61904 
[3:03:17<21:12:22, 1.38s/it] 11%|█ | 6560/61904 [3:03:18<21:14:41, 1.38s/it] {'loss': 2.9948, 'learning_rate': 1.896927265655387e-07, 'epoch': 1.7} 11%|█ | 6560/61904 [3:03:18<21:14:41, 1.38s/it] 11%|█ | 6561/61904 [3:03:20<21:11:43, 1.38s/it] 11%|█ | 6562/61904 [3:03:21<21:11:03, 1.38s/it] 11%|█ | 6563/61904 [3:03:22<20:51:17, 1.36s/it] 11%|█ | 6564/61904 [3:03:24<20:57:20, 1.36s/it] 11%|█ | 6565/61904 [3:03:25<22:14:30, 1.45s/it] 11%|█ | 6566/61904 [3:03:27<21:38:58, 1.41s/it] 11%|█ | 6567/61904 [3:03:28<21:52:21, 1.42s/it] 11%|█ | 6568/61904 [3:03:29<21:30:28, 1.40s/it] 11%|█ | 6569/61904 [3:03:31<21:23:59, 1.39s/it] 11%|█ | 6570/61904 [3:03:32<21:06:54, 1.37s/it] 11%|█ | 6571/61904 [3:03:33<20:52:33, 1.36s/it] 11%|█ | 6572/61904 [3:03:35<21:46:13, 1.42s/it] 11%|█ | 6573/61904 [3:03:36<21:40:32, 1.41s/it] 11%|█ | 6574/61904 [3:03:38<21:12:41, 1.38s/it] 11%|█ | 6575/61904 [3:03:39<21:05:42, 1.37s/it] 11%|█ | 6576/61904 [3:03:40<20:41:38, 1.35s/it] 11%|█ | 6577/61904 [3:03:42<21:26:55, 1.40s/it] 11%|█ | 6578/61904 [3:03:43<20:54:44, 1.36s/it] 11%|█ | 6579/61904 [3:03:45<20:52:33, 1.36s/it] 11%|█ | 6580/61904 [3:03:46<20:54:53, 1.36s/it] {'loss': 2.9, 'learning_rate': 1.8966031375599638e-07, 'epoch': 1.7} 11%|█ | 6580/61904 [3:03:46<20:54:53, 1.36s/it] 11%|█ | 6581/61904 [3:03:47<20:43:31, 1.35s/it] 11%|█ | 6582/61904 [3:03:49<21:31:29, 1.40s/it] 11%|█ | 6583/61904 [3:03:50<21:38:46, 1.41s/it] 11%|█ | 6584/61904 [3:03:52<21:57:32, 1.43s/it] 11%|█ | 6585/61904 [3:03:53<21:23:00, 1.39s/it] 11%|█ | 6586/61904 [3:03:54<21:18:20, 1.39s/it] 11%|█ | 6587/61904 [3:03:56<21:24:07, 1.39s/it] 11%|█ | 6588/61904 [3:03:57<20:59:33, 1.37s/it] 11%|█ | 6589/61904 [3:03:58<21:29:06, 1.40s/it] 11%|█ | 6590/61904 [3:04:00<21:14:24, 1.38s/it] 11%|█ | 6591/61904 [3:04:01<21:24:00, 1.39s/it] 11%|█ | 6592/61904 [3:04:03<21:15:36, 1.38s/it] 11%|█ | 6593/61904 [3:04:04<21:23:04, 1.39s/it] 11%|█ | 6594/61904 [3:04:05<21:31:49, 1.40s/it] 11%|█ | 6595/61904 [3:04:07<21:34:58, 1.40s/it] 11%|█ 
| 6596/61904 [3:04:08<20:49:24, 1.36s/it] 11%|█ | 6597/61904 [3:04:09<20:57:43, 1.36s/it] 11%|█ | 6598/61904 [3:04:11<20:47:39, 1.35s/it] 11%|█ | 6599/61904 [3:04:12<20:46:15, 1.35s/it] 11%|█ | 6600/61904 [3:04:14<20:55:20, 1.36s/it] {'loss': 2.9318, 'learning_rate': 1.8962790094645404e-07, 'epoch': 1.71} 11%|█ | 6600/61904 [3:04:14<20:55:20, 1.36s/it] 11%|█ | 6601/61904 [3:04:15<20:22:30, 1.33s/it] 11%|█ | 6602/61904 [3:04:16<20:57:05, 1.36s/it] 11%|█ | 6603/61904 [3:04:18<21:00:08, 1.37s/it] 11%|█ | 6604/61904 [3:04:19<20:41:56, 1.35s/it] 11%|█ | 6605/61904 [3:04:20<20:21:38, 1.33s/it] 11%|█ | 6606/61904 [3:04:21<20:07:50, 1.31s/it] 11%|█ | 6607/61904 [3:04:23<20:15:50, 1.32s/it] 11%|█ | 6608/61904 [3:04:24<20:06:54, 1.31s/it] 11%|█ | 6609/61904 [3:04:25<19:55:07, 1.30s/it] 11%|█ | 6610/61904 [3:04:27<20:06:28, 1.31s/it] 11%|█ | 6611/61904 [3:04:28<19:42:32, 1.28s/it] 11%|█ | 6612/61904 [3:04:29<19:42:37, 1.28s/it] 11%|█ | 6613/61904 [3:04:31<20:00:52, 1.30s/it] 11%|█ | 6614/61904 [3:04:32<20:11:29, 1.31s/it] 11%|█ | 6615/61904 [3:04:33<20:42:53, 1.35s/it] 11%|█ | 6616/61904 [3:04:35<20:23:10, 1.33s/it] 11%|█ | 6617/61904 [3:04:36<21:00:51, 1.37s/it] 11%|█ | 6618/61904 [3:04:38<21:37:24, 1.41s/it] 11%|█ | 6619/61904 [3:04:39<21:47:19, 1.42s/it] 11%|█ | 6620/61904 [3:04:40<21:07:10, 1.38s/it] {'loss': 2.8761, 'learning_rate': 1.895954881369117e-07, 'epoch': 1.71} 11%|█ | 6620/61904 [3:04:40<21:07:10, 1.38s/it] 11%|█ | 6621/61904 [3:04:42<20:55:46, 1.36s/it] 11%|█ | 6622/61904 [3:04:43<20:55:41, 1.36s/it] 11%|█ | 6623/61904 [3:04:44<20:29:58, 1.33s/it] 11%|█ | 6624/61904 [3:04:46<20:24:14, 1.33s/it] 11%|█ | 6625/61904 [3:04:47<20:21:35, 1.33s/it] 11%|█ | 6626/61904 [3:04:48<20:44:36, 1.35s/it] 11%|█ | 6627/61904 [3:04:50<20:15:25, 1.32s/it] 11%|█ | 6628/61904 [3:04:51<21:04:18, 1.37s/it] 11%|█ | 6629/61904 [3:04:53<21:56:14, 1.43s/it] 11%|█ | 6630/61904 [3:04:54<22:18:30, 1.45s/it] 11%|█ | 6631/61904 [3:04:56<22:15:49, 1.45s/it] 11%|█ | 6632/61904 
[3:04:57<21:51:51, 1.42s/it] 11%|█ | 6633/61904 [3:04:58<21:19:17, 1.39s/it] 11%|█ | 6634/61904 [3:05:00<20:51:40, 1.36s/it] 11%|█ | 6635/61904 [3:05:01<20:55:26, 1.36s/it] 11%|█ | 6636/61904 [3:05:02<21:14:06, 1.38s/it] 11%|█ | 6637/61904 [3:05:04<20:54:42, 1.36s/it] 11%|█ | 6638/61904 [3:05:05<20:46:34, 1.35s/it] 11%|█ | 6639/61904 [3:05:06<21:11:05, 1.38s/it] 11%|█ | 6640/61904 [3:05:08<20:27:57, 1.33s/it] {'loss': 2.9628, 'learning_rate': 1.895630753273694e-07, 'epoch': 1.72} 11%|█ | 6640/61904 [3:05:08<20:27:57, 1.33s/it] 11%|█ | 6641/61904 [3:05:09<20:50:22, 1.36s/it] 11%|█ | 6642/61904 [3:05:10<20:23:12, 1.33s/it] 11%|█ | 6643/61904 [3:05:12<21:21:45, 1.39s/it] 11%|█ | 6644/61904 [3:05:13<20:37:44, 1.34s/it] 11%|█ | 6645/61904 [3:05:14<20:47:51, 1.35s/it] 11%|█ | 6646/61904 [3:05:16<20:40:44, 1.35s/it] 11%|█ | 6647/61904 [3:05:17<21:39:40, 1.41s/it] 11%|█ | 6648/61904 [3:05:19<21:36:44, 1.41s/it] 11%|█ | 6649/61904 [3:05:20<21:15:22, 1.38s/it] 11%|█ | 6650/61904 [3:05:21<21:10:02, 1.38s/it] 11%|█ | 6651/61904 [3:05:23<21:13:04, 1.38s/it] 11%|█ | 6652/61904 [3:05:24<21:31:38, 1.40s/it] 11%|█ | 6653/61904 [3:05:26<22:08:28, 1.44s/it] 11%|█ | 6654/61904 [3:05:27<20:59:52, 1.37s/it] 11%|█ | 6655/61904 [3:05:28<20:31:40, 1.34s/it] 11%|█ | 6656/61904 [3:05:30<20:39:15, 1.35s/it] 11%|█ | 6657/61904 [3:05:31<20:34:24, 1.34s/it] 11%|█ | 6658/61904 [3:05:32<20:43:17, 1.35s/it] 11%|█ | 6659/61904 [3:05:34<21:16:26, 1.39s/it] 11%|█ | 6660/61904 [3:05:35<20:50:49, 1.36s/it] {'loss': 2.9114, 'learning_rate': 1.8953066251782702e-07, 'epoch': 1.72} 11%|█ | 6660/61904 [3:05:35<20:50:49, 1.36s/it] 11%|█ | 6661/61904 [3:05:37<21:14:49, 1.38s/it] 11%|█ | 6662/61904 [3:05:38<21:03:58, 1.37s/it] 11%|█ | 6663/61904 [3:05:39<21:13:24, 1.38s/it] 11%|█ | 6664/61904 [3:05:41<21:29:22, 1.40s/it] 11%|█ | 6665/61904 [3:05:42<20:51:26, 1.36s/it] 11%|█ | 6666/61904 [3:05:43<21:13:34, 1.38s/it] 11%|█ | 6667/61904 [3:05:45<20:52:04, 1.36s/it] 11%|█ | 6668/61904 [3:05:46<20:42:47, 1.35s/it] 
11%|█ | 6669/61904 [3:05:47<20:42:37, 1.35s/it] 11%|█ | 6670/61904 [3:05:49<20:34:36, 1.34s/it] 11%|█ | 6671/61904 [3:05:50<21:12:32, 1.38s/it] 11%|█ | 6672/61904 [3:05:52<20:50:52, 1.36s/it] 11%|█ | 6673/61904 [3:05:53<20:24:51, 1.33s/it] 11%|█ | 6674/61904 [3:05:54<21:07:46, 1.38s/it] 11%|█ | 6675/61904 [3:05:56<21:02:19, 1.37s/it] 11%|█ | 6676/61904 [3:05:57<21:07:31, 1.38s/it] 11%|█ | 6677/61904 [3:05:58<21:05:45, 1.38s/it] 11%|█ | 6678/61904 [3:06:00<21:31:44, 1.40s/it] 11%|█ | 6679/61904 [3:06:01<21:42:44, 1.42s/it] 11%|█ | 6680/61904 [3:06:03<21:25:01, 1.40s/it] {'loss': 2.9338, 'learning_rate': 1.894982497082847e-07, 'epoch': 1.73} 11%|█ | 6680/61904 [3:06:03<21:25:01, 1.40s/it] 11%|█ | 6681/61904 [3:06:04<21:01:51, 1.37s/it] 11%|█ | 6682/61904 [3:06:05<20:58:25, 1.37s/it] 11%|█ | 6683/61904 [3:06:07<20:25:57, 1.33s/it] 11%|█ | 6684/61904 [3:06:08<20:50:16, 1.36s/it] 11%|█ | 6685/61904 [3:06:09<20:40:33, 1.35s/it] 11%|█ | 6686/61904 [3:06:11<20:34:59, 1.34s/it] 11%|█ | 6687/61904 [3:06:12<20:47:44, 1.36s/it] 11%|█ | 6688/61904 [3:06:13<20:45:54, 1.35s/it] 11%|█ | 6689/61904 [3:06:15<20:30:01, 1.34s/it] 11%|█ | 6690/61904 [3:06:16<20:49:50, 1.36s/it] 11%|█ | 6691/61904 [3:06:18<21:15:43, 1.39s/it] 11%|█ | 6692/61904 [3:06:19<22:03:14, 1.44s/it] 11%|█ | 6693/61904 [3:06:20<21:42:18, 1.42s/it] 11%|█ | 6694/61904 [3:06:22<21:16:19, 1.39s/it] 11%|█ | 6695/61904 [3:06:23<21:25:50, 1.40s/it] 11%|█ | 6696/61904 [3:06:25<20:58:15, 1.37s/it] 11%|█ | 6697/61904 [3:06:26<21:43:29, 1.42s/it] 11%|█ | 6698/61904 [3:06:27<21:04:34, 1.37s/it] 11%|█ | 6699/61904 [3:06:29<20:25:41, 1.33s/it] 11%|█ | 6700/61904 [3:06:30<20:12:30, 1.32s/it] {'loss': 2.9956, 'learning_rate': 1.8946583689874237e-07, 'epoch': 1.73} 11%|█ | 6700/61904 [3:06:30<20:12:30, 1.32s/it] 11%|█ | 6701/61904 [3:06:31<20:46:50, 1.36s/it] 11%|█ | 6702/61904 [3:06:33<20:46:55, 1.36s/it] 11%|█ | 6703/61904 [3:06:34<21:16:57, 1.39s/it] 11%|█ | 6704/61904 [3:06:35<21:05:53, 1.38s/it] 11%|█ | 6705/61904 
[3:06:37<21:05:23, 1.38s/it] 11%|█ | 6706/61904 [3:06:38<21:12:56, 1.38s/it] 11%|█ | 6707/61904 [3:06:40<21:07:34, 1.38s/it] 11%|█ | 6708/61904 [3:06:41<21:10:35, 1.38s/it] 11%|█ | 6709/61904 [3:06:42<21:01:08, 1.37s/it] 11%|█ | 6710/61904 [3:06:44<21:06:41, 1.38s/it] 11%|█ | 6711/61904 [3:06:45<21:25:55, 1.40s/it] 11%|█ | 6712/61904 [3:06:47<21:14:52, 1.39s/it] 11%|█ | 6713/61904 [3:06:48<20:26:43, 1.33s/it] 11%|█ | 6714/61904 [3:06:49<20:30:57, 1.34s/it] 11%|█ | 6715/61904 [3:06:50<19:59:48, 1.30s/it] 11%|█ | 6716/61904 [3:06:52<20:58:16, 1.37s/it] 11%|█ | 6717/61904 [3:06:53<20:15:38, 1.32s/it] 11%|█ | 6718/61904 [3:06:54<19:50:23, 1.29s/it] 11%|█ | 6719/61904 [3:06:56<20:11:35, 1.32s/it] 11%|█ | 6720/61904 [3:06:57<20:30:27, 1.34s/it] {'loss': 2.9089, 'learning_rate': 1.8943342408920004e-07, 'epoch': 1.74} 11%|█ | 6720/61904 [3:06:57<20:30:27, 1.34s/it] 11%|█ | 6721/61904 [3:06:58<20:07:09, 1.31s/it] 11%|█ | 6722/61904 [3:07:00<20:17:36, 1.32s/it] 11%|█ | 6723/61904 [3:07:01<19:41:45, 1.28s/it] 11%|█ | 6724/61904 [3:07:02<19:39:30, 1.28s/it] 11%|█ | 6725/61904 [3:07:03<19:43:16, 1.29s/it] 11%|█ | 6726/61904 [3:07:05<20:16:38, 1.32s/it] 11%|█ | 6727/61904 [3:07:06<20:06:30, 1.31s/it] 11%|█ | 6728/61904 [3:07:07<20:22:12, 1.33s/it] 11%|█ | 6729/61904 [3:07:09<20:41:20, 1.35s/it] 11%|█ | 6730/61904 [3:07:10<20:06:08, 1.31s/it] 11%|█ | 6731/61904 [3:07:11<20:29:34, 1.34s/it] 11%|█ | 6732/61904 [3:07:13<19:59:41, 1.30s/it] 11%|█ | 6733/61904 [3:07:14<20:13:28, 1.32s/it] 11%|█ | 6734/61904 [3:07:15<20:24:11, 1.33s/it] 11%|█ | 6735/61904 [3:07:17<20:28:53, 1.34s/it] 11%|█ | 6736/61904 [3:07:18<20:18:20, 1.33s/it] 11%|█ | 6737/61904 [3:07:19<20:34:58, 1.34s/it] 11%|█ | 6738/61904 [3:07:21<21:10:05, 1.38s/it] 11%|█ | 6739/61904 [3:07:22<20:51:55, 1.36s/it] 11%|█ | 6740/61904 [3:07:24<21:23:07, 1.40s/it] {'loss': 2.9213, 'learning_rate': 1.8940101127965772e-07, 'epoch': 1.74} 11%|█ | 6740/61904 [3:07:24<21:23:07, 1.40s/it] 11%|█ | 6741/61904 [3:07:25<21:02:01, 1.37s/it] 
11%|█ | 6742/61904 [3:07:26<21:22:16, 1.39s/it] 11%|█ | 6743/61904 [3:07:28<21:23:41, 1.40s/it] 11%|█ | 6744/61904 [3:07:29<20:38:27, 1.35s/it] 11%|█ | 6745/61904 [3:07:31<21:02:30, 1.37s/it] 11%|█ | 6746/61904 [3:07:32<21:01:31, 1.37s/it] 11%|█ | 6747/61904 [3:07:33<21:28:06, 1.40s/it] 11%|█ | 6748/61904 [3:07:35<21:24:48, 1.40s/it] 11%|█ | 6749/61904 [3:07:36<21:21:15, 1.39s/it] 11%|█ | 6750/61904 [3:07:38<21:09:27, 1.38s/it] 11%|█ | 6751/61904 [3:07:39<20:32:08, 1.34s/it] 11%|█ | 6752/61904 [3:07:40<20:43:01, 1.35s/it] 11%|█ | 6753/61904 [3:07:41<20:39:33, 1.35s/it] 11%|█ | 6754/61904 [3:07:43<21:01:26, 1.37s/it] 11%|█ | 6755/61904 [3:07:44<20:56:31, 1.37s/it] 11%|█ | 6756/61904 [3:07:46<20:32:25, 1.34s/it] 11%|█ | 6757/61904 [3:07:47<20:03:32, 1.31s/it] 11%|█ | 6758/61904 [3:07:48<20:17:28, 1.32s/it] 11%|█ | 6759/61904 [3:07:49<20:13:37, 1.32s/it] 11%|█ | 6760/61904 [3:07:51<20:03:58, 1.31s/it] {'loss': 2.9461, 'learning_rate': 1.8936859847011539e-07, 'epoch': 1.75} 11%|█ | 6760/61904 [3:07:51<20:03:58, 1.31s/it] 11%|█ | 6761/61904 [3:07:52<20:20:32, 1.33s/it] 11%|█ | 6762/61904 [3:07:53<19:47:34, 1.29s/it] 11%|█ | 6763/61904 [3:07:55<19:48:38, 1.29s/it] 11%|█ | 6764/61904 [3:07:56<19:48:48, 1.29s/it] 11%|█ | 6765/61904 [3:07:57<19:59:10, 1.30s/it] 11%|█ | 6766/61904 [3:07:59<20:33:29, 1.34s/it] 11%|█ | 6767/61904 [3:08:00<19:59:03, 1.30s/it] 11%|█ | 6768/61904 [3:08:01<20:03:34, 1.31s/it] 11%|█ | 6769/61904 [3:08:03<20:18:53, 1.33s/it] 11%|█ | 6770/61904 [3:08:04<20:52:54, 1.36s/it] 11%|█ | 6771/61904 [3:08:05<20:41:53, 1.35s/it] 11%|█ | 6772/61904 [3:08:07<21:34:20, 1.41s/it] 11%|█ | 6773/61904 [3:08:08<21:06:07, 1.38s/it] 11%|█ | 6774/61904 [3:08:10<21:32:21, 1.41s/it] 11%|█ | 6775/61904 [3:08:11<21:21:08, 1.39s/it] 11%|█ | 6776/61904 [3:08:12<20:55:34, 1.37s/it] 11%|█ | 6777/61904 [3:08:14<21:00:36, 1.37s/it] 11%|█ | 6778/61904 [3:08:15<20:52:36, 1.36s/it] 11%|█ | 6779/61904 [3:08:16<20:29:01, 1.34s/it] 11%|█ | 6780/61904 [3:08:18<20:35:44, 1.35s/it] 
{'loss': 2.9103, 'learning_rate': 1.8933618566057305e-07, 'epoch': 1.75} 11%|█ | 6780/61904 [3:08:18<20:35:44, 1.35s/it] 11%|█ | 6781/61904 [3:08:19<20:27:48, 1.34s/it] 11%|█ | 6782/61904 [3:08:20<20:20:46, 1.33s/it] 11%|█ | 6783/61904 [3:08:22<20:32:59, 1.34s/it] 11%|█ | 6784/61904 [3:08:23<20:10:44, 1.32s/it] 11%|█ | 6785/61904 [3:08:24<20:01:50, 1.31s/it] 11%|█ | 6786/61904 [3:08:26<19:59:55, 1.31s/it] 11%|█ | 6787/61904 [3:08:27<20:28:50, 1.34s/it] 11%|█ | 6788/61904 [3:08:28<20:35:21, 1.34s/it] 11%|█ | 6789/61904 [3:08:30<20:37:33, 1.35s/it] 11%|█ | 6790/61904 [3:08:31<20:47:50, 1.36s/it] 11%|█ | 6791/61904 [3:08:32<20:35:55, 1.35s/it] 11%|█ | 6792/61904 [3:08:34<20:30:26, 1.34s/it] 11%|█ | 6793/61904 [3:08:35<20:18:15, 1.33s/it] 11%|█ | 6794/61904 [3:08:36<20:42:59, 1.35s/it] 11%|█ | 6795/61904 [3:08:38<20:21:49, 1.33s/it] 11%|█ | 6796/61904 [3:08:39<20:58:06, 1.37s/it] 11%|█ | 6797/61904 [3:08:40<20:29:49, 1.34s/it] 11%|█ | 6798/61904 [3:08:42<20:40:47, 1.35s/it] 11%|█ | 6799/61904 [3:08:43<20:20:36, 1.33s/it] 11%|█ | 6800/61904 [3:08:44<20:42:08, 1.35s/it] {'loss': 2.9025, 'learning_rate': 1.8930377285103074e-07, 'epoch': 1.76} 11%|█ | 6800/61904 [3:08:44<20:42:08, 1.35s/it] 11%|█ | 6801/61904 [3:08:46<21:00:00, 1.37s/it] 11%|█ | 6802/61904 [3:08:47<21:06:59, 1.38s/it] 11%|█ | 6803/61904 [3:08:49<22:04:38, 1.44s/it] 11%|█ | 6804/61904 [3:08:50<21:30:00, 1.40s/it] 11%|█ | 6805/61904 [3:08:51<20:51:54, 1.36s/it] 11%|█ | 6806/61904 [3:08:53<20:57:41, 1.37s/it] 11%|█ | 6807/61904 [3:08:54<20:08:44, 1.32s/it] 11%|█ | 6808/61904 [3:08:55<20:04:02, 1.31s/it] 11%|█ | 6809/61904 [3:08:57<20:02:32, 1.31s/it] 11%|█ | 6810/61904 [3:08:58<20:03:27, 1.31s/it] 11%|█ | 6811/61904 [3:08:59<19:48:01, 1.29s/it] 11%|█ | 6812/61904 [3:09:01<20:35:21, 1.35s/it] 11%|█ | 6813/61904 [3:09:02<21:44:01, 1.42s/it] 11%|█ | 6814/61904 [3:09:04<21:44:17, 1.42s/it] 11%|█ | 6815/61904 [3:09:05<21:41:10, 1.42s/it] 11%|█ | 6816/61904 [3:09:06<21:04:34, 1.38s/it] 11%|█ | 6817/61904 
[3:09:08<20:21:30, 1.33s/it] 11%|█ | 6818/61904 [3:09:09<20:11:56, 1.32s/it] 11%|█ | 6819/61904 [3:09:10<20:07:19, 1.32s/it] 11%|█ | 6820/61904 [3:09:12<20:05:10, 1.31s/it] {'loss': 2.9723, 'learning_rate': 1.8927136004148837e-07, 'epoch': 1.76} 11%|█ | 6820/61904 [3:09:12<20:05:10, 1.31s/it] 11%|█ | 6821/61904 [3:09:13<20:28:19, 1.34s/it] 11%|█ | 6822/61904 [3:09:14<20:56:27, 1.37s/it] 11%|█ | 6823/61904 [3:09:16<21:13:11, 1.39s/it] 11%|█ | 6824/61904 [3:09:17<21:52:12, 1.43s/it] 11%|█ | 6825/61904 [3:09:19<21:07:51, 1.38s/it] 11%|█ | 6826/61904 [3:09:20<20:40:13, 1.35s/it] 11%|█ | 6827/61904 [3:09:21<20:33:41, 1.34s/it] 11%|█ | 6828/61904 [3:09:23<20:30:05, 1.34s/it] 11%|█ | 6829/61904 [3:09:24<20:22:13, 1.33s/it] 11%|█ | 6830/61904 [3:09:25<20:40:59, 1.35s/it] 11%|█ | 6831/61904 [3:09:27<20:33:52, 1.34s/it] 11%|█ | 6832/61904 [3:09:28<20:56:49, 1.37s/it] 11%|█ | 6833/61904 [3:09:29<21:20:32, 1.40s/it] 11%|█ | 6834/61904 [3:09:31<22:09:46, 1.45s/it] 11%|█ | 6835/61904 [3:09:32<21:53:17, 1.43s/it] 11%|█ | 6836/61904 [3:09:34<21:25:52, 1.40s/it] 11%|█ | 6837/61904 [3:09:35<21:27:54, 1.40s/it] 11%|█ | 6838/61904 [3:09:36<20:52:00, 1.36s/it] 11%|█ | 6839/61904 [3:09:38<20:34:27, 1.35s/it] 11%|█ | 6840/61904 [3:09:39<20:19:16, 1.33s/it] {'loss': 2.9768, 'learning_rate': 1.8923894723194606e-07, 'epoch': 1.77} 11%|█ | 6840/61904 [3:09:39<20:19:16, 1.33s/it] 11%|█ | 6841/61904 [3:09:40<20:00:24, 1.31s/it] 11%|█ | 6842/61904 [3:09:42<19:45:51, 1.29s/it] 11%|█ | 6843/61904 [3:09:43<19:57:28, 1.30s/it] 11%|█ | 6844/61904 [3:09:44<20:00:17, 1.31s/it] 11%|█ | 6845/61904 [3:09:45<19:58:17, 1.31s/it] 11%|█ | 6846/61904 [3:09:47<20:05:32, 1.31s/it] 11%|█ | 6847/61904 [3:09:48<20:24:38, 1.33s/it] 11%|█ | 6848/61904 [3:09:49<19:30:15, 1.28s/it] 11%|█ | 6849/61904 [3:09:51<19:17:13, 1.26s/it] 11%|█ | 6850/61904 [3:09:52<19:59:30, 1.31s/it] 11%|█ | 6851/61904 [3:09:53<20:15:09, 1.32s/it] 11%|█ | 6852/61904 [3:09:55<20:22:26, 1.33s/it] 11%|█ | 6853/61904 [3:09:56<20:20:26, 1.33s/it] 
11%|█ | 6854/61904 [3:09:57<20:08:47, 1.32s/it] 11%|█ | 6855/61904 [3:09:59<20:54:33, 1.37s/it] 11%|█ | 6856/61904 [3:10:00<20:58:53, 1.37s/it] 11%|█ | 6857/61904 [3:10:02<21:04:37, 1.38s/it] 11%|█ | 6858/61904 [3:10:03<20:59:37, 1.37s/it] 11%|█ | 6859/61904 [3:10:04<20:35:43, 1.35s/it] 11%|█ | 6860/61904 [3:10:06<22:00:00, 1.44s/it] {'loss': 2.8984, 'learning_rate': 1.8920653442240375e-07, 'epoch': 1.77} 11%|█ | 6860/61904 [3:10:06<22:00:00, 1.44s/it] 11%|█ | 6861/61904 [3:10:07<21:37:22, 1.41s/it] 11%|█ | 6862/61904 [3:10:09<21:43:40, 1.42s/it] 11%|█ | 6863/61904 [3:10:10<21:04:00, 1.38s/it] 11%|█ | 6864/61904 [3:10:11<20:48:37, 1.36s/it] 11%|█ | 6865/61904 [3:10:13<20:24:12, 1.33s/it] 11%|█ | 6866/61904 [3:10:14<20:50:30, 1.36s/it] 11%|█ | 6867/61904 [3:10:15<20:43:59, 1.36s/it] 11%|█ | 6868/61904 [3:10:17<20:37:21, 1.35s/it] 11%|█ | 6869/61904 [3:10:18<20:28:50, 1.34s/it] 11%|█ | 6870/61904 [3:10:19<21:27:23, 1.40s/it] 11%|█ | 6871/61904 [3:10:21<21:13:20, 1.39s/it] 11%|█ | 6872/61904 [3:10:22<20:44:11, 1.36s/it] 11%|█ | 6873/61904 [3:10:23<20:47:49, 1.36s/it] 11%|█ | 6874/61904 [3:10:25<20:53:23, 1.37s/it] 11%|█ | 6875/61904 [3:10:26<20:43:15, 1.36s/it] 11%|█ | 6876/61904 [3:10:27<20:16:25, 1.33s/it] 11%|█ | 6877/61904 [3:10:29<21:02:54, 1.38s/it] 11%|█ | 6878/61904 [3:10:30<21:08:13, 1.38s/it] 11%|█ | 6879/61904 [3:10:32<20:57:43, 1.37s/it] 11%|█ | 6880/61904 [3:10:33<21:10:43, 1.39s/it] {'loss': 2.9167, 'learning_rate': 1.8917412161286138e-07, 'epoch': 1.78} 11%|█ | 6880/61904 [3:10:33<21:10:43, 1.39s/it] 11%|█ | 6881/61904 [3:10:35<21:16:38, 1.39s/it] 11%|█ | 6882/61904 [3:10:36<21:22:17, 1.40s/it] 11%|█ | 6883/61904 [3:10:38<22:07:33, 1.45s/it] 11%|█ | 6884/61904 [3:10:39<21:25:52, 1.40s/it] 11%|█ | 6885/61904 [3:10:40<20:56:39, 1.37s/it] 11%|█ | 6886/61904 [3:10:42<21:14:18, 1.39s/it] 11%|█ | 6887/61904 [3:10:43<21:17:42, 1.39s/it] 11%|█ | 6888/61904 [3:10:44<21:28:43, 1.41s/it] 11%|█ | 6889/61904 [3:10:46<22:23:49, 1.47s/it] 11%|█ | 6890/61904 
[3:10:47<22:08:49, 1.45s/it] 11%|█ | 6891/61904 [3:10:49<22:01:44, 1.44s/it] 11%|█ | 6892/61904 [3:10:50<21:38:08, 1.42s/it] 11%|█ | 6893/61904 [3:10:52<21:39:42, 1.42s/it] 11%|█ | 6894/61904 [3:10:53<21:14:40, 1.39s/it] 11%|█ | 6895/61904 [3:10:54<21:46:46, 1.43s/it] 11%|█ | 6896/61904 [3:10:56<21:10:33, 1.39s/it] 11%|█ | 6897/61904 [3:10:57<20:44:53, 1.36s/it] 11%|█ | 6898/61904 [3:10:58<20:38:26, 1.35s/it] 11%|█ | 6899/61904 [3:11:00<19:58:04, 1.31s/it] 11%|█ | 6900/61904 [3:11:01<20:06:21, 1.32s/it] {'loss': 2.9561, 'learning_rate': 1.8914170880331907e-07, 'epoch': 1.78} 11%|█ | 6900/61904 [3:11:01<20:06:21, 1.32s/it] 11%|█ | 6901/61904 [3:11:02<20:03:27, 1.31s/it] 11%|█ | 6902/61904 [3:11:04<20:42:43, 1.36s/it] 11%|█ | 6903/61904 [3:11:05<20:46:22, 1.36s/it] 11%|█ | 6904/61904 [3:11:07<21:46:29, 1.43s/it] 11%|█ | 6905/61904 [3:11:08<20:58:54, 1.37s/it] 11%|█ | 6906/61904 [3:11:09<20:31:55, 1.34s/it] 11%|█ | 6907/61904 [3:11:11<21:27:39, 1.40s/it] 11%|█ | 6908/61904 [3:11:12<20:44:19, 1.36s/it] 11%|█ | 6909/61904 [3:11:13<20:41:42, 1.35s/it] 11%|█ | 6910/61904 [3:11:15<20:25:37, 1.34s/it] 11%|█ | 6911/61904 [3:11:16<20:25:03, 1.34s/it] 11%|█ | 6912/61904 [3:11:17<20:41:58, 1.36s/it] 11%|█ | 6913/61904 [3:11:19<20:47:43, 1.36s/it] 11%|█ | 6914/61904 [3:11:20<20:36:58, 1.35s/it] 11%|█ | 6915/61904 [3:11:21<21:03:21, 1.38s/it] 11%|█ | 6916/61904 [3:11:23<21:28:29, 1.41s/it] 11%|█ | 6917/61904 [3:11:24<21:14:37, 1.39s/it] 11%|█ | 6918/61904 [3:11:26<21:25:16, 1.40s/it] 11%|█ | 6919/61904 [3:11:27<22:01:30, 1.44s/it] 11%|█ | 6920/61904 [3:11:29<22:07:20, 1.45s/it] {'loss': 2.8689, 'learning_rate': 1.8910929599377673e-07, 'epoch': 1.79} 11%|█ | 6920/61904 [3:11:29<22:07:20, 1.45s/it] 11%|█ | 6921/61904 [3:11:30<21:30:32, 1.41s/it] 11%|█ | 6922/61904 [3:11:31<20:59:03, 1.37s/it] 11%|█ | 6923/61904 [3:11:33<20:53:49, 1.37s/it] 11%|█ | 6924/61904 [3:11:34<20:26:16, 1.34s/it] 11%|█ | 6925/61904 [3:11:35<20:19:11, 1.33s/it] 11%|█ | 6926/61904 [3:11:37<20:32:40, 1.35s/it] 
11%|█ | 6940/61904 [3:11:55<20:28:31, 1.34s/it] {'loss': 2.9081, 'learning_rate': 1.890768831842344e-07, 'epoch': 1.79}
11%|█ | 6960/61904 [3:12:22<20:00:12, 1.31s/it] {'loss': 2.8818, 'learning_rate': 1.8904447037469208e-07, 'epoch': 1.8}
11%|█▏ | 6980/61904 [3:12:49<20:02:50, 1.31s/it] {'loss': 2.9503, 'learning_rate': 1.8901205756514975e-07, 'epoch': 1.8}
11%|█▏ | 7000/61904 [3:13:16<20:55:11, 1.37s/it] {'loss': 2.985, 'learning_rate': 1.889796447556074e-07, 'epoch': 1.81}
11%|█▏ | 7020/61904 [3:13:43<20:04:26, 1.32s/it] {'loss': 2.9812, 'learning_rate': 1.889472319460651e-07, 'epoch': 1.81}
11%|█▏ | 7040/61904 [3:14:10<22:01:31, 1.45s/it] {'loss': 2.9244, 'learning_rate': 1.8891481913652273e-07, 'epoch': 1.82}
11%|█▏ | 7060/61904 [3:14:37<20:39:54, 1.36s/it] {'loss': 2.9199, 'learning_rate': 1.8888240632698042e-07, 'epoch': 1.82}
11%|█▏ | 7080/61904 [3:15:04<21:40:46, 1.42s/it] {'loss': 2.9121, 'learning_rate': 1.8884999351743808e-07, 'epoch': 1.83}
11%|█▏ | 7100/61904 [3:15:32<20:37:46, 1.36s/it] {'loss': 2.9476, 'learning_rate': 1.8881758070789574e-07, 'epoch': 1.83}
12%|█▏ | 7120/61904 [3:16:00<21:15:05, 1.40s/it] {'loss': 2.9117, 'learning_rate': 1.8878516789835343e-07, 'epoch': 1.84}
12%|█▏ | 7140/61904 [3:16:28<20:32:18, 1.35s/it] {'loss': 2.8593, 'learning_rate': 1.887527550888111e-07, 'epoch': 1.85}
12%|█▏ | 7160/61904 [3:16:55<20:29:16, 1.35s/it] {'loss': 2.9515, 'learning_rate': 1.8872034227926876e-07, 'epoch': 1.85}
12%|█▏ | 7180/61904 [3:17:22<21:33:57, 1.42s/it] {'loss': 2.8816, 'learning_rate': 1.8868792946972644e-07, 'epoch': 1.86}
12%|█▏ | 7200/61904 [3:17:50<21:12:06, 1.40s/it] {'loss': 2.8811, 'learning_rate': 1.8865551666018408e-07, 'epoch': 1.86}
12%|█▏ | 7220/61904 [3:18:17<20:24:56, 1.34s/it] {'loss': 2.8955, 'learning_rate': 1.8862310385064177e-07, 'epoch': 1.87}
12%|█▏ | 7240/61904 [3:18:44<21:18:27, 1.40s/it] {'loss': 2.8895, 'learning_rate': 1.8859069104109943e-07, 'epoch': 1.87}
12%|█▏ | 7260/61904 [3:19:12<21:23:21, 1.41s/it] {'loss': 2.8621, 'learning_rate': 1.885582782315571e-07, 'epoch': 1.88}
12%|█▏ | 7280/61904 [3:19:39<20:58:40, 1.38s/it] {'loss': 2.8738, 'learning_rate': 1.8852586542201478e-07, 'epoch': 1.88}
12%|█▏ | 7300/61904 [3:20:06<20:33:40, 1.36s/it] {'loss': 2.9246, 'learning_rate': 1.8849345261247244e-07, 'epoch': 1.89}
12%|█▏ | 7320/61904 [3:20:34<21:23:30, 1.41s/it] {'loss': 2.898, 'learning_rate': 1.884610398029301e-07, 'epoch': 1.89}
12%|█▏ | 7340/61904 [3:21:00<20:07:00, 1.33s/it] {'loss': 2.9362, 'learning_rate': 1.884286269933878e-07, 'epoch': 1.9}
12%|█▏ | 7360/61904 [3:21:28<20:07:48, 1.33s/it] {'loss': 2.9173, 'learning_rate': 1.8839621418384545e-07, 'epoch': 1.9}
12%|█▏ | 7380/61904 [3:21:55<20:07:55, 1.33s/it] {'loss': 2.9209, 'learning_rate': 1.8836380137430312e-07, 'epoch': 1.91}
12%|█▏ | 7400/61904 [3:22:22<19:52:34, 1.31s/it] {'loss': 2.9294, 'learning_rate': 1.883313885647608e-07, 'epoch': 1.91}
12%|█▏ | 7420/61904 [3:22:49<20:51:25, 1.38s/it] {'loss': 2.9321, 'learning_rate': 1.8829897575521844e-07, 'epoch': 1.92}
12%|█▏ | 7440/61904 [3:23:17<21:18:53, 1.41s/it] {'loss': 2.8453, 'learning_rate': 1.8826656294567613e-07, 'epoch': 1.92}
12%|█▏ | 7460/61904 [3:23:44<20:50:03, 1.38s/it] {'loss': 2.8941, 'learning_rate': 1.882341501361338e-07, 'epoch': 1.93}
12%|█▏ | 7480/61904 [3:24:12<19:46:29, 1.31s/it] {'loss': 2.932, 'learning_rate': 1.8820173732659145e-07, 'epoch': 1.93}
12%|█▏ | 7500/61904 [3:24:39<20:10:40, 1.34s/it] {'loss': 2.9291, 'learning_rate': 1.8816932451704914e-07, 'epoch': 1.94}
12%|█▏ | 7520/61904 [3:25:07<20:59:35, 1.39s/it] {'loss': 2.9375, 'learning_rate': 1.881369117075068e-07, 'epoch': 1.94}
12%|█▏ | 7540/61904 [3:25:33<19:59:23, 1.32s/it] {'loss': 2.9291, 'learning_rate': 1.8810449889796446e-07, 'epoch': 1.95}
12%|█▏ | 7560/61904 [3:26:00<20:22:32, 1.35s/it] {'loss': 2.8834, 'learning_rate': 1.8807208608842215e-07, 'epoch': 1.95}
12%|█▏ | 7580/61904 [3:26:27<20:37:02, 1.37s/it] {'loss': 2.9606, 'learning_rate': 1.880396732788798e-07, 'epoch': 1.96}
12%|█▏ | 7600/61904 [3:26:55<20:35:55, 1.37s/it] {'loss': 2.8769, 'learning_rate': 1.8800726046933748e-07, 'epoch': 1.96}
12%|█▏ | 7615/61904 [3:27:16<20:47:28, 1.38s/it] 12%|█▏ | 
7616/61904 [3:27:17<20:26:25, 1.36s/it] 12%|█▏ | 7617/61904 [3:27:18<20:10:07, 1.34s/it] 12%|█▏ | 7618/61904 [3:27:20<20:10:20, 1.34s/it] 12%|█▏ | 7619/61904 [3:27:21<19:48:50, 1.31s/it] 12%|█▏ | 7620/61904 [3:27:22<19:49:44, 1.32s/it] {'loss': 2.9195, 'learning_rate': 1.8797484765979514e-07, 'epoch': 1.97} 12%|█▏ | 7620/61904 [3:27:22<19:49:44, 1.32s/it] 12%|█▏ | 7621/61904 [3:27:24<20:17:28, 1.35s/it] 12%|█▏ | 7622/61904 [3:27:25<20:09:06, 1.34s/it] 12%|█▏ | 7623/61904 [3:27:26<20:06:17, 1.33s/it] 12%|█▏ | 7624/61904 [3:27:28<20:36:54, 1.37s/it] 12%|█▏ | 7625/61904 [3:27:29<19:58:49, 1.33s/it] 12%|█▏ | 7626/61904 [3:27:30<20:07:54, 1.34s/it] 12%|█▏ | 7627/61904 [3:27:32<20:26:50, 1.36s/it] 12%|█▏ | 7628/61904 [3:27:33<20:52:51, 1.38s/it] 12%|█▏ | 7629/61904 [3:27:34<20:15:07, 1.34s/it] 12%|█▏ | 7630/61904 [3:27:36<20:36:09, 1.37s/it] 12%|█▏ | 7631/61904 [3:27:37<20:40:39, 1.37s/it] 12%|█▏ | 7632/61904 [3:27:39<20:42:30, 1.37s/it] 12%|█▏ | 7633/61904 [3:27:40<20:13:34, 1.34s/it] 12%|█▏ | 7634/61904 [3:27:41<20:12:44, 1.34s/it] 12%|█▏ | 7635/61904 [3:27:43<20:08:30, 1.34s/it] 12%|█▏ | 7636/61904 [3:27:44<20:18:56, 1.35s/it] 12%|█▏ | 7637/61904 [3:27:45<19:51:57, 1.32s/it] 12%|█▏ | 7638/61904 [3:27:47<20:07:10, 1.33s/it] 12%|█▏ | 7639/61904 [3:27:48<20:27:44, 1.36s/it] 12%|█▏ | 7640/61904 [3:27:49<20:27:29, 1.36s/it] {'loss': 2.9, 'learning_rate': 1.879424348502528e-07, 'epoch': 1.97} 12%|█▏ | 7640/61904 [3:27:49<20:27:29, 1.36s/it] 12%|█▏ | 7641/61904 [3:27:51<20:53:48, 1.39s/it] 12%|█▏ | 7642/61904 [3:27:52<19:59:35, 1.33s/it] 12%|█▏ | 7643/61904 [3:27:53<19:37:17, 1.30s/it] 12%|█▏ | 7644/61904 [3:27:55<19:45:11, 1.31s/it] 12%|█▏ | 7645/61904 [3:27:56<19:37:48, 1.30s/it] 12%|█▏ | 7646/61904 [3:27:57<20:02:37, 1.33s/it] 12%|█▏ | 7647/61904 [3:27:59<21:31:42, 1.43s/it] 12%|█▏ | 7648/61904 [3:28:00<21:47:33, 1.45s/it] 12%|█▏ | 7649/61904 [3:28:02<21:29:47, 1.43s/it] 12%|█▏ | 7650/61904 [3:28:03<21:00:52, 1.39s/it] 12%|█▏ | 7651/61904 [3:28:05<21:17:56, 1.41s/it] 
12%|█▏ | 7652/61904 [3:28:06<21:05:58, 1.40s/it] 12%|█▏ | 7653/61904 [3:28:07<21:27:58, 1.42s/it] 12%|█▏ | 7654/61904 [3:28:09<21:39:39, 1.44s/it] 12%|█▏ | 7655/61904 [3:28:10<21:10:25, 1.41s/it] 12%|█▏ | 7656/61904 [3:28:12<20:54:49, 1.39s/it] 12%|█▏ | 7657/61904 [3:28:13<21:24:58, 1.42s/it] 12%|█▏ | 7658/61904 [3:28:15<21:28:46, 1.43s/it] 12%|█▏ | 7659/61904 [3:28:16<21:26:15, 1.42s/it] 12%|█▏ | 7660/61904 [3:28:17<20:47:49, 1.38s/it] {'loss': 2.9407, 'learning_rate': 1.879100220407105e-07, 'epoch': 1.98} 12%|█▏ | 7660/61904 [3:28:17<20:47:49, 1.38s/it] 12%|█▏ | 7661/61904 [3:28:19<20:39:50, 1.37s/it] 12%|█▏ | 7662/61904 [3:28:20<20:16:14, 1.35s/it] 12%|█▏ | 7663/61904 [3:28:21<20:51:02, 1.38s/it] 12%|█▏ | 7664/61904 [3:28:23<21:07:21, 1.40s/it] 12%|█▏ | 7665/61904 [3:28:24<22:01:21, 1.46s/it] 12%|█▏ | 7666/61904 [3:28:26<21:52:43, 1.45s/it] 12%|█▏ | 7667/61904 [3:28:27<21:05:29, 1.40s/it] 12%|█▏ | 7668/61904 [3:28:29<21:15:09, 1.41s/it] 12%|█▏ | 7669/61904 [3:28:30<20:59:34, 1.39s/it] 12%|█▏ | 7670/61904 [3:28:31<21:14:25, 1.41s/it] 12%|█▏ | 7671/61904 [3:28:33<21:12:22, 1.41s/it] 12%|█▏ | 7672/61904 [3:28:34<20:57:07, 1.39s/it] 12%|█▏ | 7673/61904 [3:28:36<21:14:46, 1.41s/it] 12%|█▏ | 7674/61904 [3:28:37<22:01:25, 1.46s/it] 12%|█▏ | 7675/61904 [3:28:38<21:18:09, 1.41s/it] 12%|█▏ | 7676/61904 [3:28:40<20:47:24, 1.38s/it] 12%|█▏ | 7677/61904 [3:28:41<20:51:24, 1.38s/it] 12%|█▏ | 7678/61904 [3:28:42<20:50:21, 1.38s/it] 12%|█▏ | 7679/61904 [3:28:44<21:10:18, 1.41s/it] 12%|█▏ | 7680/61904 [3:28:45<21:02:21, 1.40s/it] {'loss': 2.9094, 'learning_rate': 1.8787760923116815e-07, 'epoch': 1.98} 12%|█▏ | 7680/61904 [3:28:45<21:02:21, 1.40s/it] 12%|█▏ | 7681/61904 [3:28:47<21:44:07, 1.44s/it] 12%|█▏ | 7682/61904 [3:28:48<21:10:33, 1.41s/it] 12%|█▏ | 7683/61904 [3:28:49<20:21:45, 1.35s/it] 12%|█▏ | 7684/61904 [3:28:51<20:59:24, 1.39s/it] 12%|█▏ | 7685/61904 [3:28:52<20:41:15, 1.37s/it] 12%|█▏ | 7686/61904 [3:28:54<20:41:25, 1.37s/it] 12%|█▏ | 7687/61904 [3:28:55<20:16:23, 
1.35s/it] 12%|█▏ | 7688/61904 [3:28:56<20:34:37, 1.37s/it] 12%|█▏ | 7689/61904 [3:28:58<20:31:12, 1.36s/it] 12%|█▏ | 7690/61904 [3:28:59<20:09:15, 1.34s/it] 12%|█▏ | 7691/61904 [3:29:00<20:22:23, 1.35s/it] 12%|█▏ | 7692/61904 [3:29:02<20:19:23, 1.35s/it] 12%|█▏ | 7693/61904 [3:29:03<20:03:09, 1.33s/it] 12%|█▏ | 7694/61904 [3:29:04<20:10:41, 1.34s/it] 12%|█▏ | 7695/61904 [3:29:05<19:27:05, 1.29s/it] 12%|█▏ | 7696/61904 [3:29:07<19:32:12, 1.30s/it] 12%|█▏ | 7697/61904 [3:29:08<20:23:11, 1.35s/it] 12%|█▏ | 7698/61904 [3:29:10<20:19:45, 1.35s/it] 12%|█▏ | 7699/61904 [3:29:11<20:00:00, 1.33s/it] 12%|█▏ | 7700/61904 [3:29:12<19:35:24, 1.30s/it] {'loss': 2.9574, 'learning_rate': 1.878451964216258e-07, 'epoch': 1.99} 12%|█▏ | 7700/61904 [3:29:12<19:35:24, 1.30s/it] 12%|█▏ | 7701/61904 [3:29:14<20:45:36, 1.38s/it] 12%|█▏ | 7702/61904 [3:29:15<20:35:44, 1.37s/it] 12%|█▏ | 7703/61904 [3:29:16<20:22:40, 1.35s/it] 12%|█▏ | 7704/61904 [3:29:18<20:30:48, 1.36s/it] 12%|█▏ | 7705/61904 [3:29:19<20:19:40, 1.35s/it] 12%|█▏ | 7706/61904 [3:29:20<20:37:10, 1.37s/it] 12%|█▏ | 7707/61904 [3:29:22<20:20:22, 1.35s/it] 12%|█▏ | 7708/61904 [3:29:23<20:05:48, 1.33s/it] 12%|█▏ | 7709/61904 [3:29:24<20:16:09, 1.35s/it] 12%|█▏ | 7710/61904 [3:29:26<20:04:16, 1.33s/it] 12%|█▏ | 7711/61904 [3:29:27<19:54:13, 1.32s/it] 12%|█▏ | 7712/61904 [3:29:28<19:31:22, 1.30s/it] 12%|█▏ | 7713/61904 [3:29:30<19:54:32, 1.32s/it] 12%|█▏ | 7714/61904 [3:29:31<20:18:07, 1.35s/it] 12%|█▏ | 7715/61904 [3:29:32<19:53:36, 1.32s/it] 12%|█▏ | 7716/61904 [3:29:34<19:48:06, 1.32s/it] 12%|█▏ | 7717/61904 [3:29:35<19:39:34, 1.31s/it] 12%|█▏ | 7718/61904 [3:29:36<19:59:20, 1.33s/it] 12%|█▏ | 7719/61904 [3:29:38<20:07:53, 1.34s/it] 12%|█▏ | 7720/61904 [3:29:39<20:50:27, 1.38s/it] {'loss': 2.8978, 'learning_rate': 1.878127836120835e-07, 'epoch': 2.0} 12%|█▏ | 7720/61904 [3:29:39<20:50:27, 1.38s/it] 12%|█▏ | 7721/61904 [3:29:41<21:25:25, 1.42s/it] 12%|█▏ | 7722/61904 [3:29:42<20:49:10, 1.38s/it] 12%|█▏ | 7723/61904 
[3:29:43<21:22:57, 1.42s/it] 12%|█▏ | 7724/61904 [3:29:45<21:26:54, 1.43s/it] 12%|█▏ | 7725/61904 [3:29:46<21:32:59, 1.43s/it] 12%|█▏ | 7726/61904 [3:29:48<20:48:12, 1.38s/it] 12%|█▏ | 7727/61904 [3:29:49<20:32:38, 1.37s/it] 12%|█▏ | 7728/61904 [3:29:50<20:03:03, 1.33s/it] 12%|█▏ | 7729/61904 [3:29:52<20:27:46, 1.36s/it] 12%|█▏ | 7730/61904 [3:29:53<21:33:43, 1.43s/it] 12%|█▏ | 7731/61904 [3:29:55<22:02:19, 1.46s/it] 12%|█▏ | 7732/61904 [3:29:56<21:35:08, 1.43s/it] 12%|█▏ | 7733/61904 [3:29:57<20:54:28, 1.39s/it] 12%|█▏ | 7734/61904 [3:29:59<21:43:16, 1.44s/it] 12%|█▏ | 7735/61904 [3:30:00<21:09:00, 1.41s/it] 12%|█▏ | 7736/61904 [3:30:02<21:09:44, 1.41s/it] 12%|█▏ | 7737/61904 [3:30:03<21:03:14, 1.40s/it] 12%|█▎ | 7738/61904 [3:30:05<21:06:24, 1.40s/it] 13%|█▎ | 7739/61904 [3:30:06<21:12:27, 1.41s/it]
Generation Kwargs: {'max_length': 384, 'max_gen_length': 380, 'num_beams': 5}
0%| | 0/861 [00:00>
Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41. Non-default generation parameters: {'max_length': 200, 'early_stopping': True, 'num_beams': 5, 'forced_eos_token_id': 2}
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
self.pid = os.fork() 13%|█▎ | 7740/61904 [4:03:13<8981:53:45, 596.98s/it] {'loss': 2.8846, 'learning_rate': 1.8778037080254114e-07, 'epoch': 2.0} 13%|█▎ | 7740/61904 [4:03:13<8981:53:45, 596.98s/it] 13%|█▎ | 7741/61904 [4:03:14<6294:35:44, 418.38s/it] 13%|█▎ | 7742/61904 [4:03:16<4412:32:50, 293.29s/it] 13%|█▎ | 7743/61904 [4:03:17<3095:55:39, 205.78s/it] 13%|█▎ | 7744/61904 [4:03:19<2173:25:21, 144.47s/it] 13%|█▎ | 7745/61904 [4:03:20<1527:26:31, 101.53s/it] 13%|█▎ | 7746/61904 [4:03:21<1075:47:08, 71.51s/it] 13%|█▎ | 7747/61904 [4:03:23<759:00:34, 50.45s/it] 13%|█▎ | 7748/61904 [4:03:24<537:19:22, 35.72s/it] 13%|█▎ | 7749/61904 [4:03:26<383:04:52, 25.47s/it] 13%|█▎ | 7750/61904 [4:03:27<274:40:16, 18.26s/it] 13%|█▎ | 7751/61904 [4:03:28<198:03:11, 13.17s/it] 13%|█▎ | 7752/61904 [4:03:30<144:57:18, 9.64s/it] 13%|█▎ | 7753/61904 [4:03:31<107:21:09, 7.14s/it] 13%|█▎ | 7754/61904 [4:03:33<81:43:30, 5.43s/it] 13%|█▎ | 7755/61904 [4:03:34<63:53:19, 4.25s/it] 13%|█▎ | 7756/61904 [4:03:35<50:55:00, 3.39s/it] 13%|█▎ | 7757/61904 [4:03:37<41:40:24, 2.77s/it] 13%|█▎ | 7758/61904 [4:03:38<35:20:56, 2.35s/it] 13%|█▎ | 7759/61904 [4:03:40<31:42:49, 2.11s/it] 13%|█▎ | 7760/61904 [4:03:41<28:36:23, 1.90s/it] {'loss': 2.954, 'learning_rate': 1.8774795799299882e-07, 'epoch': 2.01} 13%|█▎ | 7760/61904 [4:03:41<28:36:23, 1.90s/it] 13%|█▎ | 7761/61904 [4:03:42<25:49:01, 1.72s/it] 13%|█▎ | 7762/61904 [4:03:44<24:44:08, 1.64s/it] 13%|█▎ | 7763/61904 [4:03:45<23:22:02, 1.55s/it] 13%|█▎ | 7764/61904 [4:03:47<22:52:41, 1.52s/it] 13%|█▎ | 7765/61904 [4:03:48<21:59:31, 1.46s/it] 13%|█▎ | 7766/61904 [4:03:49<21:38:15, 1.44s/it] 13%|█▎ | 7767/61904 [4:03:51<21:01:04, 1.40s/it] 13%|█▎ | 7768/61904 [4:03:52<21:06:03, 1.40s/it] 13%|█▎ | 7769/61904 [4:03:53<20:51:05, 1.39s/it] 13%|█▎ | 7770/61904 [4:03:55<20:26:06, 1.36s/it] 13%|█▎ | 7771/61904 [4:03:56<21:34:45, 1.44s/it] 13%|█▎ | 7772/61904 [4:03:58<21:46:10, 1.45s/it] 13%|█▎ | 7773/61904 [4:03:59<21:50:34, 1.45s/it] 13%|█▎ | 7774/61904 
[4:04:01<21:52:49, 1.46s/it] 13%|█▎ | 7775/61904 [4:04:02<21:18:00, 1.42s/it] 13%|█▎ | 7776/61904 [4:04:03<21:13:53, 1.41s/it] 13%|█▎ | 7777/61904 [4:04:05<20:48:41, 1.38s/it] 13%|█▎ | 7778/61904 [4:04:06<20:11:15, 1.34s/it] 13%|█▎ | 7779/61904 [4:04:08<22:02:09, 1.47s/it] 13%|█▎ | 7780/61904 [4:04:09<22:08:46, 1.47s/it] {'loss': 2.9074, 'learning_rate': 1.8771554518345649e-07, 'epoch': 2.01} 13%|█▎ | 7780/61904 [4:04:09<22:08:46, 1.47s/it] 13%|█▎ | 7781/61904 [4:04:11<22:02:07, 1.47s/it] 13%|█▎ | 7782/61904 [4:04:12<21:16:38, 1.42s/it] 13%|█▎ | 7783/61904 [4:04:13<20:51:55, 1.39s/it] 13%|█▎ | 7784/61904 [4:04:15<21:14:21, 1.41s/it] 13%|█▎ | 7785/61904 [4:04:16<20:54:57, 1.39s/it] 13%|█▎ | 7786/61904 [4:04:18<21:06:56, 1.40s/it] 13%|█▎ | 7787/61904 [4:04:19<21:30:57, 1.43s/it] 13%|█▎ | 7788/61904 [4:04:20<21:36:34, 1.44s/it] 13%|█▎ | 7789/61904 [4:04:22<21:09:52, 1.41s/it] 13%|█▎ | 7790/61904 [4:04:23<20:48:51, 1.38s/it] 13%|█▎ | 7791/61904 [4:04:25<20:38:53, 1.37s/it] 13%|█▎ | 7792/61904 [4:04:26<21:03:17, 1.40s/it] 13%|█▎ | 7793/61904 [4:04:27<20:41:08, 1.38s/it] 13%|█▎ | 7794/61904 [4:04:29<20:28:24, 1.36s/it] 13%|█▎ | 7795/61904 [4:04:30<20:29:12, 1.36s/it] 13%|█▎ | 7796/61904 [4:04:31<21:04:25, 1.40s/it] 13%|█▎ | 7797/61904 [4:04:33<21:29:19, 1.43s/it] 13%|█▎ | 7798/61904 [4:04:34<21:07:13, 1.41s/it] 13%|█▎ | 7799/61904 [4:04:36<20:50:26, 1.39s/it] 13%|█▎ | 7800/61904 [4:04:37<20:31:32, 1.37s/it] {'loss': 2.8973, 'learning_rate': 1.8768313237391415e-07, 'epoch': 2.02} 13%|█▎ | 7800/61904 [4:04:37<20:31:32, 1.37s/it] 13%|█▎ | 7801/61904 [4:04:38<21:02:55, 1.40s/it] 13%|█▎ | 7802/61904 [4:04:40<21:09:16, 1.41s/it] 13%|█▎ | 7803/61904 [4:04:41<21:11:41, 1.41s/it] 13%|█▎ | 7804/61904 [4:04:43<21:54:44, 1.46s/it] 13%|█▎ | 7805/61904 [4:04:45<22:55:43, 1.53s/it] 13%|█▎ | 7806/61904 [4:04:46<22:01:23, 1.47s/it] 13%|█▎ | 7807/61904 [4:04:47<21:55:16, 1.46s/it] 13%|█▎ | 7808/61904 [4:04:49<21:02:42, 1.40s/it] 13%|█▎ | 7809/61904 [4:04:50<20:10:06, 1.34s/it] 13%|█▎ | 
7810/61904 [4:04:51<20:58:23, 1.40s/it] 13%|█▎ | 7811/61904 [4:04:53<21:30:50, 1.43s/it] 13%|█▎ | 7812/61904 [4:04:54<21:09:00, 1.41s/it] 13%|█▎ | 7813/61904 [4:04:56<21:26:12, 1.43s/it] 13%|█▎ | 7814/61904 [4:04:57<21:19:53, 1.42s/it] 13%|█▎ | 7815/61904 [4:04:59<21:35:14, 1.44s/it] 13%|█▎ | 7816/61904 [4:05:00<21:12:04, 1.41s/it] 13%|█▎ | 7817/61904 [4:05:01<20:58:11, 1.40s/it] 13%|█▎ | 7818/61904 [4:05:03<20:52:47, 1.39s/it] 13%|█▎ | 7819/61904 [4:05:04<20:24:17, 1.36s/it] 13%|█▎ | 7820/61904 [4:05:05<20:50:50, 1.39s/it] {'loss': 2.9155, 'learning_rate': 1.8765071956437184e-07, 'epoch': 2.02} 13%|█▎ | 7820/61904 [4:05:05<20:50:50, 1.39s/it] 13%|█▎ | 7821/61904 [4:05:07<20:23:42, 1.36s/it] 13%|█▎ | 7822/61904 [4:05:08<20:21:56, 1.36s/it] 13%|█▎ | 7823/61904 [4:05:09<20:27:54, 1.36s/it] 13%|█▎ | 7824/61904 [4:05:11<20:45:23, 1.38s/it] 13%|█▎ | 7825/61904 [4:05:12<21:18:26, 1.42s/it] 13%|█▎ | 7826/61904 [4:05:14<21:53:08, 1.46s/it] 13%|█▎ | 7827/61904 [4:05:15<21:16:27, 1.42s/it] 13%|█▎ | 7828/61904 [4:05:17<20:55:04, 1.39s/it] 13%|█▎ | 7829/61904 [4:05:18<21:46:11, 1.45s/it] 13%|█▎ | 7830/61904 [4:05:19<21:03:11, 1.40s/it] 13%|█▎ | 7831/61904 [4:05:21<20:34:28, 1.37s/it] 13%|█▎ | 7832/61904 [4:05:22<21:12:22, 1.41s/it] 13%|█▎ | 7833/61904 [4:05:24<21:12:29, 1.41s/it] 13%|█▎ | 7834/61904 [4:05:25<21:32:42, 1.43s/it] 13%|█▎ | 7835/61904 [4:05:27<21:32:46, 1.43s/it] 13%|█▎ | 7836/61904 [4:05:28<20:53:43, 1.39s/it] 13%|█▎ | 7837/61904 [4:05:29<20:44:55, 1.38s/it] 13%|█▎ | 7838/61904 [4:05:31<21:45:47, 1.45s/it] 13%|█▎ | 7839/61904 [4:05:32<21:38:20, 1.44s/it] 13%|█▎ | 7840/61904 [4:05:34<21:27:08, 1.43s/it] {'loss': 2.9162, 'learning_rate': 1.876183067548295e-07, 'epoch': 2.03} 13%|█▎ | 7840/61904 [4:05:34<21:27:08, 1.43s/it] 13%|█▎ | 7841/61904 [4:05:35<21:27:13, 1.43s/it] 13%|█▎ | 7842/61904 [4:05:37<22:10:25, 1.48s/it] 13%|█▎ | 7843/61904 [4:05:38<21:17:32, 1.42s/it] 13%|█▎ | 7844/61904 [4:05:39<21:24:21, 1.43s/it] 13%|█▎ | 7845/61904 [4:05:41<21:10:59, 1.41s/it] 
13%|█▎ | 7846/61904 [4:05:42<20:53:02, 1.39s/it] 13%|█▎ | 7847/61904 [4:05:43<20:22:38, 1.36s/it] 13%|█▎ | 7848/61904 [4:05:45<20:12:45, 1.35s/it] 13%|█▎ | 7849/61904 [4:05:46<20:48:03, 1.39s/it] 13%|█▎ | 7850/61904 [4:05:48<21:06:05, 1.41s/it] 13%|█▎ | 7851/61904 [4:05:49<20:36:00, 1.37s/it] 13%|█▎ | 7852/61904 [4:05:50<20:16:33, 1.35s/it] 13%|█▎ | 7853/61904 [4:05:52<20:52:55, 1.39s/it] 13%|█▎ | 7854/61904 [4:05:53<20:40:52, 1.38s/it] 13%|█▎ | 7855/61904 [4:05:54<20:03:05, 1.34s/it] 13%|█▎ | 7856/61904 [4:05:56<20:12:48, 1.35s/it] 13%|█▎ | 7857/61904 [4:05:57<20:06:10, 1.34s/it] 13%|█▎ | 7858/61904 [4:05:58<20:43:42, 1.38s/it] 13%|█▎ | 7859/61904 [4:06:00<21:16:57, 1.42s/it] 13%|█▎ | 7860/61904 [4:06:01<21:18:18, 1.42s/it] {'loss': 2.8391, 'learning_rate': 1.8758589394528716e-07, 'epoch': 2.03} 13%|█▎ | 7860/61904 [4:06:01<21:18:18, 1.42s/it] 13%|█▎ | 7861/61904 [4:06:03<20:45:59, 1.38s/it] 13%|█▎ | 7862/61904 [4:06:04<20:14:43, 1.35s/it] 13%|█▎ | 7863/61904 [4:06:05<20:28:34, 1.36s/it] 13%|█▎ | 7864/61904 [4:06:07<21:10:46, 1.41s/it] 13%|█▎ | 7865/61904 [4:06:08<21:15:49, 1.42s/it] 13%|█▎ | 7866/61904 [4:06:10<21:27:55, 1.43s/it] 13%|█▎ | 7867/61904 [4:06:11<21:17:55, 1.42s/it] 13%|█▎ | 7868/61904 [4:06:12<20:37:48, 1.37s/it] 13%|█▎ | 7869/61904 [4:06:14<20:39:05, 1.38s/it] 13%|█▎ | 7870/61904 [4:06:15<20:43:28, 1.38s/it] 13%|█▎ | 7871/61904 [4:06:17<20:37:22, 1.37s/it] 13%|█▎ | 7872/61904 [4:06:18<20:08:39, 1.34s/it] 13%|█▎ | 7873/61904 [4:06:19<20:17:15, 1.35s/it] 13%|█▎ | 7874/61904 [4:06:20<19:53:25, 1.33s/it] 13%|█▎ | 7875/61904 [4:06:22<20:21:07, 1.36s/it] 13%|█▎ | 7876/61904 [4:06:23<21:37:25, 1.44s/it] 13%|█▎ | 7877/61904 [4:06:25<20:54:56, 1.39s/it] 13%|█▎ | 7878/61904 [4:06:26<20:50:02, 1.39s/it] 13%|█▎ | 7879/61904 [4:06:27<20:24:16, 1.36s/it] 13%|█▎ | 7880/61904 [4:06:29<19:44:04, 1.32s/it] {'loss': 2.8963, 'learning_rate': 1.8755348113574485e-07, 'epoch': 2.04} 13%|█▎ | 7880/61904 [4:06:29<19:44:04, 1.32s/it] 13%|█▎ | 7881/61904 [4:06:30<19:28:35, 
1.30s/it] 13%|█▎ | 7882/61904 [4:06:31<20:07:01, 1.34s/it] 13%|█▎ | 7883/61904 [4:06:33<20:32:59, 1.37s/it] 13%|█▎ | 7884/61904 [4:06:34<20:11:44, 1.35s/it] 13%|█▎ | 7885/61904 [4:06:35<20:15:18, 1.35s/it] 13%|█▎ | 7886/61904 [4:06:37<20:38:29, 1.38s/it] 13%|█▎ | 7887/61904 [4:06:38<20:33:41, 1.37s/it] 13%|█▎ | 7888/61904 [4:06:40<20:24:55, 1.36s/it] 13%|█▎ | 7889/61904 [4:06:41<20:33:59, 1.37s/it] 13%|█▎ | 7890/61904 [4:06:42<20:21:07, 1.36s/it] 13%|█▎ | 7891/61904 [4:06:44<20:56:04, 1.40s/it] 13%|█▎ | 7892/61904 [4:06:45<20:59:05, 1.40s/it] 13%|█▎ | 7893/61904 [4:06:47<20:56:36, 1.40s/it] 13%|█▎ | 7894/61904 [4:06:48<20:48:09, 1.39s/it] 13%|█▎ | 7895/61904 [4:06:49<20:50:03, 1.39s/it] 13%|█▎ | 7896/61904 [4:06:51<21:00:35, 1.40s/it] 13%|█▎ | 7897/61904 [4:06:52<20:33:56, 1.37s/it] 13%|█▎ | 7898/61904 [4:06:53<20:16:11, 1.35s/it] 13%|█▎ | 7899/61904 [4:06:55<19:51:49, 1.32s/it] 13%|█▎ | 7900/61904 [4:06:56<20:19:35, 1.35s/it] {'loss': 2.9148, 'learning_rate': 1.8752106832620248e-07, 'epoch': 2.04} 13%|█▎ | 7900/61904 [4:06:56<20:19:35, 1.35s/it] 13%|█▎ | 7901/61904 [4:06:57<20:05:23, 1.34s/it] 13%|█▎ | 7902/61904 [4:06:59<20:10:33, 1.35s/it] 13%|█▎ | 7903/61904 [4:07:00<21:00:01, 1.40s/it] 13%|█▎ | 7904/61904 [4:07:02<21:15:42, 1.42s/it] 13%|█▎ | 7905/61904 [4:07:03<20:57:20, 1.40s/it] 13%|█▎ | 7906/61904 [4:07:04<20:12:59, 1.35s/it] 13%|█▎ | 7907/61904 [4:07:06<19:57:22, 1.33s/it] 13%|█▎ | 7908/61904 [4:07:07<20:11:38, 1.35s/it] 13%|█▎ | 7909/61904 [4:07:08<20:05:54, 1.34s/it] 13%|█▎ | 7910/61904 [4:07:10<20:08:54, 1.34s/it] 13%|█▎ | 7911/61904 [4:07:11<20:12:11, 1.35s/it] 13%|█▎ | 7912/61904 [4:07:12<20:12:21, 1.35s/it] 13%|█▎ | 7913/61904 [4:07:14<21:18:02, 1.42s/it] 13%|█▎ | 7914/61904 [4:07:15<20:36:30, 1.37s/it] 13%|█▎ | 7915/61904 [4:07:17<20:40:32, 1.38s/it] 13%|█▎ | 7916/61904 [4:07:18<20:11:44, 1.35s/it] 13%|█▎ | 7917/61904 [4:07:19<20:12:51, 1.35s/it] 13%|█▎ | 7918/61904 [4:07:21<20:50:39, 1.39s/it] 13%|█▎ | 7919/61904 [4:07:22<20:52:51, 1.39s/it] 
13%|█▎ | 7920/61904 [4:07:23<20:04:20, 1.34s/it] {'loss': 2.9378, 'learning_rate': 1.8748865551666017e-07, 'epoch': 2.05} 13%|█▎ | 7920/61904 [4:07:23<20:04:20, 1.34s/it] 13%|█▎ | 7921/61904 [4:07:25<19:30:43, 1.30s/it] 13%|█▎ | 7922/61904 [4:07:26<19:45:36, 1.32s/it] 13%|█▎ | 7923/61904 [4:07:27<19:35:56, 1.31s/it] 13%|█▎ | 7924/61904 [4:07:29<20:02:11, 1.34s/it] 13%|█▎ | 7925/61904 [4:07:30<19:47:17, 1.32s/it] 13%|█▎ | 7926/61904 [4:07:31<19:55:00, 1.33s/it] 13%|█▎ | 7927/61904 [4:07:32<19:39:03, 1.31s/it] 13%|█▎ | 7928/61904 [4:07:34<19:33:13, 1.30s/it] 13%|█▎ | 7929/61904 [4:07:35<19:17:52, 1.29s/it] 13%|█▎ | 7930/61904 [4:07:36<19:33:46, 1.30s/it] 13%|█▎ | 7931/61904 [4:07:38<19:44:18, 1.32s/it] 13%|█▎ | 7932/61904 [4:07:39<20:03:45, 1.34s/it] 13%|█▎ | 7933/61904 [4:07:41<20:35:32, 1.37s/it] 13%|█▎ | 7934/61904 [4:07:42<20:19:33, 1.36s/it] 13%|█▎ | 7935/61904 [4:07:43<20:17:47, 1.35s/it] 13%|█▎ | 7936/61904 [4:07:45<20:38:33, 1.38s/it] 13%|█▎ | 7937/61904 [4:07:46<20:06:26, 1.34s/it] 13%|█▎ | 7938/61904 [4:07:47<20:30:48, 1.37s/it] 13%|█▎ | 7939/61904 [4:07:49<20:22:07, 1.36s/it] 13%|█▎ | 7940/61904 [4:07:50<20:32:37, 1.37s/it] {'loss': 2.8385, 'learning_rate': 1.8745624270711783e-07, 'epoch': 2.05} 13%|█▎ | 7940/61904 [4:07:50<20:32:37, 1.37s/it] 13%|█▎ | 7941/61904 [4:07:51<20:26:46, 1.36s/it] 13%|█▎ | 7942/61904 [4:07:53<20:34:08, 1.37s/it] 13%|█▎ | 7943/61904 [4:07:54<20:58:42, 1.40s/it] 13%|█▎ | 7944/61904 [4:07:56<20:30:19, 1.37s/it] 13%|█▎ | 7945/61904 [4:07:57<20:01:38, 1.34s/it] 13%|█▎ | 7946/61904 [4:07:58<19:53:06, 1.33s/it] 13%|█▎ | 7947/61904 [4:08:00<20:39:52, 1.38s/it] 13%|█▎ | 7948/61904 [4:08:01<20:32:44, 1.37s/it] 13%|█▎ | 7949/61904 [4:08:02<20:09:56, 1.35s/it] 13%|█▎ | 7950/61904 [4:08:04<20:18:05, 1.35s/it] 13%|█▎ | 7951/61904 [4:08:05<20:21:59, 1.36s/it] 13%|█▎ | 7952/61904 [4:08:06<20:21:25, 1.36s/it] 13%|█▎ | 7953/61904 [4:08:08<20:17:26, 1.35s/it] 13%|█▎ | 7954/61904 [4:08:09<20:13:01, 1.35s/it] 13%|█▎ | 7955/61904 [4:08:10<20:17:14, 
1.35s/it] 13%|█▎ | 7956/61904 [4:08:12<20:03:58, 1.34s/it] 13%|█▎ | 7957/61904 [4:08:13<20:22:25, 1.36s/it] 13%|█▎ | 7958/61904 [4:08:14<20:11:33, 1.35s/it] 13%|█▎ | 7959/61904 [4:08:16<20:27:08, 1.36s/it] 13%|█▎ | 7960/61904 [4:08:17<20:22:01, 1.36s/it] {'loss': 2.9649, 'learning_rate': 1.874238298975755e-07, 'epoch': 2.06} 13%|█▎ | 7960/61904 [4:08:17<20:22:01, 1.36s/it] 13%|█▎ | 7961/61904 [4:08:19<20:15:19, 1.35s/it] 13%|█▎ | 7962/61904 [4:08:20<20:42:26, 1.38s/it] 13%|█▎ | 7963/61904 [4:08:21<20:15:38, 1.35s/it] 13%|█▎ | 7964/61904 [4:08:23<20:04:09, 1.34s/it] 13%|█▎ | 7965/61904 [4:08:24<19:59:20, 1.33s/it] 13%|█▎ | 7966/61904 [4:08:25<21:10:04, 1.41s/it] 13%|█▎ | 7967/61904 [4:08:27<21:17:34, 1.42s/it] 13%|█▎ | 7968/61904 [4:08:28<20:41:52, 1.38s/it] 13%|█▎ | 7969/61904 [4:08:30<20:30:33, 1.37s/it] 13%|█▎ | 7970/61904 [4:08:31<20:21:11, 1.36s/it] 13%|█▎ | 7971/61904 [4:08:32<20:30:30, 1.37s/it] 13%|█▎ | 7972/61904 [4:08:34<20:43:49, 1.38s/it] 13%|█▎ | 7973/61904 [4:08:35<20:26:05, 1.36s/it] 13%|█▎ | 7974/61904 [4:08:36<19:52:14, 1.33s/it] 13%|█▎ | 7975/61904 [4:08:38<21:26:45, 1.43s/it] 13%|█▎ | 7976/61904 [4:08:39<21:03:13, 1.41s/it] 13%|█▎ | 7977/61904 [4:08:41<20:31:19, 1.37s/it] 13%|█▎ | 7978/61904 [4:08:42<20:33:34, 1.37s/it] 13%|█▎ | 7979/61904 [4:08:43<20:05:01, 1.34s/it] 13%|█▎ | 7980/61904 [4:08:44<19:36:53, 1.31s/it] {'loss': 2.8578, 'learning_rate': 1.8739141708803318e-07, 'epoch': 2.06} 13%|█▎ | 7980/61904 [4:08:44<19:36:53, 1.31s/it] 13%|█▎ | 7981/61904 [4:08:46<19:52:02, 1.33s/it] 13%|█▎ | 7982/61904 [4:08:47<20:07:19, 1.34s/it] 13%|█▎ | 7983/61904 [4:08:49<21:30:44, 1.44s/it] 13%|█▎ | 7984/61904 [4:08:50<21:28:38, 1.43s/it] 13%|█▎ | 7985/61904 [4:08:52<20:51:38, 1.39s/it] 13%|█▎ | 7986/61904 [4:08:53<21:06:39, 1.41s/it] 13%|█▎ | 7987/61904 [4:08:54<21:17:04, 1.42s/it] 13%|█▎ | 7988/61904 [4:08:56<20:34:08, 1.37s/it] 13%|█▎ | 7989/61904 [4:08:57<19:59:32, 1.33s/it] 13%|█▎ | 7990/61904 [4:08:58<20:09:43, 1.35s/it] 13%|█▎ | 7991/61904 
[4:09:00<20:00:21, 1.34s/it] 13%|█▎ | 7992/61904 [4:09:01<20:31:22, 1.37s/it] 13%|█▎ | 7993/61904 [4:09:02<19:56:28, 1.33s/it] 13%|█▎ | 7994/61904 [4:09:04<21:03:07, 1.41s/it] 13%|█▎ | 7995/61904 [4:09:05<20:29:43, 1.37s/it] 13%|█▎ | 7996/61904 [4:09:07<20:32:15, 1.37s/it] 13%|█▎ | 7997/61904 [4:09:08<20:05:58, 1.34s/it] 13%|█▎ | 7998/61904 [4:09:09<20:00:38, 1.34s/it] 13%|█▎ | 7999/61904 [4:09:11<20:35:23, 1.38s/it] 13%|█▎ | 8000/61904 [4:09:12<20:48:28, 1.39s/it] {'loss': 2.8477, 'learning_rate': 1.8735900427849085e-07, 'epoch': 2.07} 13%|█▎ | 8000/61904 [4:09:12<20:48:28, 1.39s/it] 13%|█▎ | 8001/61904 [4:09:14<21:16:00, 1.42s/it] 13%|█▎ | 8002/61904 [4:09:15<20:46:05, 1.39s/it] 13%|█▎ | 8003/61904 [4:09:16<20:34:22, 1.37s/it] 13%|█▎ | 8004/61904 [4:09:17<19:44:50, 1.32s/it] 13%|█▎ | 8005/61904 [4:09:19<20:39:29, 1.38s/it] 13%|█▎ | 8006/61904 [4:09:20<20:28:38, 1.37s/it] 13%|█▎ | 8007/61904 [4:09:22<20:11:15, 1.35s/it] 13%|█▎ | 8008/61904 [4:09:23<20:30:35, 1.37s/it] 13%|█▎ | 8009/61904 [4:09:24<20:26:09, 1.37s/it] 13%|█▎ | 8010/61904 [4:09:26<20:21:37, 1.36s/it] 13%|█▎ | 8011/61904 [4:09:27<20:38:00, 1.38s/it] 13%|█▎ | 8012/61904 [4:09:28<19:54:07, 1.33s/it] 13%|█▎ | 8013/61904 [4:09:30<20:20:11, 1.36s/it] 13%|█▎ | 8014/61904 [4:09:31<21:01:53, 1.40s/it] 13%|█▎ | 8015/61904 [4:09:33<20:55:21, 1.40s/it] 13%|█▎ | 8016/61904 [4:09:34<21:24:43, 1.43s/it] 13%|█▎ | 8017/61904 [4:09:35<20:31:32, 1.37s/it] 13%|█▎ | 8018/61904 [4:09:37<20:04:38, 1.34s/it] 13%|█▎ | 8019/61904 [4:09:38<19:55:51, 1.33s/it] 13%|█▎ | 8020/61904 [4:09:39<19:36:00, 1.31s/it] {'loss': 2.8336, 'learning_rate': 1.873265914689485e-07, 'epoch': 2.07} 13%|█▎ | 8020/61904 [4:09:39<19:36:00, 1.31s/it] 13%|█▎ | 8021/61904 [4:09:41<20:44:32, 1.39s/it] 13%|█▎ | 8022/61904 [4:09:42<19:59:11, 1.34s/it] 13%|█▎ | 8023/61904 [4:09:43<20:33:59, 1.37s/it] 13%|█▎ | 8024/61904 [4:09:45<20:45:40, 1.39s/it] 13%|█▎ | 8025/61904 [4:09:46<20:27:31, 1.37s/it] 13%|█▎ | 8026/61904 [4:09:48<20:10:19, 1.35s/it] 13%|█▎ | 
8027/61904 [4:09:49<21:15:43, 1.42s/it] 13%|█▎ | 8028/61904 [4:09:50<20:57:55, 1.40s/it] 13%|█▎ | 8029/61904 [4:09:52<20:24:52, 1.36s/it] 13%|█▎ | 8030/61904 [4:09:53<20:33:53, 1.37s/it] 13%|█▎ | 8031/61904 [4:09:54<19:53:12, 1.33s/it] 13%|█▎ | 8032/61904 [4:09:56<20:22:04, 1.36s/it] 13%|█▎ | 8033/61904 [4:09:57<19:54:21, 1.33s/it] 13%|█▎ | 8034/61904 [4:09:59<20:35:16, 1.38s/it] 13%|█▎ | 8035/61904 [4:10:00<20:41:00, 1.38s/it] 13%|█▎ | 8036/61904 [4:10:01<20:07:58, 1.35s/it] 13%|█▎ | 8037/61904 [4:10:02<19:32:44, 1.31s/it] 13%|█▎ | 8038/61904 [4:10:04<20:12:57, 1.35s/it] 13%|█▎ | 8039/61904 [4:10:05<19:48:30, 1.32s/it] 13%|█▎ | 8040/61904 [4:10:07<20:33:55, 1.37s/it] {'loss': 2.9005, 'learning_rate': 1.872941786594062e-07, 'epoch': 2.08} 13%|█▎ | 8040/61904 [4:10:07<20:33:55, 1.37s/it] 13%|█▎ | 8041/61904 [4:10:08<20:12:43, 1.35s/it] 13%|█▎ | 8042/61904 [4:10:09<20:18:35, 1.36s/it] 13%|█▎ | 8043/61904 [4:10:11<19:54:21, 1.33s/it] 13%|█▎ | 8044/61904 [4:10:12<20:07:42, 1.35s/it] 13%|█▎ | 8045/61904 [4:10:13<20:15:46, 1.35s/it] 13%|█▎ | 8046/61904 [4:10:15<20:53:10, 1.40s/it] 13%|█▎ | 8047/61904 [4:10:16<20:50:33, 1.39s/it] 13%|█▎ | 8048/61904 [4:10:18<20:34:52, 1.38s/it] 13%|█▎ | 8049/61904 [4:10:19<21:00:46, 1.40s/it] 13%|█▎ | 8050/61904 [4:10:20<21:04:40, 1.41s/it] 13%|█▎ | 8051/61904 [4:10:22<21:46:49, 1.46s/it] 13%|█▎ | 8052/61904 [4:10:23<21:53:35, 1.46s/it] 13%|█▎ | 8053/61904 [4:10:25<21:30:42, 1.44s/it] 13%|█▎ | 8054/61904 [4:10:26<22:26:25, 1.50s/it] 13%|█▎ | 8055/61904 [4:10:28<22:23:14, 1.50s/it] 13%|█▎ | 8056/61904 [4:10:29<21:33:36, 1.44s/it] 13%|█▎ | 8057/61904 [4:10:31<21:43:16, 1.45s/it] 13%|█▎ | 8058/61904 [4:10:32<20:40:15, 1.38s/it] 13%|█▎ | 8059/61904 [4:10:33<20:08:43, 1.35s/it] 13%|█▎ | 8060/61904 [4:10:35<20:27:47, 1.37s/it] {'loss': 2.9426, 'learning_rate': 1.8726176584986386e-07, 'epoch': 2.08} 13%|█▎ | 8060/61904 [4:10:35<20:27:47, 1.37s/it] 13%|█▎ | 8061/61904 [4:10:36<20:19:36, 1.36s/it] 13%|█▎ | 8062/61904 [4:10:37<20:13:28, 1.35s/it] 
13%|█▎ | 8063/61904 [4:10:39<20:15:23, 1.35s/it] 13%|█▎ | 8064/61904 [4:10:40<20:06:22, 1.34s/it] 13%|█▎ | 8065/61904 [4:10:41<19:51:14, 1.33s/it] 13%|█▎ | 8066/61904 [4:10:43<19:56:58, 1.33s/it] 13%|█▎ | 8067/61904 [4:10:44<20:06:41, 1.34s/it] 13%|█▎ | 8068/61904 [4:10:45<19:47:27, 1.32s/it] 13%|█▎ | 8069/61904 [4:10:47<20:20:08, 1.36s/it] 13%|█▎ | 8070/61904 [4:10:48<21:19:54, 1.43s/it] 13%|█▎ | 8071/61904 [4:10:50<21:18:39, 1.43s/it] 13%|█▎ | 8072/61904 [4:10:51<21:15:48, 1.42s/it] 13%|█▎ | 8073/61904 [4:10:53<21:30:43, 1.44s/it] 13%|█▎ | 8074/61904 [4:10:54<20:59:23, 1.40s/it] 13%|█▎ | 8075/61904 [4:10:56<21:46:53, 1.46s/it] 13%|█▎ | 8076/61904 [4:10:57<21:01:25, 1.41s/it] 13%|█▎ | 8077/61904 [4:10:58<21:23:49, 1.43s/it] 13%|█▎ | 8078/61904 [4:11:00<20:49:56, 1.39s/it] 13%|█▎ | 8079/61904 [4:11:01<20:08:54, 1.35s/it] 13%|█▎ | 8080/61904 [4:11:02<19:45:58, 1.32s/it] {'loss': 2.8823, 'learning_rate': 1.8722935304032152e-07, 'epoch': 2.09} 13%|█▎ | 8080/61904 [4:11:02<19:45:58, 1.32s/it] 13%|█▎ | 8081/61904 [4:11:04<19:57:16, 1.33s/it] 13%|█▎ | 8082/61904 [4:11:05<20:39:25, 1.38s/it] 13%|█▎ | 8083/61904 [4:11:06<20:47:22, 1.39s/it] 13%|█▎ | 8084/61904 [4:11:08<21:13:04, 1.42s/it] 13%|█▎ | 8085/61904 [4:11:09<20:21:29, 1.36s/it] 13%|█▎ | 8086/61904 [4:11:10<20:26:13, 1.37s/it] 13%|█▎ | 8087/61904 [4:11:12<20:51:21, 1.40s/it] 13%|█▎ | 8088/61904 [4:11:13<20:05:25, 1.34s/it] 13%|█▎ | 8089/61904 [4:11:15<20:37:04, 1.38s/it] 13%|█▎ | 8090/61904 [4:11:16<20:25:29, 1.37s/it] 13%|█▎ | 8091/61904 [4:11:17<20:10:49, 1.35s/it] 13%|█▎ | 8092/61904 [4:11:19<19:58:30, 1.34s/it] 13%|█▎ | 8093/61904 [4:11:20<20:23:28, 1.36s/it] 13%|█▎ | 8094/61904 [4:11:21<20:24:13, 1.37s/it] 13%|█▎ | 8095/61904 [4:11:23<20:06:59, 1.35s/it] 13%|█▎ | 8096/61904 [4:11:24<20:29:06, 1.37s/it] 13%|█▎ | 8097/61904 [4:11:26<20:32:42, 1.37s/it] 13%|█▎ | 8098/61904 [4:11:27<20:33:28, 1.38s/it] 13%|█▎ | 8099/61904 [4:11:28<19:54:47, 1.33s/it] 13%|█▎ | 8100/61904 [4:11:29<19:36:46, 1.31s/it] {'loss': 
2.8766, 'learning_rate': 1.871969402307792e-07, 'epoch': 2.09} 13%|█▎ | 8100/61904 [4:11:29<19:36:46, 1.31s/it] 13%|█▎ | 8101/61904 [4:11:31<19:32:59, 1.31s/it] 13%|█▎ | 8102/61904 [4:11:32<19:41:34, 1.32s/it] 13%|█▎ | 8103/61904 [4:11:33<19:45:47, 1.32s/it] 13%|█▎ | 8104/61904 [4:11:35<19:47:11, 1.32s/it] 13%|█▎ | 8105/61904 [4:11:36<20:14:22, 1.35s/it] 13%|█▎ | 8106/61904 [4:11:37<20:12:25, 1.35s/it] 13%|█▎ | 8107/61904 [4:11:39<20:03:53, 1.34s/it] 13%|█▎ | 8108/61904 [4:11:40<19:52:03, 1.33s/it] 13%|█▎ | 8109/61904 [4:11:41<19:52:35, 1.33s/it] 13%|█▎ | 8110/61904 [4:11:43<19:23:12, 1.30s/it] 13%|█▎ | 8111/61904 [4:11:44<19:57:02, 1.34s/it] 13%|█▎ | 8112/61904 [4:11:45<19:53:28, 1.33s/it] 13%|█▎ | 8113/61904 [4:11:47<20:27:34, 1.37s/it] 13%|█▎ | 8114/61904 [4:11:48<20:27:30, 1.37s/it] 13%|█▎ | 8115/61904 [4:11:50<20:56:59, 1.40s/it] 13%|█▎ | 8116/61904 [4:11:51<20:20:31, 1.36s/it] 13%|█▎ | 8117/61904 [4:11:52<20:04:33, 1.34s/it] 13%|█▎ | 8118/61904 [4:11:54<19:47:24, 1.32s/it] 13%|█▎ | 8119/61904 [4:11:55<19:59:07, 1.34s/it] 13%|█▎ | 8120/61904 [4:11:56<20:23:02, 1.36s/it] {'loss': 2.8903, 'learning_rate': 1.8716452742123684e-07, 'epoch': 2.1} 13%|█▎ | 8120/61904 [4:11:56<20:23:02, 1.36s/it] 13%|█▎ | 8121/61904 [4:11:58<20:11:10, 1.35s/it] 13%|█▎ | 8122/61904 [4:11:59<19:54:03, 1.33s/it] 13%|█▎ | 8123/61904 [4:12:00<19:57:58, 1.34s/it] 13%|█▎ | 8124/61904 [4:12:02<19:58:36, 1.34s/it] 13%|█▎ | 8125/61904 [4:12:03<20:01:23, 1.34s/it] 13%|█▎ | 8126/61904 [4:12:04<20:19:55, 1.36s/it] 13%|█▎ | 8127/61904 [4:12:06<19:50:26, 1.33s/it] 13%|█▎ | 8128/61904 [4:12:07<19:46:14, 1.32s/it] 13%|█▎ | 8129/61904 [4:12:08<19:38:42, 1.32s/it] 13%|█▎ | 8130/61904 [4:12:10<19:31:51, 1.31s/it] 13%|█▎ | 8131/61904 [4:12:11<19:47:12, 1.32s/it] 13%|█▎ | 8132/61904 [4:12:12<19:25:37, 1.30s/it] 13%|█▎ | 8133/61904 [4:12:13<19:41:08, 1.32s/it] 13%|█▎ | 8134/61904 [4:12:15<20:03:22, 1.34s/it] 13%|█▎ | 8135/61904 [4:12:16<19:41:20, 1.32s/it] 13%|█▎ | 8136/61904 [4:12:18<19:56:50, 1.34s/it] 
13%|█▎ | 8137/61904 [4:12:19<20:08:53, 1.35s/it] 13%|█▎ | 8138/61904 [4:12:20<20:16:19, 1.36s/it] 13%|█▎ | 8139/61904 [4:12:22<21:12:21, 1.42s/it] 13%|█▎ | 8140/61904 [4:12:23<21:31:27, 1.44s/it] {'loss': 2.9383, 'learning_rate': 1.8713211461169453e-07, 'epoch': 2.1} 13%|█▎ | 8140/61904 [4:12:23<21:31:27, 1.44s/it] 13%|█▎ | 8141/61904 [4:12:25<21:21:13, 1.43s/it] 13%|█▎ | 8142/61904 [4:12:26<20:52:12, 1.40s/it] 13%|█▎ | 8143/61904 [4:12:27<20:28:43, 1.37s/it] 13%|█▎ | 8144/61904 [4:12:29<21:11:22, 1.42s/it] 13%|█▎ | 8145/61904 [4:12:30<21:05:23, 1.41s/it] 13%|█▎ | 8146/61904 [4:12:32<20:50:35, 1.40s/it] 13%|█▎ | 8147/61904 [4:12:33<20:29:28, 1.37s/it] 13%|█▎ | 8148/61904 [4:12:34<20:09:09, 1.35s/it] 13%|█▎ | 8149/61904 [4:12:36<20:43:03, 1.39s/it] 13%|█▎ | 8150/61904 [4:12:37<20:58:31, 1.40s/it] 13%|█▎ | 8151/61904 [4:12:38<20:15:55, 1.36s/it] 13%|█▎ | 8152/61904 [4:12:40<20:37:08, 1.38s/it] 13%|█▎ | 8153/61904 [4:12:41<20:42:37, 1.39s/it] 13%|█▎ | 8154/61904 [4:12:43<21:25:55, 1.44s/it] 13%|█▎ | 8155/61904 [4:12:44<21:41:59, 1.45s/it] 13%|█▎ | 8156/61904 [4:12:46<22:02:21, 1.48s/it] 13%|█▎ | 8157/61904 [4:12:47<21:16:47, 1.43s/it] 13%|█▎ | 8158/61904 [4:12:49<21:00:36, 1.41s/it] 13%|█▎ | 8159/61904 [4:12:50<20:35:53, 1.38s/it] 13%|█▎ | 8160/61904 [4:12:51<19:56:01, 1.34s/it] {'loss': 2.8702, 'learning_rate': 1.870997018021522e-07, 'epoch': 2.11} 13%|█▎ | 8160/61904 [4:12:51<19:56:01, 1.34s/it] 13%|█▎ | 8161/61904 [4:12:52<19:37:50, 1.31s/it] 13%|█▎ | 8162/61904 [4:12:54<19:42:25, 1.32s/it] 13%|█▎ | 8163/61904 [4:12:55<20:25:54, 1.37s/it] 13%|█▎ | 8164/61904 [4:12:57<20:52:36, 1.40s/it] 13%|█▎ | 8165/61904 [4:12:58<20:47:20, 1.39s/it] 13%|█▎ | 8166/61904 [4:12:59<20:29:38, 1.37s/it] 13%|█▎ | 8167/61904 [4:13:01<19:39:35, 1.32s/it] 13%|█▎ | 8168/61904 [4:13:02<19:16:57, 1.29s/it] 13%|█▎ | 8169/61904 [4:13:03<19:23:50, 1.30s/it] 13%|█▎ | 8170/61904 [4:13:04<19:14:55, 1.29s/it] 13%|█▎ | 8171/61904 [4:13:06<19:19:52, 1.30s/it] 13%|█▎ | 8172/61904 [4:13:07<19:30:15, 
1.31s/it] 13%|█▎ | 8173/61904 [4:13:08<20:01:00, 1.34s/it] 13%|█▎ | 8174/61904 [4:13:10<20:02:18, 1.34s/it] 13%|█▎ | 8175/61904 [4:13:11<19:55:58, 1.34s/it] 13%|█▎ | 8176/61904 [4:13:12<19:36:19, 1.31s/it] 13%|█▎ | 8177/61904 [4:13:14<19:42:44, 1.32s/it] 13%|█▎ | 8178/61904 [4:13:15<20:06:40, 1.35s/it] 13%|█▎ | 8179/61904 [4:13:17<21:00:53, 1.41s/it] 13%|█▎ | 8180/61904 [4:13:18<20:04:43, 1.35s/it] {'loss': 2.8984, 'learning_rate': 1.8706728899260986e-07, 'epoch': 2.11} 13%|█▎ | 8180/61904 [4:13:18<20:04:43, 1.35s/it] 13%|█▎ | 8181/61904 [4:13:19<19:54:14, 1.33s/it] 13%|█▎ | 8182/61904 [4:13:21<20:24:21, 1.37s/it] 13%|█▎ | 8183/61904 [4:13:22<20:17:02, 1.36s/it] 13%|█▎ | 8184/61904 [4:13:23<19:48:28, 1.33s/it] 13%|█▎ | 8185/61904 [4:13:25<20:32:07, 1.38s/it] 13%|█▎ | 8186/61904 [4:13:26<20:37:46, 1.38s/it] 13%|█▎ | 8187/61904 [4:13:27<20:05:20, 1.35s/it] 13%|█▎ | 8188/61904 [4:13:29<19:52:59, 1.33s/it] 13%|█▎ | 8189/61904 [4:13:30<20:02:24, 1.34s/it] 13%|█▎ | 8190/61904 [4:13:32<20:59:06, 1.41s/it] 13%|█▎ | 8191/61904 [4:13:33<21:21:00, 1.43s/it] 13%|█▎ | 8192/61904 [4:13:34<20:50:17, 1.40s/it] 13%|█▎ | 8193/61904 [4:13:36<20:30:49, 1.37s/it] 13%|█▎ | 8194/61904 [4:13:37<20:04:25, 1.35s/it] 13%|█▎ | 8195/61904 [4:13:38<19:44:01, 1.32s/it] 13%|█▎ | 8196/61904 [4:13:40<20:04:31, 1.35s/it] 13%|█▎ | 8197/61904 [4:13:41<20:07:46, 1.35s/it] 13%|█▎ | 8198/61904 [4:13:42<20:11:35, 1.35s/it] 13%|█▎ | 8199/61904 [4:13:44<20:06:27, 1.35s/it] 13%|█▎ | 8200/61904 [4:13:45<20:31:45, 1.38s/it] {'loss': 2.934, 'learning_rate': 1.8703487618306754e-07, 'epoch': 2.12} 13%|█▎ | 8200/61904 [4:13:45<20:31:45, 1.38s/it] 13%|█▎ | 8201/61904 [4:13:47<20:44:12, 1.39s/it] 13%|█▎ | 8202/61904 [4:13:48<21:15:29, 1.43s/it] 13%|█▎ | 8203/61904 [4:13:49<20:24:41, 1.37s/it] 13%|█▎ | 8204/61904 [4:13:51<20:41:41, 1.39s/it] 13%|█▎ | 8205/61904 [4:13:52<20:38:09, 1.38s/it] 13%|█▎ | 8206/61904 [4:13:53<20:16:02, 1.36s/it] 13%|█▎ | 8207/61904 [4:13:55<20:04:46, 1.35s/it] 13%|█▎ | 8208/61904 
[4:13:56<19:46:50, 1.33s/it] 13%|█▎ | 8209/61904 [4:13:57<19:52:11, 1.33s/it] 13%|█▎ | 8210/61904 [4:13:59<20:49:26, 1.40s/it] 13%|█▎ | 8211/61904 [4:14:00<20:34:06, 1.38s/it] 13%|█▎ | 8212/61904 [4:14:02<20:10:58, 1.35s/it] 13%|█▎ | 8213/61904 [4:14:03<20:17:46, 1.36s/it] 13%|█▎ | 8214/61904 [4:14:04<20:33:53, 1.38s/it] 13%|█▎ | 8215/61904 [4:14:06<21:22:17, 1.43s/it] 13%|█▎ | 8216/61904 [4:14:07<21:05:48, 1.41s/it] 13%|█▎ | 8217/61904 [4:14:09<20:51:43, 1.40s/it] 13%|█▎ | 8218/61904 [4:14:10<20:57:45, 1.41s/it] 13%|█▎ | 8219/61904 [4:14:11<20:27:43, 1.37s/it] 13%|█▎ | 8220/61904 [4:14:13<19:55:11, 1.34s/it] {'loss': 2.8964, 'learning_rate': 1.870024633735252e-07, 'epoch': 2.12} 13%|█▎ | 8220/61904 [4:14:13<19:55:11, 1.34s/it] 13%|█▎ | 8221/61904 [4:14:14<20:01:06, 1.34s/it] 13%|█▎ | 8222/61904 [4:14:15<20:17:20, 1.36s/it] 13%|█▎ | 8223/61904 [4:14:17<20:12:03, 1.35s/it] 13%|█▎ | 8224/61904 [4:14:18<20:13:29, 1.36s/it] 13%|█▎ | 8225/61904 [4:14:19<20:44:23, 1.39s/it] 13%|█▎ | 8226/61904 [4:14:21<20:41:51, 1.39s/it] 13%|█▎ | 8227/61904 [4:14:22<20:29:37, 1.37s/it] 13%|█▎ | 8228/61904 [4:14:24<20:11:28, 1.35s/it] 13%|█▎ | 8229/61904 [4:14:25<20:49:40, 1.40s/it] 13%|█▎ | 8230/61904 [4:14:26<20:57:24, 1.41s/it] 13%|█▎ | 8231/61904 [4:14:28<20:47:22, 1.39s/it] 13%|█▎ | 8232/61904 [4:14:29<20:56:42, 1.40s/it] 13%|█▎ | 8233/61904 [4:14:31<20:43:11, 1.39s/it] 13%|█▎ | 8234/61904 [4:14:32<20:21:58, 1.37s/it] 13%|█▎ | 8235/61904 [4:14:33<20:20:29, 1.36s/it] 13%|█▎ | 8236/61904 [4:14:35<20:04:27, 1.35s/it] 13%|█▎ | 8237/61904 [4:14:36<20:20:36, 1.36s/it] 13%|█▎ | 8238/61904 [4:14:37<20:40:55, 1.39s/it] 13%|█▎ | 8239/61904 [4:14:39<20:28:50, 1.37s/it] 13%|█▎ | 8240/61904 [4:14:40<20:42:56, 1.39s/it] {'loss': 2.887, 'learning_rate': 1.8697005056398287e-07, 'epoch': 2.13} 13%|█▎ | 8240/61904 [4:14:40<20:42:56, 1.39s/it] 13%|█▎ | 8241/61904 [4:14:42<21:13:20, 1.42s/it] 13%|█▎ | 8242/61904 [4:14:43<20:57:59, 1.41s/it] 13%|█▎ | 8243/61904 [4:14:44<20:11:31, 1.35s/it] 13%|█▎ | 
8244/61904 [4:14:46<19:53:41, 1.33s/it] 13%|█▎ | 8245/61904 [4:14:47<19:59:46, 1.34s/it] 13%|█▎ | 8246/61904 [4:14:48<20:44:02, 1.39s/it] 13%|█▎ | 8247/61904 [4:14:50<20:40:36, 1.39s/it] 13%|█▎ | 8248/61904 [4:14:51<21:02:44, 1.41s/it] 13%|█▎ | 8249/61904 [4:14:53<21:19:49, 1.43s/it] 13%|█▎ | 8250/61904 [4:14:54<20:34:28, 1.38s/it] 13%|█▎ | 8251/61904 [4:14:56<21:17:21, 1.43s/it] 13%|█▎ | 8252/61904 [4:14:57<20:39:04, 1.39s/it] 13%|█▎ | 8253/61904 [4:14:58<20:44:52, 1.39s/it] 13%|█▎ | 8254/61904 [4:15:00<20:22:45, 1.37s/it] 13%|█▎ | 8255/61904 [4:15:01<21:01:11, 1.41s/it] 13%|█▎ | 8256/61904 [4:15:02<20:55:12, 1.40s/it] 13%|█▎ | 8257/61904 [4:15:04<20:15:31, 1.36s/it] 13%|█▎ | 8258/61904 [4:15:05<20:26:41, 1.37s/it] 13%|█▎ | 8259/61904 [4:15:07<20:31:17, 1.38s/it] 13%|█▎ | 8260/61904 [4:15:08<21:06:13, 1.42s/it] {'loss': 2.8915, 'learning_rate': 1.8693763775444056e-07, 'epoch': 2.13} 13%|█▎ | 8260/61904 [4:15:08<21:06:13, 1.42s/it] 13%|█▎ | 8261/61904 [4:15:09<20:43:09, 1.39s/it] 13%|█▎ | 8262/61904 [4:15:11<21:28:23, 1.44s/it] 13%|█▎ | 8263/61904 [4:15:12<20:50:49, 1.40s/it] 13%|█▎ | 8264/61904 [4:15:14<20:37:46, 1.38s/it] 13%|█▎ | 8265/61904 [4:15:15<20:33:51, 1.38s/it] 13%|█▎ | 8266/61904 [4:15:16<21:08:03, 1.42s/it] 13%|█▎ | 8267/61904 [4:15:18<20:59:42, 1.41s/it] 13%|█▎ | 8268/61904 [4:15:19<21:57:25, 1.47s/it] 13%|█▎ | 8269/61904 [4:15:21<21:49:28, 1.46s/it] 13%|█▎ | 8270/61904 [4:15:22<21:28:35, 1.44s/it] 13%|█▎ | 8271/61904 [4:15:24<21:29:13, 1.44s/it] 13%|█▎ | 8272/61904 [4:15:25<21:13:24, 1.42s/it] 13%|█▎ | 8273/61904 [4:15:26<20:52:05, 1.40s/it] 13%|█▎ | 8274/61904 [4:15:28<21:44:27, 1.46s/it] 13%|█▎ | 8275/61904 [4:15:29<21:20:32, 1.43s/it] 13%|█▎ | 8276/61904 [4:15:31<21:13:19, 1.42s/it] 13%|█▎ | 8277/61904 [4:15:32<21:07:08, 1.42s/it] 13%|█▎ | 8278/61904 [4:15:34<21:03:37, 1.41s/it] 13%|█▎ | 8279/61904 [4:15:35<20:34:22, 1.38s/it] 13%|█▎ | 8280/61904 [4:15:36<20:34:23, 1.38s/it] {'loss': 2.9201, 'learning_rate': 1.8690522494489822e-07, 'epoch': 2.14} 
13%|█▎ | 8280/61904 [4:15:36<20:34:23, 1.38s/it] 13%|█▎ | 8281/61904 [4:15:38<20:30:57, 1.38s/it] 13%|█▎ | 8282/61904 [4:15:39<20:37:06, 1.38s/it] 13%|█▎ | 8283/61904 [4:15:41<21:00:54, 1.41s/it] 13%|█▎ | 8284/61904 [4:15:42<20:42:57, 1.39s/it] 13%|█▎ | 8285/61904 [4:15:43<20:40:58, 1.39s/it] 13%|█▎ | 8286/61904 [4:15:45<20:03:55, 1.35s/it] 13%|█▎ | 8287/61904 [4:15:46<19:52:36, 1.33s/it] 13%|█▎ | 8288/61904 [4:15:47<19:50:36, 1.33s/it] 13%|█▎ | 8289/61904 [4:15:49<20:04:00, 1.35s/it] 13%|█▎ | 8290/61904 [4:15:50<19:49:22, 1.33s/it] 13%|█▎ | 8291/61904 [4:15:51<19:26:51, 1.31s/it] 13%|█▎ | 8292/61904 [4:15:53<19:54:16, 1.34s/it] 13%|█▎ | 8293/61904 [4:15:54<20:08:08, 1.35s/it] 13%|█▎ | 8294/61904 [4:15:55<20:32:36, 1.38s/it] 13%|█▎ | 8295/61904 [4:15:57<20:05:28, 1.35s/it] 13%|█▎ | 8296/61904 [4:15:58<20:21:00, 1.37s/it] 13%|█▎ | 8297/61904 [4:15:59<20:11:34, 1.36s/it] 13%|█▎ | 8298/61904 [4:16:01<21:02:13, 1.41s/it] 13%|█▎ | 8299/61904 [4:16:02<21:18:06, 1.43s/it] 13%|█▎ | 8300/61904 [4:16:04<20:42:02, 1.39s/it] {'loss': 2.8616, 'learning_rate': 1.8687281213535588e-07, 'epoch': 2.14} 13%|█▎ | 8300/61904 [4:16:04<20:42:02, 1.39s/it] 13%|█▎ | 8301/61904 [4:16:05<21:02:02, 1.41s/it] 13%|█▎ | 8302/61904 [4:16:07<20:48:53, 1.40s/it] 13%|█▎ | 8303/61904 [4:16:08<20:19:41, 1.37s/it] 13%|█▎ | 8304/61904 [4:16:09<21:05:59, 1.42s/it] 13%|█▎ | 8305/61904 [4:16:11<21:04:03, 1.42s/it] 13%|█▎ | 8306/61904 [4:16:12<20:46:10, 1.40s/it] 13%|█▎ | 8307/61904 [4:16:14<21:13:55, 1.43s/it] 13%|█▎ | 8308/61904 [4:16:15<20:10:25, 1.36s/it] 13%|█▎ | 8309/61904 [4:16:16<20:59:41, 1.41s/it] 13%|█▎ | 8310/61904 [4:16:18<21:11:56, 1.42s/it] 13%|█▎ | 8311/61904 [4:16:19<21:04:48, 1.42s/it] 13%|█▎ | 8312/61904 [4:16:20<20:35:08, 1.38s/it] 13%|█▎ | 8313/61904 [4:16:22<20:35:30, 1.38s/it] 13%|█▎ | 8314/61904 [4:16:23<20:21:11, 1.37s/it] 13%|█▎ | 8315/61904 [4:16:24<19:59:03, 1.34s/it] 13%|█▎ | 8316/61904 [4:16:26<20:17:31, 1.36s/it] 13%|█▎ | 8317/61904 [4:16:27<20:12:16, 1.36s/it] 13%|█▎ | 
8318/61904 [4:16:29<20:12:40, 1.36s/it] 13%|█▎ | 8319/61904 [4:16:30<19:35:59, 1.32s/it] 13%|█▎ | 8320/61904 [4:16:31<19:46:21, 1.33s/it] {'loss': 2.8828, 'learning_rate': 1.8684039932581354e-07, 'epoch': 2.15} 13%|█▎ | 8320/61904 [4:16:31<19:46:21, 1.33s/it] 13%|█▎ | 8321/61904 [4:16:32<19:41:38, 1.32s/it] 13%|█▎ | 8322/61904 [4:16:34<19:53:53, 1.34s/it] 13%|█▎ | 8323/61904 [4:16:35<20:28:05, 1.38s/it] 13%|█▎ | 8324/61904 [4:16:37<20:54:46, 1.41s/it] 13%|█▎ | 8325/61904 [4:16:38<21:06:31, 1.42s/it] 13%|█▎ | 8326/61904 [4:16:40<20:56:38, 1.41s/it] 13%|█▎ | 8327/61904 [4:16:41<21:08:35, 1.42s/it] 13%|█▎ | 8328/61904 [4:16:42<21:08:01, 1.42s/it] 13%|█▎ | 8329/61904 [4:16:44<21:01:58, 1.41s/it] 13%|█▎ | 8330/61904 [4:16:45<20:22:16, 1.37s/it] 13%|█▎ | 8331/61904 [4:16:46<20:10:01, 1.36s/it] 13%|█▎ | 8332/61904 [4:16:48<20:16:51, 1.36s/it] 13%|█▎ | 8333/61904 [4:16:49<20:27:38, 1.37s/it] 13%|█▎ | 8334/61904 [4:16:50<19:46:43, 1.33s/it] 13%|█▎ | 8335/61904 [4:16:52<19:18:14, 1.30s/it] 13%|█▎ | 8336/61904 [4:16:53<19:48:07, 1.33s/it] 13%|█▎ | 8337/61904 [4:16:54<19:33:43, 1.31s/it] 13%|█▎ | 8338/61904 [4:16:56<19:29:30, 1.31s/it] 13%|█▎ | 8339/61904 [4:16:57<19:57:22, 1.34s/it] 13%|█▎ | 8340/61904 [4:16:58<19:51:35, 1.33s/it] {'loss': 2.8614, 'learning_rate': 1.868079865162712e-07, 'epoch': 2.16} 13%|█▎ | 8340/61904 [4:16:58<19:51:35, 1.33s/it] 13%|█▎ | 8341/61904 [4:17:00<19:37:51, 1.32s/it] 13%|█▎ | 8342/61904 [4:17:02<21:47:13, 1.46s/it] 13%|█▎ | 8343/61904 [4:17:03<22:14:03, 1.49s/it] 13%|█▎ | 8344/61904 [4:17:04<21:22:32, 1.44s/it] 13%|█▎ | 8345/61904 [4:17:06<21:26:00, 1.44s/it] 13%|█▎ | 8346/61904 [4:17:07<21:19:31, 1.43s/it] 13%|█▎ | 8347/61904 [4:17:09<20:58:07, 1.41s/it] 13%|█▎ | 8348/61904 [4:17:10<20:00:35, 1.35s/it] 13%|█▎ | 8349/61904 [4:17:11<20:50:44, 1.40s/it] 13%|█▎ | 8350/61904 [4:17:13<20:56:38, 1.41s/it] 13%|█▎ | 8351/61904 [4:17:14<21:31:11, 1.45s/it] 13%|█▎ | 8352/61904 [4:17:16<21:15:26, 1.43s/it] 13%|█▎ | 8353/61904 [4:17:17<21:33:29, 1.45s/it] 
13%|█▎ | 8354/61904 [4:17:19<21:06:49, 1.42s/it] 13%|█▎ | 8355/61904 [4:17:20<20:52:41, 1.40s/it] 13%|█▎ | 8356/61904 [4:17:21<20:14:21, 1.36s/it] 13%|█▎ | 8357/61904 [4:17:22<19:59:09, 1.34s/it] 14%|█▎ | 8358/61904 [4:17:24<19:47:06, 1.33s/it] 14%|█▎ | 8359/61904 [4:17:25<19:45:20, 1.33s/it] 14%|█▎ | 8360/61904 [4:17:26<19:16:51, 1.30s/it] {'loss': 2.8295, 'learning_rate': 1.867755737067289e-07, 'epoch': 2.16} 14%|█▎ | 8360/61904 [4:17:26<19:16:51, 1.30s/it] 14%|█▎ | 8361/61904 [4:17:28<19:55:24, 1.34s/it] 14%|█▎ | 8362/61904 [4:17:29<20:17:28, 1.36s/it] 14%|█▎ | 8363/61904 [4:17:30<19:47:02, 1.33s/it] 14%|█▎ | 8364/61904 [4:17:32<20:15:57, 1.36s/it] 14%|█▎ | 8365/61904 [4:17:33<20:28:01, 1.38s/it] 14%|█▎ | 8366/61904 [4:17:35<21:07:20, 1.42s/it] 14%|█▎ | 8367/61904 [4:17:36<21:04:24, 1.42s/it] 14%|█▎ | 8368/61904 [4:17:38<20:50:37, 1.40s/it] 14%|█▎ | 8369/61904 [4:17:39<20:47:55, 1.40s/it] 14%|█▎ | 8370/61904 [4:17:40<20:25:28, 1.37s/it] 14%|█▎ | 8371/61904 [4:17:41<19:48:18, 1.33s/it] 14%|█▎ | 8372/61904 [4:17:43<21:01:25, 1.41s/it] 14%|█▎ | 8373/61904 [4:17:44<20:07:21, 1.35s/it] 14%|█▎ | 8374/61904 [4:17:46<19:36:50, 1.32s/it] 14%|█▎ | 8375/61904 [4:17:47<20:31:10, 1.38s/it] 14%|█▎ | 8376/61904 [4:17:48<20:19:16, 1.37s/it] 14%|█▎ | 8377/61904 [4:17:50<20:39:46, 1.39s/it] 14%|█▎ | 8378/61904 [4:17:51<20:51:45, 1.40s/it] 14%|█▎ | 8379/61904 [4:17:53<21:49:46, 1.47s/it] 14%|█▎ | 8380/61904 [4:17:54<21:55:43, 1.47s/it] {'loss': 2.8375, 'learning_rate': 1.8674316089718655e-07, 'epoch': 2.17} 14%|█▎ | 8380/61904 [4:17:54<21:55:43, 1.47s/it] 14%|█▎ | 8381/61904 [4:17:56<21:41:50, 1.46s/it] 14%|█▎ | 8382/61904 [4:17:57<21:02:19, 1.42s/it] 14%|█▎ | 8383/61904 [4:17:58<20:42:18, 1.39s/it] 14%|█▎ | 8384/61904 [4:18:00<20:11:55, 1.36s/it] 14%|█▎ | 8385/61904 [4:18:01<20:15:59, 1.36s/it] 14%|█▎ | 8386/61904 [4:18:02<20:05:34, 1.35s/it] 14%|█▎ | 8387/61904 [4:18:04<20:13:39, 1.36s/it] 14%|█▎ | 8388/61904 [4:18:05<20:46:05, 1.40s/it] 14%|█▎ | 8389/61904 [4:18:07<20:27:36, 
1.38s/it] 14%|█▎ | 8390/61904 [4:18:08<19:58:55, 1.34s/it] 14%|█▎ | 8391/61904 [4:18:09<19:53:01, 1.34s/it] 14%|█▎ | 8392/61904 [4:18:11<19:47:47, 1.33s/it] 14%|█▎ | 8393/61904 [4:18:12<19:30:10, 1.31s/it] 14%|█▎ | 8394/61904 [4:18:13<20:10:18, 1.36s/it] 14%|█▎ | 8395/61904 [4:18:15<20:09:13, 1.36s/it] 14%|█▎ | 8396/61904 [4:18:16<19:58:41, 1.34s/it] 14%|█▎ | 8397/61904 [4:18:17<19:47:33, 1.33s/it] 14%|█▎ | 8398/61904 [4:18:19<21:02:58, 1.42s/it] 14%|█▎ | 8399/61904 [4:18:20<21:41:14, 1.46s/it] 14%|█▎ | 8400/61904 [4:18:22<20:28:52, 1.38s/it] {'loss': 2.844, 'learning_rate': 1.8671074808764422e-07, 'epoch': 2.17} 14%|█▎ | 8400/61904 [4:18:22<20:28:52, 1.38s/it] 14%|█▎ | 8401/61904 [4:18:23<19:56:54, 1.34s/it] 14%|█▎ | 8402/61904 [4:18:24<19:44:42, 1.33s/it] 14%|█▎ | 8403/61904 [4:18:25<19:29:14, 1.31s/it] 14%|█▎ | 8404/61904 [4:18:27<19:24:14, 1.31s/it] 14%|█▎ | 8405/61904 [4:18:28<19:37:46, 1.32s/it] 14%|█▎ | 8406/61904 [4:18:29<19:17:29, 1.30s/it] 14%|█▎ | 8407/61904 [4:18:31<20:21:17, 1.37s/it] 14%|█▎ | 8408/61904 [4:18:32<20:15:46, 1.36s/it] 14%|█▎ | 8409/61904 [4:18:34<19:56:46, 1.34s/it] 14%|█▎ | 8410/61904 [4:18:35<19:44:37, 1.33s/it] 14%|█▎ | 8411/61904 [4:18:36<20:07:31, 1.35s/it] 14%|█▎ | 8412/61904 [4:18:38<20:21:38, 1.37s/it] 14%|█▎ | 8413/61904 [4:18:39<20:16:13, 1.36s/it] 14%|█▎ | 8414/61904 [4:18:40<20:01:37, 1.35s/it] 14%|█▎ | 8415/61904 [4:18:42<20:32:38, 1.38s/it] 14%|█▎ | 8416/61904 [4:18:43<20:47:22, 1.40s/it] 14%|█▎ | 8417/61904 [4:18:44<20:02:53, 1.35s/it] 14%|█▎ | 8418/61904 [4:18:46<19:39:22, 1.32s/it] 14%|█▎ | 8419/61904 [4:18:47<20:36:42, 1.39s/it] 14%|█▎ | 8420/61904 [4:18:49<20:40:19, 1.39s/it] {'loss': 2.8151, 'learning_rate': 1.866783352781019e-07, 'epoch': 2.18} 14%|█▎ | 8420/61904 [4:18:49<20:40:19, 1.39s/it] 14%|█▎ | 8421/61904 [4:18:50<21:07:53, 1.42s/it] 14%|█▎ | 8422/61904 [4:18:51<20:19:39, 1.37s/it] 14%|█▎ | 8423/61904 [4:18:53<20:25:18, 1.37s/it] 14%|█▎ | 8424/61904 [4:18:54<19:56:22, 1.34s/it] 14%|█▎ | 8425/61904 
[4:18:56<20:41:13, 1.39s/it] 14%|█▎ | 8426/61904 [4:18:57<20:11:02, 1.36s/it] 14%|█▎ | 8427/61904 [4:18:58<20:49:40, 1.40s/it] 14%|█▎ | 8428/61904 [4:19:00<20:04:17, 1.35s/it] 14%|█▎ | 8429/61904 [4:19:01<19:43:12, 1.33s/it] 14%|█▎ | 8430/61904 [4:19:02<20:50:42, 1.40s/it] 14%|█▎ | 8431/61904 [4:19:04<20:31:42, 1.38s/it] 14%|█▎ | 8432/61904 [4:19:05<19:59:17, 1.35s/it] 14%|█▎ | 8433/61904 [4:19:06<20:24:13, 1.37s/it] 14%|█▎ | 8434/61904 [4:19:08<20:38:59, 1.39s/it] 14%|█▎ | 8435/61904 [4:19:09<21:19:38, 1.44s/it] 14%|█▎ | 8436/61904 [4:19:11<21:01:30, 1.42s/it] 14%|█▎ | 8437/61904 [4:19:12<20:08:05, 1.36s/it] 14%|█▎ | 8438/61904 [4:19:13<20:27:56, 1.38s/it] 14%|█▎ | 8439/61904 [4:19:15<20:22:39, 1.37s/it] 14%|█▎ | 8440/61904 [4:19:16<19:59:10, 1.35s/it] {'loss': 2.8761, 'learning_rate': 1.8664592246855957e-07, 'epoch': 2.18} 14%|█▎ | 8440/61904 [4:19:16<19:59:10, 1.35s/it] 14%|█▎ | 8441/61904 [4:19:18<20:29:26, 1.38s/it] 14%|█▎ | 8442/61904 [4:19:19<20:25:36, 1.38s/it] 14%|█▎ | 8443/61904 [4:19:20<20:27:33, 1.38s/it] 14%|█▎ | 8444/61904 [4:19:22<20:26:33, 1.38s/it] 14%|█▎ | 8445/61904 [4:19:23<20:18:54, 1.37s/it] 14%|█▎ | 8446/61904 [4:19:24<20:01:44, 1.35s/it] 14%|█▎ | 8447/61904 [4:19:25<19:24:25, 1.31s/it] 14%|█▎ | 8448/61904 [4:19:27<19:21:38, 1.30s/it] 14%|█▎ | 8449/61904 [4:19:28<19:07:55, 1.29s/it] 14%|█▎ | 8450/61904 [4:19:29<19:21:56, 1.30s/it] 14%|█▎ | 8451/61904 [4:19:31<19:35:34, 1.32s/it] 14%|█▎ | 8452/61904 [4:19:32<19:38:20, 1.32s/it] 14%|█▎ | 8453/61904 [4:19:34<20:25:25, 1.38s/it] 14%|█▎ | 8454/61904 [4:19:35<21:04:03, 1.42s/it] 14%|█▎ | 8455/61904 [4:19:36<20:33:33, 1.38s/it] 14%|█▎ | 8456/61904 [4:19:38<20:29:41, 1.38s/it] 14%|█▎ | 8457/61904 [4:19:39<19:41:53, 1.33s/it] 14%|█▎ | 8458/61904 [4:19:40<20:02:37, 1.35s/it] 14%|█▎ | 8459/61904 [4:19:42<20:18:04, 1.37s/it] 14%|█▎ | 8460/61904 [4:19:43<20:26:33, 1.38s/it] {'loss': 2.9129, 'learning_rate': 1.8661350965901723e-07, 'epoch': 2.19} 14%|█▎ | 8460/61904 [4:19:43<20:26:33, 1.38s/it] 14%|█▎ | 
8461/61904 [4:19:44<19:51:21, 1.34s/it] 14%|█▎ | 8462/61904 [4:19:46<19:46:02, 1.33s/it] 14%|█▎ | 8463/61904 [4:19:47<20:21:21, 1.37s/it] 14%|█▎ | 8464/61904 [4:19:48<19:42:26, 1.33s/it] 14%|█▎ | 8465/61904 [4:19:50<20:01:30, 1.35s/it] 14%|█▎ | 8466/61904 [4:19:51<20:21:41, 1.37s/it] 14%|█▎ | 8467/61904 [4:19:53<20:33:51, 1.39s/it] 14%|█▎ | 8468/61904 [4:19:54<20:09:31, 1.36s/it] 14%|█▎ | 8469/61904 [4:19:55<20:13:51, 1.36s/it] 14%|█▎ | 8470/61904 [4:19:57<19:56:09, 1.34s/it] 14%|█▎ | 8471/61904 [4:19:58<20:54:11, 1.41s/it] 14%|█▎ | 8472/61904 [4:20:00<21:02:54, 1.42s/it] 14%|█▎ | 8473/61904 [4:20:01<20:08:54, 1.36s/it] 14%|█▎ | 8474/61904 [4:20:02<20:02:22, 1.35s/it] 14%|█▎ | 8475/61904 [4:20:04<20:13:33, 1.36s/it] 14%|█▎ | 8476/61904 [4:20:05<20:09:38, 1.36s/it] 14%|█▎ | 8477/61904 [4:20:06<19:57:11, 1.34s/it] 14%|█▎ | 8478/61904 [4:20:08<19:53:15, 1.34s/it] 14%|█▎ | 8479/61904 [4:20:09<19:53:55, 1.34s/it] 14%|█▎ | 8480/61904 [4:20:10<19:48:00, 1.33s/it] {'loss': 2.8907, 'learning_rate': 1.8658109684947492e-07, 'epoch': 2.19} 14%|█▎ | 8480/61904 [4:20:10<19:48:00, 1.33s/it] 14%|█▎ | 8481/61904 [4:20:12<20:42:28, 1.40s/it] 14%|█▎ | 8482/61904 [4:20:13<20:41:54, 1.39s/it] 14%|█▎ | 8483/61904 [4:20:15<20:31:08, 1.38s/it] 14%|█▎ | 8484/61904 [4:20:16<19:50:38, 1.34s/it] 14%|█▎ | 8485/61904 [4:20:17<19:37:27, 1.32s/it] 14%|█▎ | 8486/61904 [4:20:18<19:16:39, 1.30s/it] 14%|█▎ | 8487/61904 [4:20:20<19:55:06, 1.34s/it] 14%|█▎ | 8488/61904 [4:20:21<19:21:42, 1.30s/it] 14%|█▎ | 8489/61904 [4:20:22<19:41:20, 1.33s/it] 14%|█▎ | 8490/61904 [4:20:24<19:49:52, 1.34s/it] 14%|█▎ | 8491/61904 [4:20:25<19:59:54, 1.35s/it] 14%|█▎ | 8492/61904 [4:20:26<19:45:18, 1.33s/it] 14%|█▎ | 8493/61904 [4:20:28<20:15:45, 1.37s/it] 14%|█▎ | 8494/61904 [4:20:29<20:29:27, 1.38s/it] 14%|█▎ | 8495/61904 [4:20:31<20:22:09, 1.37s/it] 14%|█▎ | 8496/61904 [4:20:32<21:08:34, 1.43s/it] 14%|█▎ | 8497/61904 [4:20:33<20:39:51, 1.39s/it] 14%|█▎ | 8498/61904 [4:20:35<20:56:05, 1.41s/it] 14%|█▎ | 8499/61904 
[4:20:36<21:22:05, 1.44s/it] 14%|█▎ | 8500/61904 [4:20:38<20:33:06, 1.39s/it] {'loss': 2.9062, 'learning_rate': 1.8654868403993255e-07, 'epoch': 2.2} 14%|█▎ | 8500/61904 [4:20:38<20:33:06, 1.39s/it] 14%|█▎ | 8501/61904 [4:20:39<19:41:54, 1.33s/it] 14%|█▎ | 8502/61904 [4:20:40<19:44:45, 1.33s/it] 14%|█▎ | 8503/61904 [4:20:42<20:19:15, 1.37s/it] 14%|█▎ | 8504/61904 [4:20:43<20:05:30, 1.35s/it] 14%|█▎ | 8505/61904 [4:20:44<20:03:11, 1.35s/it] 14%|█▎ | 8506/61904 [4:20:46<20:14:36, 1.36s/it] 14%|█▎ | 8507/61904 [4:20:47<19:49:44, 1.34s/it] 14%|█▎ | 8508/61904 [4:20:48<19:47:51, 1.33s/it] 14%|█▎ | 8509/61904 [4:20:50<19:36:16, 1.32s/it] 14%|█▎ | 8510/61904 [4:20:51<20:00:48, 1.35s/it] 14%|█▎ | 8511/61904 [4:20:52<19:47:10, 1.33s/it] 14%|█▍ | 8512/61904 [4:20:54<19:20:03, 1.30s/it] 14%|█▍ | 8513/61904 [4:20:55<19:27:05, 1.31s/it] 14%|█▍ | 8514/61904 [4:20:56<19:43:58, 1.33s/it] 14%|█▍ | 8515/61904 [4:20:58<19:45:30, 1.33s/it] 14%|█▍ | 8516/61904 [4:20:59<19:44:44, 1.33s/it] 14%|█▍ | 8517/61904 [4:21:00<20:16:37, 1.37s/it] 14%|█▍ | 8518/61904 [4:21:02<20:29:15, 1.38s/it] 14%|█▍ | 8519/61904 [4:21:03<20:00:00, 1.35s/it] 14%|█▍ | 8520/61904 [4:21:04<19:48:13, 1.34s/it] {'loss': 2.9185, 'learning_rate': 1.8651627123039024e-07, 'epoch': 2.2} 14%|█▍ | 8520/61904 [4:21:04<19:48:13, 1.34s/it] 14%|█▍ | 8521/61904 [4:21:06<20:29:26, 1.38s/it] 14%|█▍ | 8522/61904 [4:21:07<19:59:14, 1.35s/it] 14%|█▍ | 8523/61904 [4:21:08<19:58:58, 1.35s/it] 14%|█▍ | 8524/61904 [4:21:10<19:57:29, 1.35s/it] 14%|█▍ | 8525/61904 [4:21:11<19:11:02, 1.29s/it] 14%|█▍ | 8526/61904 [4:21:12<19:09:53, 1.29s/it] 14%|█▍ | 8527/61904 [4:21:14<19:44:46, 1.33s/it] 14%|█▍ | 8528/61904 [4:21:15<19:28:18, 1.31s/it] 14%|█▍ | 8529/61904 [4:21:16<19:40:37, 1.33s/it] 14%|█▍ | 8530/61904 [4:21:18<19:24:53, 1.31s/it] 14%|█▍ | 8531/61904 [4:21:19<20:27:53, 1.38s/it] 14%|█▍ | 8532/61904 [4:21:20<20:22:01, 1.37s/it] 14%|█▍ | 8533/61904 [4:21:22<20:11:28, 1.36s/it] 14%|█▍ | 8534/61904 [4:21:23<20:07:06, 1.36s/it] 14%|█▍ | 
8535/61904 [4:21:24<20:00:05, 1.35s/it] 14%|█▍ | 8536/61904 [4:21:26<20:02:55, 1.35s/it] 14%|█▍ | 8537/61904 [4:21:27<19:56:14, 1.34s/it] 14%|█▍ | 8538/61904 [4:21:28<19:41:03, 1.33s/it] 14%|█▍ | 8539/61904 [4:21:30<19:16:53, 1.30s/it] 14%|█▍ | 8540/61904 [4:21:31<19:06:12, 1.29s/it] {'loss': 2.9334, 'learning_rate': 1.864838584208479e-07, 'epoch': 2.21} 14%|█▍ | 8540/61904 [4:21:31<19:06:12, 1.29s/it] 14%|█▍ | 8541/61904 [4:21:32<19:37:58, 1.32s/it] 14%|█▍ | 8542/61904 [4:21:34<20:27:47, 1.38s/it] 14%|█▍ | 8543/61904 [4:21:35<21:13:28, 1.43s/it] 14%|█▍ | 8544/61904 [4:21:37<20:54:42, 1.41s/it] 14%|█▍ | 8545/61904 [4:21:38<20:56:26, 1.41s/it] 14%|█▍ | 8546/61904 [4:21:40<21:02:26, 1.42s/it] 14%|█▍ | 8547/61904 [4:21:41<21:11:43, 1.43s/it] 14%|█▍ | 8548/61904 [4:21:43<21:04:22, 1.42s/it] 14%|█▍ | 8549/61904 [4:21:44<22:23:51, 1.51s/it] 14%|█▍ | 8550/61904 [4:21:46<21:24:50, 1.44s/it] 14%|█▍ | 8551/61904 [4:21:47<20:38:09, 1.39s/it] 14%|█▍ | 8552/61904 [4:21:48<19:55:00, 1.34s/it] 14%|█▍ | 8553/61904 [4:21:49<20:20:57, 1.37s/it] 14%|█▍ | 8554/61904 [4:21:51<20:14:21, 1.37s/it] 14%|█▍ | 8555/61904 [4:21:52<20:31:38, 1.39s/it] 14%|█▍ | 8556/61904 [4:21:54<21:05:17, 1.42s/it] 14%|█▍ | 8557/61904 [4:21:55<20:18:52, 1.37s/it] 14%|█▍ | 8558/61904 [4:21:56<20:23:20, 1.38s/it] 14%|█▍ | 8559/61904 [4:21:58<20:12:32, 1.36s/it] 14%|█▍ | 8560/61904 [4:21:59<19:42:54, 1.33s/it] {'loss': 2.8598, 'learning_rate': 1.8645144561130556e-07, 'epoch': 2.21} 14%|█▍ | 8560/61904 [4:21:59<19:42:54, 1.33s/it] 14%|█▍ | 8561/61904 [4:22:00<19:59:49, 1.35s/it] 14%|█▍ | 8562/61904 [4:22:02<19:44:06, 1.33s/it] 14%|█▍ | 8563/61904 [4:22:03<19:58:23, 1.35s/it] 14%|█▍ | 8564/61904 [4:22:04<19:54:00, 1.34s/it] 14%|█▍ | 8565/61904 [4:22:06<19:45:04, 1.33s/it] 14%|█▍ | 8566/61904 [4:22:07<19:47:01, 1.34s/it] 14%|█▍ | 8567/61904 [4:22:08<19:55:16, 1.34s/it] 14%|█▍ | 8568/61904 [4:22:10<19:57:30, 1.35s/it] 14%|█▍ | 8569/61904 [4:22:11<19:46:47, 1.34s/it] 14%|█▍ | 8570/61904 [4:22:12<19:54:10, 1.34s/it] 
14%|█▍ | 8571/61904 [4:22:14<20:06:20, 1.36s/it] 14%|█▍ | 8572/61904 [4:22:15<20:37:07, 1.39s/it] 14%|█▍ | 8573/61904 [4:22:17<20:14:25, 1.37s/it] 14%|█▍ | 8574/61904 [4:22:18<20:18:06, 1.37s/it] 14%|█▍ | 8575/61904 [4:22:19<19:51:50, 1.34s/it] 14%|█▍ | 8576/61904 [4:22:21<19:52:33, 1.34s/it] 14%|█▍ | 8577/61904 [4:22:22<20:22:17, 1.38s/it] 14%|█▍ | 8578/61904 [4:22:23<20:27:24, 1.38s/it] 14%|█▍ | 8579/61904 [4:22:25<21:11:05, 1.43s/it] 14%|█▍ | 8580/61904 [4:22:26<21:08:34, 1.43s/it] {'loss': 2.8341, 'learning_rate': 1.8641903280176325e-07, 'epoch': 2.22} 14%|█▍ | 8580/61904 [4:22:26<21:08:34, 1.43s/it] 14%|█▍ | 8581/61904 [4:22:28<22:01:18, 1.49s/it] 14%|█▍ | 8582/61904 [4:22:29<21:46:18, 1.47s/it] 14%|█▍ | 8583/61904 [4:22:31<21:25:22, 1.45s/it] 14%|█▍ | 8584/61904 [4:22:32<21:02:30, 1.42s/it] 14%|█▍ | 8585/61904 [4:22:34<20:41:15, 1.40s/it] 14%|█▍ | 8586/61904 [4:22:35<20:33:36, 1.39s/it] 14%|█▍ | 8587/61904 [4:22:36<20:28:29, 1.38s/it] 14%|█▍ | 8588/61904 [4:22:38<20:32:12, 1.39s/it] 14%|█▍ | 8589/61904 [4:22:39<20:07:47, 1.36s/it] 14%|█▍ | 8590/61904 [4:22:40<20:16:22, 1.37s/it] 14%|█▍ | 8591/61904 [4:22:42<20:40:18, 1.40s/it] 14%|█▍ | 8592/61904 [4:22:43<20:24:36, 1.38s/it] 14%|█▍ | 8593/61904 [4:22:44<20:12:46, 1.36s/it] 14%|█▍ | 8594/61904 [4:22:46<20:03:23, 1.35s/it] 14%|█▍ | 8595/61904 [4:22:47<19:41:58, 1.33s/it] 14%|█▍ | 8596/61904 [4:22:49<20:26:09, 1.38s/it] 14%|█▍ | 8597/61904 [4:22:50<19:59:04, 1.35s/it] 14%|█▍ | 8598/61904 [4:22:51<20:08:08, 1.36s/it] 14%|█▍ | 8599/61904 [4:22:53<21:14:26, 1.43s/it] 14%|█▍ | 8600/61904 [4:22:54<20:21:04, 1.37s/it] {'loss': 2.9006, 'learning_rate': 1.8638661999222091e-07, 'epoch': 2.22} 14%|█▍ | 8600/61904 [4:22:54<20:21:04, 1.37s/it] 14%|█▍ | 8601/61904 [4:22:55<20:13:12, 1.37s/it] 14%|█▍ | 8602/61904 [4:22:57<20:30:07, 1.38s/it] 14%|█▍ | 8603/61904 [4:22:58<20:24:11, 1.38s/it] 14%|█▍ | 8604/61904 [4:22:59<19:36:01, 1.32s/it] 14%|█▍ | 8605/61904 [4:23:01<19:50:25, 1.34s/it] 14%|█▍ | 8606/61904 [4:23:02<19:38:21, 
1.33s/it] 14%|█▍ | 8607/61904 [4:23:04<20:20:56, 1.37s/it] 14%|█▍ | 8608/61904 [4:23:05<20:04:05, 1.36s/it] 14%|█▍ | 8609/61904 [4:23:06<19:55:10, 1.35s/it] 14%|█▍ | 8610/61904 [4:23:08<20:07:54, 1.36s/it] 14%|█▍ | 8611/61904 [4:23:09<19:40:52, 1.33s/it] 14%|█▍ | 8612/61904 [4:23:10<19:29:22, 1.32s/it] 14%|█▍ | 8613/61904 [4:23:12<19:35:13, 1.32s/it] 14%|█▍ | 8614/61904 [4:23:13<20:10:04, 1.36s/it] 14%|█▍ | 8615/61904 [4:23:14<19:51:46, 1.34s/it] 14%|█▍ | 8616/61904 [4:23:16<19:30:17, 1.32s/it] 14%|█▍ | 8617/61904 [4:23:17<19:12:42, 1.30s/it] 14%|█▍ | 8618/61904 [4:23:18<20:03:13, 1.35s/it] 14%|█▍ | 8619/61904 [4:23:20<19:42:14, 1.33s/it] 14%|█▍ | 8620/61904 [4:23:21<19:27:15, 1.31s/it] {'loss': 2.8851, 'learning_rate': 1.8635420718267858e-07, 'epoch': 2.23} 14%|█▍ | 8620/61904 [4:23:21<19:27:15, 1.31s/it] 14%|█▍ | 8621/61904 [4:23:22<19:10:55, 1.30s/it] 14%|█▍ | 8622/61904 [4:23:23<19:32:48, 1.32s/it] 14%|█▍ | 8623/61904 [4:23:25<19:34:14, 1.32s/it] 14%|█▍ | 8624/61904 [4:23:26<20:00:05, 1.35s/it] 14%|█▍ | 8625/61904 [4:23:28<19:52:34, 1.34s/it] 14%|█▍ | 8626/61904 [4:23:29<19:48:14, 1.34s/it] 14%|█▍ | 8627/61904 [4:23:30<19:44:40, 1.33s/it] 14%|█▍ | 8628/61904 [4:23:32<19:58:55, 1.35s/it] 14%|█▍ | 8629/61904 [4:23:33<20:24:50, 1.38s/it] 14%|█▍ | 8630/61904 [4:23:34<19:53:56, 1.34s/it] 14%|█▍ | 8631/61904 [4:23:36<19:38:00, 1.33s/it] 14%|█▍ | 8632/61904 [4:23:37<19:16:29, 1.30s/it] 14%|█▍ | 8633/61904 [4:23:38<19:56:40, 1.35s/it] 14%|█▍ | 8634/61904 [4:23:40<20:42:27, 1.40s/it] 14%|█▍ | 8635/61904 [4:23:41<20:30:31, 1.39s/it] 14%|█▍ | 8636/61904 [4:23:42<20:11:48, 1.36s/it] 14%|█▍ | 8637/61904 [4:23:44<20:31:40, 1.39s/it] 14%|█▍ | 8638/61904 [4:23:45<21:04:23, 1.42s/it] 14%|█▍ | 8639/61904 [4:23:47<21:03:35, 1.42s/it] 14%|█▍ | 8640/61904 [4:23:48<21:13:19, 1.43s/it] {'loss': 2.8807, 'learning_rate': 1.8632179437313626e-07, 'epoch': 2.23} 14%|█▍ | 8640/61904 [4:23:48<21:13:19, 1.43s/it] 14%|█▍ | 8641/61904 [4:23:50<20:59:51, 1.42s/it] 14%|█▍ | 8642/61904 
[4:23:51<20:36:49, 1.39s/it] 14%|█▍ | 8643/61904 [4:23:52<20:24:02, 1.38s/it] 14%|█▍ | 8644/61904 [4:23:54<19:59:24, 1.35s/it] 14%|█▍ | 8645/61904 [4:23:55<20:11:01, 1.36s/it] 14%|█▍ | 8646/61904 [4:23:56<20:25:15, 1.38s/it] 14%|█▍ | 8647/61904 [4:23:58<21:09:59, 1.43s/it] 14%|█▍ | 8648/61904 [4:23:59<20:40:46, 1.40s/it] 14%|█▍ | 8649/61904 [4:24:01<20:00:22, 1.35s/it] 14%|█▍ | 8650/61904 [4:24:02<19:47:23, 1.34s/it] 14%|█▍ | 8651/61904 [4:24:03<20:22:23, 1.38s/it] 14%|█▍ | 8652/61904 [4:24:05<20:21:49, 1.38s/it] 14%|█▍ | 8653/61904 [4:24:06<19:58:50, 1.35s/it] 14%|█▍ | 8654/61904 [4:24:07<19:51:04, 1.34s/it] 14%|█▍ | 8655/61904 [4:24:09<19:38:23, 1.33s/it] 14%|█▍ | 8656/61904 [4:24:10<19:01:01, 1.29s/it] 14%|█▍ | 8657/61904 [4:24:11<19:04:37, 1.29s/it] 14%|█▍ | 8658/61904 [4:24:12<19:09:51, 1.30s/it] 14%|█▍ | 8659/61904 [4:24:14<19:25:34, 1.31s/it] 14%|█▍ | 8660/61904 [4:24:15<19:32:07, 1.32s/it] {'loss': 2.8064, 'learning_rate': 1.8628938156359393e-07, 'epoch': 2.24} 14%|█▍ | 8660/61904 [4:24:15<19:32:07, 1.32s/it] 14%|█▍ | 8661/61904 [4:24:16<19:35:49, 1.33s/it] 14%|█▍ | 8662/61904 [4:24:18<18:57:21, 1.28s/it] 14%|█▍ | 8663/61904 [4:24:19<19:45:23, 1.34s/it] 14%|█▍ | 8664/61904 [4:24:20<19:24:43, 1.31s/it] 14%|█▍ | 8665/61904 [4:24:22<20:11:05, 1.36s/it] 14%|█▍ | 8666/61904 [4:24:23<19:43:52, 1.33s/it] 14%|█▍ | 8667/61904 [4:24:24<19:46:25, 1.34s/it] 14%|█▍ | 8668/61904 [4:24:26<20:07:16, 1.36s/it] 14%|█▍ | 8669/61904 [4:24:27<20:08:33, 1.36s/it] 14%|█▍ | 8670/61904 [4:24:29<20:07:48, 1.36s/it] 14%|█▍ | 8671/61904 [4:24:30<19:55:38, 1.35s/it] 14%|█▍ | 8672/61904 [4:24:31<19:48:48, 1.34s/it] 14%|█▍ | 8673/61904 [4:24:33<20:10:48, 1.36s/it] 14%|█▍ | 8674/61904 [4:24:34<19:58:47, 1.35s/it] 14%|█▍ | 8675/61904 [4:24:35<20:21:39, 1.38s/it] 14%|█▍ | 8676/61904 [4:24:37<20:28:43, 1.39s/it] 14%|█▍ | 8677/61904 [4:24:38<20:17:18, 1.37s/it] 14%|█▍ | 8678/61904 [4:24:39<20:00:40, 1.35s/it] 14%|█▍ | 8679/61904 [4:24:41<20:37:34, 1.40s/it] 14%|█▍ | 8680/61904 
[4:24:42<20:34:50, 1.39s/it] {'loss': 2.89, 'learning_rate': 1.862569687540516e-07, 'epoch': 2.24}
14%|█▍ | 8700/61904 [4:25:09<20:53:01, 1.41s/it] {'loss': 2.8585, 'learning_rate': 1.8622455594450928e-07, 'epoch': 2.25}
14%|█▍ | 8720/61904 [4:25:37<20:34:59, 1.39s/it] {'loss': 2.8825, 'learning_rate': 1.861921431349669e-07, 'epoch': 2.25}
14%|█▍ | 8740/61904 [4:26:03<19:58:46, 1.35s/it] {'loss': 2.8311, 'learning_rate': 1.861597303254246e-07, 'epoch': 2.26}
14%|█▍ | 8760/61904 [4:26:31<20:27:50, 1.39s/it] {'loss': 2.8857, 'learning_rate': 1.8612731751588226e-07, 'epoch': 2.26}
14%|█▍ | 8780/61904 [4:26:58<20:08:32, 1.36s/it] {'loss': 2.8278, 'learning_rate': 1.8609490470633992e-07, 'epoch': 2.27}
14%|█▍ | 8800/61904 [4:27:26<21:11:56, 1.44s/it] {'loss': 2.8549, 'learning_rate': 1.860624918967976e-07, 'epoch': 2.27}
14%|█▍ | 8820/61904 [4:27:53<19:59:38, 1.36s/it] {'loss': 2.8433, 'learning_rate': 1.8603007908725527e-07, 'epoch': 2.28}
14%|█▍ | 8840/61904 [4:28:21<20:11:30, 1.37s/it] {'loss': 2.8928, 'learning_rate': 1.8599766627771294e-07, 'epoch': 2.28}
14%|█▍ | 8860/61904 [4:28:48<19:39:39, 1.33s/it] {'loss': 2.909, 'learning_rate': 1.8596525346817062e-07, 'epoch': 2.29}
14%|█▍ | 8880/61904 [4:29:15<19:49:54, 1.35s/it] {'loss': 2.8203, 'learning_rate': 1.8593284065862829e-07, 'epoch': 2.29}
14%|█▍ | 8900/61904 [4:29:43<19:37:05, 1.33s/it] {'loss': 2.8176, 'learning_rate': 1.8590042784908595e-07, 'epoch': 2.3}
14%|█▍ | 8920/61904 [4:30:09<19:20:16, 1.31s/it] {'loss': 2.8396, 'learning_rate': 1.858680150395436e-07, 'epoch': 2.31}
14%|█▍ | 8940/61904 [4:30:37<20:14:51, 1.38s/it] {'loss': 2.8307, 'learning_rate': 1.8583560223000127e-07, 'epoch': 2.31}
14%|█▍ | 8960/61904 [4:31:04<19:40:16, 1.34s/it] {'loss': 2.895, 'learning_rate': 1.8580318942045896e-07, 'epoch': 2.32}
15%|█▍ | 8980/61904 [4:31:31<20:53:06, 1.42s/it] {'loss': 2.8683, 'learning_rate': 1.8577077661091662e-07, 'epoch': 2.32}
15%|█▍ | 9000/61904 [4:31:59<20:47:36, 1.41s/it] {'loss': 2.8272, 'learning_rate': 1.8573836380137428e-07, 'epoch': 2.33}
15%|█▍ | 9020/61904 [4:32:26<20:35:22, 1.40s/it] {'loss': 2.8701, 'learning_rate': 1.8570595099183197e-07, 'epoch': 2.33}
15%|█▍ | 9040/61904 [4:32:53<20:05:15, 1.37s/it] {'loss': 2.9526, 'learning_rate': 1.8567353818228963e-07, 'epoch': 2.34}
15%|█▍ | 9060/61904 [4:33:21<20:16:11, 1.38s/it] {'loss': 2.8748, 'learning_rate': 1.856411253727473e-07, 'epoch': 2.34}
15%|█▍ | 9080/61904 [4:33:48<20:07:01, 1.37s/it] {'loss': 2.8858, 'learning_rate': 1.8560871256320498e-07, 'epoch': 2.35}
15%|█▍ | 9100/61904 [4:34:15<19:38:52, 1.34s/it] {'loss': 2.9003, 'learning_rate': 1.8557629975366262e-07, 'epoch': 2.35}
15%|█▍ | 9120/61904 [4:34:43<20:06:32, 1.37s/it] {'loss': 2.8281, 'learning_rate': 1.855438869441203e-07, 'epoch': 2.36}
15%|█▍ | 9140/61904 [4:35:11<20:29:52, 1.40s/it] {'loss': 2.81, 'learning_rate': 1.8551147413457797e-07, 'epoch': 2.36}
15%|█▍ | 9160/61904 [4:35:38<21:13:12, 1.45s/it] {'loss': 2.8814, 'learning_rate': 1.8547906132503563e-07, 'epoch': 2.37}
15%|█▍ | 9180/61904 [4:36:06<21:03:02, 1.44s/it] {'loss': 2.9104, 'learning_rate': 1.8544664851549332e-07, 'epoch': 2.37}
15%|█▍ | 9200/61904 [4:36:33<19:32:39, 1.34s/it] {'loss': 2.8546, 'learning_rate': 1.8541423570595098e-07, 'epoch': 2.38}
15%|█▍ | 9220/61904 [4:37:00<19:46:48, 1.35s/it] {'loss': 2.8443, 'learning_rate': 1.8538182289640864e-07, 'epoch': 2.38}
15%|█▍ | 9240/61904 [4:37:27<20:26:39, 1.40s/it] {'loss': 2.8323, 'learning_rate': 1.8534941008686633e-07, 'epoch': 2.39}
15%|█▍ | 9260/61904 [4:37:54<19:22:16, 1.32s/it] {'loss': 2.8195, 'learning_rate': 1.85316997277324e-07, 'epoch': 2.39}
15%|█▍ | 9280/61904 [4:38:22<20:43:52, 1.42s/it] {'loss': 2.7998, 'learning_rate': 1.8528458446778166e-07, 'epoch': 2.4}
15%|█▌ | 9300/61904 [4:38:49<19:43:41, 1.35s/it] {'loss': 2.85, 'learning_rate': 1.8525217165823934e-07, 'epoch': 2.4}
15%|█▌ | 9320/61904 [4:39:16<19:26:37, 1.33s/it] {'loss': 2.8015, 'learning_rate': 1.8521975884869698e-07, 'epoch': 2.41}
15%|█▌ | 9340/61904 [4:39:44<19:08:31, 1.31s/it] {'loss': 2.8752, 'learning_rate': 1.8518734603915467e-07, 'epoch': 2.41}
15%|█▌ | 9360/61904 [4:40:11<19:50:05, 1.36s/it] {'loss': 2.7702, 'learning_rate': 1.8515493322961233e-07, 'epoch': 2.42}
15%|█▌ | 9380/61904 [4:40:39<20:18:03, 1.39s/it] {'loss': 2.9099, 'learning_rate': 1.8512252042007e-07, 'epoch': 2.42}
15%|█▌ | 9400/61904 [4:41:06<20:17:05, 1.39s/it] {'loss': 2.8563, 'learning_rate': 1.8509010761052768e-07, 'epoch': 2.43}
15%|█▌ | 9420/61904 [4:41:33<20:34:12, 1.41s/it] {'loss': 2.7581, 'learning_rate': 1.8505769480098534e-07, 'epoch': 2.43}
15%|█▌ | 9440/61904 
[4:42:01<20:09:53, 1.38s/it] {'loss': 2.9199, 'learning_rate': 1.85025281991443e-07, 'epoch': 2.44} 15%|█▌ | 9440/61904 [4:42:01<20:09:53, 1.38s/it] 15%|█▌ | 9441/61904 [4:42:03<20:33:27, 1.41s/it] 15%|█▌ | 9442/61904 [4:42:04<20:14:39, 1.39s/it] 15%|█▌ | 9443/61904 [4:42:06<20:18:45, 1.39s/it] 15%|█▌ | 9444/61904 [4:42:07<19:45:47, 1.36s/it] 15%|█▌ | 9445/61904 [4:42:08<19:42:13, 1.35s/it] 15%|█▌ | 9446/61904 [4:42:10<20:00:41, 1.37s/it] 15%|█▌ | 9447/61904 [4:42:11<20:11:21, 1.39s/it] 15%|█▌ | 9448/61904 [4:42:12<19:21:32, 1.33s/it] 15%|█▌ | 9449/61904 [4:42:14<20:34:12, 1.41s/it] 15%|█▌ | 9450/61904 [4:42:15<20:25:33, 1.40s/it] 15%|█▌ | 9451/61904 [4:42:17<20:43:35, 1.42s/it] 15%|█▌ | 9452/61904 [4:42:18<20:35:15, 1.41s/it] 15%|█▌ | 9453/61904 [4:42:19<20:40:05, 1.42s/it] 15%|█▌ | 9454/61904 [4:42:21<19:42:57, 1.35s/it] 15%|█▌ | 9455/61904 [4:42:22<19:04:52, 1.31s/it] 15%|█▌ | 9456/61904 [4:42:23<18:33:41, 1.27s/it] 15%|█▌ | 9457/61904 [4:42:24<19:19:54, 1.33s/it] 15%|█▌ | 9458/61904 [4:42:26<19:41:31, 1.35s/it] 15%|█▌ | 9459/61904 [4:42:27<19:34:57, 1.34s/it] 15%|█▌ | 9460/61904 [4:42:29<19:46:10, 1.36s/it] {'loss': 2.8404, 'learning_rate': 1.849928691819007e-07, 'epoch': 2.44} 15%|█▌ | 9460/61904 [4:42:29<19:46:10, 1.36s/it] 15%|█▌ | 9461/61904 [4:42:30<20:04:16, 1.38s/it] 15%|█▌ | 9462/61904 [4:42:31<19:42:01, 1.35s/it] 15%|█▌ | 9463/61904 [4:42:33<19:48:38, 1.36s/it] 15%|█▌ | 9464/61904 [4:42:34<19:19:55, 1.33s/it] 15%|█▌ | 9465/61904 [4:42:35<19:25:26, 1.33s/it] 15%|█▌ | 9466/61904 [4:42:37<19:06:20, 1.31s/it] 15%|█▌ | 9467/61904 [4:42:38<18:54:49, 1.30s/it] 15%|█▌ | 9468/61904 [4:42:39<19:21:42, 1.33s/it] 15%|█▌ | 9469/61904 [4:42:41<19:35:21, 1.34s/it] 15%|█▌ | 9470/61904 [4:42:42<19:09:30, 1.32s/it] 15%|█▌ | 9471/61904 [4:42:43<19:38:24, 1.35s/it] 15%|█▌ | 9472/61904 [4:42:45<19:21:11, 1.33s/it] 15%|█▌ | 9473/61904 [4:42:46<19:46:54, 1.36s/it] 15%|█▌ | 9474/61904 [4:42:47<19:51:16, 1.36s/it] 15%|█▌ | 9475/61904 [4:42:49<19:52:11, 1.36s/it] 15%|█▌ | 
9476/61904 [4:42:50<20:51:53, 1.43s/it] 15%|█▌ | 9477/61904 [4:42:52<20:06:46, 1.38s/it] 15%|█▌ | 9478/61904 [4:42:53<20:09:13, 1.38s/it] 15%|█▌ | 9479/61904 [4:42:54<20:27:22, 1.40s/it] 15%|█▌ | 9480/61904 [4:42:56<20:04:50, 1.38s/it] {'loss': 2.8321, 'learning_rate': 1.8496045637235835e-07, 'epoch': 2.45} 15%|█▌ | 9480/61904 [4:42:56<20:04:50, 1.38s/it] 15%|█▌ | 9481/61904 [4:42:57<20:18:21, 1.39s/it] 15%|█▌ | 9482/61904 [4:42:59<20:26:57, 1.40s/it] 15%|█▌ | 9483/61904 [4:43:00<19:45:40, 1.36s/it] 15%|█▌ | 9484/61904 [4:43:01<19:57:48, 1.37s/it] 15%|█▌ | 9485/61904 [4:43:03<19:28:47, 1.34s/it] 15%|█▌ | 9486/61904 [4:43:04<19:00:48, 1.31s/it] 15%|█▌ | 9487/61904 [4:43:05<19:20:34, 1.33s/it] 15%|█▌ | 9488/61904 [4:43:07<20:05:16, 1.38s/it] 15%|█▌ | 9489/61904 [4:43:08<19:43:00, 1.35s/it] 15%|█▌ | 9490/61904 [4:43:09<19:06:03, 1.31s/it] 15%|█▌ | 9491/61904 [4:43:10<18:36:38, 1.28s/it] 15%|█▌ | 9492/61904 [4:43:12<18:37:10, 1.28s/it] 15%|█▌ | 9493/61904 [4:43:13<18:34:58, 1.28s/it] 15%|█▌ | 9494/61904 [4:43:14<18:39:05, 1.28s/it] 15%|█▌ | 9495/61904 [4:43:16<19:06:05, 1.31s/it] 15%|█▌ | 9496/61904 [4:43:17<19:13:10, 1.32s/it] 15%|█▌ | 9497/61904 [4:43:18<19:43:21, 1.35s/it] 15%|█▌ | 9498/61904 [4:43:20<19:44:17, 1.36s/it] 15%|█▌ | 9499/61904 [4:43:21<19:46:34, 1.36s/it] 15%|█▌ | 9500/61904 [4:43:22<19:31:30, 1.34s/it] {'loss': 2.781, 'learning_rate': 1.8492804356281602e-07, 'epoch': 2.46} 15%|█▌ | 9500/61904 [4:43:22<19:31:30, 1.34s/it] 15%|█▌ | 9501/61904 [4:43:24<19:13:11, 1.32s/it] 15%|█▌ | 9502/61904 [4:43:25<19:29:04, 1.34s/it] 15%|█▌ | 9503/61904 [4:43:26<19:36:21, 1.35s/it] 15%|█▌ | 9504/61904 [4:43:28<19:59:23, 1.37s/it] 15%|█▌ | 9505/61904 [4:43:29<19:41:37, 1.35s/it] 15%|█▌ | 9506/61904 [4:43:31<20:14:08, 1.39s/it] 15%|█▌ | 9507/61904 [4:43:32<19:56:53, 1.37s/it] 15%|█▌ | 9508/61904 [4:43:33<19:43:58, 1.36s/it] 15%|█▌ | 9509/61904 [4:43:35<20:18:00, 1.39s/it] 15%|█▌ | 9510/61904 [4:43:36<19:51:07, 1.36s/it] 15%|█▌ | 9511/61904 [4:43:37<20:09:37, 1.39s/it] 
15%|█▌ | 9512/61904 [4:43:39<20:04:16, 1.38s/it] 15%|█▌ | 9513/61904 [4:43:40<19:35:12, 1.35s/it] 15%|█▌ | 9514/61904 [4:43:41<19:11:41, 1.32s/it] 15%|█▌ | 9515/61904 [4:43:43<19:37:50, 1.35s/it] 15%|█▌ | 9516/61904 [4:43:44<19:42:45, 1.35s/it] 15%|█▌ | 9517/61904 [4:43:45<19:38:08, 1.35s/it] 15%|█▌ | 9518/61904 [4:43:47<19:57:28, 1.37s/it] 15%|█▌ | 9519/61904 [4:43:48<20:12:35, 1.39s/it] 15%|█▌ | 9520/61904 [4:43:50<20:54:18, 1.44s/it] {'loss': 2.9004, 'learning_rate': 1.8489563075327368e-07, 'epoch': 2.46} 15%|█▌ | 9520/61904 [4:43:50<20:54:18, 1.44s/it] 15%|█▌ | 9521/61904 [4:43:51<20:35:39, 1.42s/it] 15%|█▌ | 9522/61904 [4:43:53<20:47:02, 1.43s/it] 15%|█▌ | 9523/61904 [4:43:54<20:48:36, 1.43s/it] 15%|█▌ | 9524/61904 [4:43:55<20:01:08, 1.38s/it] 15%|█▌ | 9525/61904 [4:43:57<20:05:59, 1.38s/it] 15%|█▌ | 9526/61904 [4:43:58<21:29:20, 1.48s/it] 15%|█▌ | 9527/61904 [4:44:00<21:47:47, 1.50s/it] 15%|█▌ | 9528/61904 [4:44:02<21:41:08, 1.49s/it] 15%|█▌ | 9529/61904 [4:44:03<21:38:56, 1.49s/it] 15%|█▌ | 9530/61904 [4:44:04<20:44:24, 1.43s/it] 15%|█▌ | 9531/61904 [4:44:06<20:27:50, 1.41s/it] 15%|█▌ | 9532/61904 [4:44:07<21:10:16, 1.46s/it] 15%|█▌ | 9533/61904 [4:44:09<21:00:34, 1.44s/it] 15%|█▌ | 9534/61904 [4:44:10<20:08:48, 1.38s/it] 15%|█▌ | 9535/61904 [4:44:11<19:48:47, 1.36s/it] 15%|█▌ | 9536/61904 [4:44:13<19:57:18, 1.37s/it] 15%|█▌ | 9537/61904 [4:44:14<19:39:37, 1.35s/it] 15%|█▌ | 9538/61904 [4:44:15<19:49:03, 1.36s/it] 15%|█▌ | 9539/61904 [4:44:17<19:32:32, 1.34s/it] 15%|█▌ | 9540/61904 [4:44:18<20:02:44, 1.38s/it] {'loss': 2.8439, 'learning_rate': 1.8486321794373134e-07, 'epoch': 2.47} 15%|█▌ | 9540/61904 [4:44:18<20:02:44, 1.38s/it] 15%|█▌ | 9541/61904 [4:44:19<19:43:21, 1.36s/it] 15%|█▌ | 9542/61904 [4:44:21<19:24:33, 1.33s/it] 15%|█▌ | 9543/61904 [4:44:22<19:42:43, 1.36s/it] 15%|█▌ | 9544/61904 [4:44:23<19:40:33, 1.35s/it] 15%|█▌ | 9545/61904 [4:44:25<20:14:24, 1.39s/it] 15%|█▌ | 9546/61904 [4:44:26<19:57:45, 1.37s/it] 15%|█▌ | 9547/61904 [4:44:28<19:50:46, 
1.36s/it] 15%|█▌ | 9548/61904 [4:44:29<19:46:47, 1.36s/it] 15%|█▌ | 9549/61904 [4:44:30<20:08:57, 1.39s/it] 15%|█▌ | 9550/61904 [4:44:32<21:12:02, 1.46s/it] 15%|█▌ | 9551/61904 [4:44:33<20:56:23, 1.44s/it] 15%|█▌ | 9552/61904 [4:44:35<20:38:30, 1.42s/it] 15%|█▌ | 9553/61904 [4:44:36<20:31:50, 1.41s/it] 15%|█▌ | 9554/61904 [4:44:37<20:28:05, 1.41s/it] 15%|█▌ | 9555/61904 [4:44:39<20:17:37, 1.40s/it] 15%|█▌ | 9556/61904 [4:44:40<20:47:25, 1.43s/it] 15%|█▌ | 9557/61904 [4:44:42<20:27:28, 1.41s/it] 15%|█▌ | 9558/61904 [4:44:43<19:36:54, 1.35s/it] 15%|█▌ | 9559/61904 [4:44:44<20:10:12, 1.39s/it] 15%|█▌ | 9560/61904 [4:44:46<19:49:00, 1.36s/it] {'loss': 2.9077, 'learning_rate': 1.8483080513418903e-07, 'epoch': 2.47} 15%|█▌ | 9560/61904 [4:44:46<19:49:00, 1.36s/it] 15%|█▌ | 9561/61904 [4:44:47<20:09:07, 1.39s/it] 15%|█▌ | 9562/61904 [4:44:48<19:51:46, 1.37s/it] 15%|█▌ | 9563/61904 [4:44:50<20:13:24, 1.39s/it] 15%|█▌ | 9564/61904 [4:44:51<19:44:04, 1.36s/it] 15%|█▌ | 9565/61904 [4:44:52<19:17:56, 1.33s/it] 15%|█▌ | 9566/61904 [4:44:54<19:35:13, 1.35s/it] 15%|█▌ | 9567/61904 [4:44:55<19:25:00, 1.34s/it] 15%|█▌ | 9568/61904 [4:44:57<20:07:29, 1.38s/it] 15%|█▌ | 9569/61904 [4:44:58<19:45:57, 1.36s/it] 15%|█▌ | 9570/61904 [4:44:59<19:55:35, 1.37s/it] 15%|█▌ | 9571/61904 [4:45:01<19:55:59, 1.37s/it] 15%|█▌ | 9572/61904 [4:45:02<19:38:09, 1.35s/it] 15%|█▌ | 9573/61904 [4:45:03<19:21:03, 1.33s/it] 15%|█▌ | 9574/61904 [4:45:05<19:35:48, 1.35s/it] 15%|█▌ | 9575/61904 [4:45:06<19:55:31, 1.37s/it] 15%|█▌ | 9576/61904 [4:45:07<19:50:26, 1.36s/it] 15%|█▌ | 9577/61904 [4:45:09<21:04:43, 1.45s/it] 15%|█▌ | 9578/61904 [4:45:11<20:57:35, 1.44s/it] 15%|█▌ | 9579/61904 [4:45:12<21:03:00, 1.45s/it] 15%|█▌ | 9580/61904 [4:45:13<20:54:27, 1.44s/it] {'loss': 2.8844, 'learning_rate': 1.847983923246467e-07, 'epoch': 2.48} 15%|█▌ | 9580/61904 [4:45:13<20:54:27, 1.44s/it] 15%|█▌ | 9581/61904 [4:45:15<20:28:33, 1.41s/it] 15%|█▌ | 9582/61904 [4:45:16<20:18:39, 1.40s/it] 15%|█▌ | 9583/61904 
[4:45:17<19:37:10, 1.35s/it] 15%|█▌ | 9584/61904 [4:45:19<19:55:34, 1.37s/it] 15%|█▌ | 9585/61904 [4:45:20<19:31:50, 1.34s/it] 15%|█▌ | 9586/61904 [4:45:21<19:41:48, 1.36s/it] 15%|█▌ | 9587/61904 [4:45:23<19:33:59, 1.35s/it] 15%|█▌ | 9588/61904 [4:45:24<20:29:12, 1.41s/it] 15%|█▌ | 9589/61904 [4:45:26<20:16:08, 1.39s/it] 15%|█▌ | 9590/61904 [4:45:27<19:56:15, 1.37s/it] 15%|█▌ | 9591/61904 [4:45:28<19:53:32, 1.37s/it] 15%|█▌ | 9592/61904 [4:45:30<20:45:02, 1.43s/it] 15%|█▌ | 9593/61904 [4:45:31<20:26:49, 1.41s/it] 15%|█▌ | 9594/61904 [4:45:33<20:27:00, 1.41s/it] 15%|█▌ | 9595/61904 [4:45:34<19:55:36, 1.37s/it] 16%|█▌ | 9596/61904 [4:45:35<19:45:04, 1.36s/it] 16%|█▌ | 9597/61904 [4:45:37<19:58:15, 1.37s/it] 16%|█▌ | 9598/61904 [4:45:38<19:36:59, 1.35s/it] 16%|█▌ | 9599/61904 [4:45:39<19:32:58, 1.35s/it] 16%|█▌ | 9600/61904 [4:45:41<19:14:35, 1.32s/it] {'loss': 2.8135, 'learning_rate': 1.8476597951510435e-07, 'epoch': 2.48} 16%|█▌ | 9600/61904 [4:45:41<19:14:35, 1.32s/it] 16%|█▌ | 9601/61904 [4:45:42<19:25:34, 1.34s/it] 16%|█▌ | 9602/61904 [4:45:43<19:37:12, 1.35s/it] 16%|█▌ | 9603/61904 [4:45:45<19:43:52, 1.36s/it] 16%|█▌ | 9604/61904 [4:45:46<19:13:37, 1.32s/it] 16%|█▌ | 9605/61904 [4:45:47<19:19:25, 1.33s/it] 16%|█▌ | 9606/61904 [4:45:49<19:50:01, 1.37s/it] 16%|█▌ | 9607/61904 [4:45:50<19:47:24, 1.36s/it] 16%|█▌ | 9608/61904 [4:45:51<19:30:08, 1.34s/it] 16%|█▌ | 9609/61904 [4:45:53<19:47:23, 1.36s/it] 16%|█▌ | 9610/61904 [4:45:54<20:31:51, 1.41s/it] 16%|█▌ | 9611/61904 [4:45:56<19:59:27, 1.38s/it] 16%|█▌ | 9612/61904 [4:45:57<19:33:25, 1.35s/it] 16%|█▌ | 9613/61904 [4:45:58<19:18:29, 1.33s/it] 16%|█▌ | 9614/61904 [4:46:00<20:01:13, 1.38s/it] 16%|█▌ | 9615/61904 [4:46:01<20:08:27, 1.39s/it] 16%|█▌ | 9616/61904 [4:46:03<20:47:11, 1.43s/it] 16%|█▌ | 9617/61904 [4:46:04<20:49:10, 1.43s/it] 16%|█▌ | 9618/61904 [4:46:05<20:08:44, 1.39s/it] 16%|█▌ | 9619/61904 [4:46:07<20:21:15, 1.40s/it] 16%|█▌ | 9620/61904 [4:46:08<20:01:26, 1.38s/it] {'loss': 2.8818, 'learning_rate': 
1.8473356670556204e-07, 'epoch': 2.49} 16%|█▌ | 9620/61904 [4:46:08<20:01:26, 1.38s/it] 16%|█▌ | 9621/61904 [4:46:10<20:03:21, 1.38s/it] 16%|█▌ | 9622/61904 [4:46:11<20:57:36, 1.44s/it] 16%|█▌ | 9623/61904 [4:46:13<20:55:35, 1.44s/it] 16%|█▌ | 9624/61904 [4:46:14<20:20:00, 1.40s/it] 16%|█▌ | 9625/61904 [4:46:15<20:17:07, 1.40s/it] 16%|█▌ | 9626/61904 [4:46:17<20:07:32, 1.39s/it] 16%|█▌ | 9627/61904 [4:46:18<19:28:09, 1.34s/it] 16%|█▌ | 9628/61904 [4:46:19<19:48:14, 1.36s/it] 16%|█▌ | 9629/61904 [4:46:21<19:53:48, 1.37s/it] 16%|█▌ | 9630/61904 [4:46:22<19:38:33, 1.35s/it] 16%|█▌ | 9631/61904 [4:46:24<20:26:24, 1.41s/it] 16%|█▌ | 9632/61904 [4:46:25<20:11:08, 1.39s/it] 16%|█▌ | 9633/61904 [4:46:26<20:04:05, 1.38s/it] 16%|█▌ | 9634/61904 [4:46:28<20:21:33, 1.40s/it] 16%|█▌ | 9635/61904 [4:46:29<20:41:02, 1.42s/it] 16%|█▌ | 9636/61904 [4:46:30<20:10:17, 1.39s/it] 16%|█▌ | 9637/61904 [4:46:32<19:31:12, 1.34s/it] 16%|█▌ | 9638/61904 [4:46:33<19:39:52, 1.35s/it] 16%|█▌ | 9639/61904 [4:46:35<20:10:14, 1.39s/it] 16%|█▌ | 9640/61904 [4:46:36<21:21:36, 1.47s/it] {'loss': 2.8716, 'learning_rate': 1.847011538960197e-07, 'epoch': 2.49} 16%|█▌ | 9640/61904 [4:46:36<21:21:36, 1.47s/it] 16%|█▌ | 9641/61904 [4:46:37<20:03:27, 1.38s/it] 16%|█▌ | 9642/61904 [4:46:39<19:45:13, 1.36s/it] 16%|█▌ | 9643/61904 [4:46:40<19:25:05, 1.34s/it] 16%|█▌ | 9644/61904 [4:46:41<19:22:42, 1.33s/it] 16%|█▌ | 9645/61904 [4:46:43<19:28:28, 1.34s/it] 16%|█▌ | 9646/61904 [4:46:44<19:35:18, 1.35s/it] 16%|█▌ | 9647/61904 [4:46:45<19:34:27, 1.35s/it] 16%|█▌ | 9648/61904 [4:46:47<19:26:37, 1.34s/it] 16%|█▌ | 9649/61904 [4:46:48<19:38:46, 1.35s/it] 16%|█▌ | 9650/61904 [4:46:49<19:49:04, 1.37s/it] 16%|█▌ | 9651/61904 [4:46:51<19:56:51, 1.37s/it] 16%|█▌ | 9652/61904 [4:46:52<19:08:58, 1.32s/it] 16%|█▌ | 9653/61904 [4:46:54<19:40:49, 1.36s/it] 16%|█▌ | 9654/61904 [4:46:55<19:46:39, 1.36s/it] 16%|█▌ | 9655/61904 [4:46:56<19:40:59, 1.36s/it] 16%|█▌ | 9656/61904 [4:46:58<20:00:47, 1.38s/it] 16%|█▌ | 9657/61904 
[4:46:59<19:56:19, 1.37s/it] 16%|█▌ | 9658/61904 [4:47:00<19:39:31, 1.35s/it] 16%|█▌ | 9659/61904 [4:47:02<19:32:47, 1.35s/it] 16%|█▌ | 9660/61904 [4:47:03<19:49:19, 1.37s/it] {'loss': 2.8514, 'learning_rate': 1.8466874108647736e-07, 'epoch': 2.5} 16%|█▌ | 9660/61904 [4:47:03<19:49:19, 1.37s/it] 16%|█▌ | 9661/61904 [4:47:04<19:35:02, 1.35s/it] 16%|█▌ | 9662/61904 [4:47:06<19:17:32, 1.33s/it] 16%|█▌ | 9663/61904 [4:47:07<19:46:08, 1.36s/it] 16%|█▌ | 9664/61904 [4:47:08<19:21:27, 1.33s/it] 16%|█▌ | 9665/61904 [4:47:10<20:24:56, 1.41s/it] 16%|█▌ | 9666/61904 [4:47:11<20:48:57, 1.43s/it] 16%|█▌ | 9667/61904 [4:47:13<20:20:40, 1.40s/it] 16%|█▌ | 9668/61904 [4:47:14<20:08:16, 1.39s/it] 16%|█▌ | 9669/61904 [4:47:15<19:33:16, 1.35s/it] 16%|█▌ | 9670/61904 [4:47:17<20:00:22, 1.38s/it] 16%|█▌ | 9671/61904 [4:47:18<19:58:54, 1.38s/it] 16%|█▌ | 9672/61904 [4:47:20<20:01:14, 1.38s/it] 16%|█▌ | 9673/61904 [4:47:21<20:18:16, 1.40s/it] 16%|█▌ | 9674/61904 [4:47:22<20:13:22, 1.39s/it] 16%|█▌ | 9675/61904 [4:47:24<20:36:54, 1.42s/it] 16%|█▌ | 9676/61904 [4:47:25<20:18:17, 1.40s/it] 16%|█▌ | 9677/61904 [4:47:27<20:14:07, 1.39s/it] 16%|█▌ | 9678/61904 [4:47:28<19:56:59, 1.38s/it] 16%|█▌ | 9679/61904 [4:47:29<19:31:32, 1.35s/it] 16%|█▌ | 9680/61904 [4:47:31<19:54:00, 1.37s/it] {'loss': 2.8514, 'learning_rate': 1.8463632827693505e-07, 'epoch': 2.5} 16%|█▌ | 9680/61904 [4:47:31<19:54:00, 1.37s/it] 16%|█▌ | 9681/61904 [4:47:32<20:04:46, 1.38s/it] 16%|█▌ | 9682/61904 [4:47:33<19:40:05, 1.36s/it] 16%|█▌ | 9683/61904 [4:47:35<19:41:34, 1.36s/it] 16%|█▌ | 9684/61904 [4:47:36<19:18:16, 1.33s/it] 16%|█▌ | 9685/61904 [4:47:37<19:46:17, 1.36s/it] 16%|█▌ | 9686/61904 [4:47:39<18:53:37, 1.30s/it] 16%|█▌ | 9687/61904 [4:47:40<18:50:04, 1.30s/it] 16%|█▌ | 9688/61904 [4:47:41<19:34:20, 1.35s/it] 16%|█▌ | 9689/61904 [4:47:43<19:41:48, 1.36s/it] 16%|█▌ | 9690/61904 [4:47:44<19:31:44, 1.35s/it] 16%|█▌ | 9691/61904 [4:47:45<19:19:35, 1.33s/it] 16%|█▌ | 9692/61904 [4:47:47<19:23:29, 1.34s/it] 16%|█▌ | 
9693/61904 [4:47:48<18:51:15, 1.30s/it] 16%|█▌ | 9694/61904 [4:47:49<18:50:52, 1.30s/it] 16%|█▌ | 9695/61904 [4:47:51<18:56:52, 1.31s/it] 16%|█▌ | 9696/61904 [4:47:52<18:37:24, 1.28s/it] 16%|█▌ | 9697/61904 [4:47:53<18:32:17, 1.28s/it] 16%|█▌ | 9698/61904 [4:47:54<19:04:48, 1.32s/it] 16%|█▌ | 9699/61904 [4:47:56<19:24:53, 1.34s/it] 16%|█▌ | 9700/61904 [4:47:57<19:13:21, 1.33s/it] {'loss': 2.8991, 'learning_rate': 1.846039154673927e-07, 'epoch': 2.51} 16%|█▌ | 9700/61904 [4:47:57<19:13:21, 1.33s/it] 16%|█▌ | 9701/61904 [4:47:58<19:16:05, 1.33s/it] 16%|█▌ | 9702/61904 [4:48:00<19:09:52, 1.32s/it] 16%|█▌ | 9703/61904 [4:48:01<19:06:52, 1.32s/it] 16%|█▌ | 9704/61904 [4:48:02<19:09:27, 1.32s/it] 16%|█▌ | 9705/61904 [4:48:04<19:29:00, 1.34s/it] 16%|█▌ | 9706/61904 [4:48:05<19:41:27, 1.36s/it] 16%|█▌ | 9707/61904 [4:48:07<20:20:22, 1.40s/it] 16%|█▌ | 9708/61904 [4:48:08<19:53:44, 1.37s/it] 16%|█▌ | 9709/61904 [4:48:09<19:45:23, 1.36s/it] 16%|█▌ | 9710/61904 [4:48:11<19:37:17, 1.35s/it] 16%|█▌ | 9711/61904 [4:48:12<19:14:18, 1.33s/it] 16%|█▌ | 9712/61904 [4:48:13<19:00:26, 1.31s/it] 16%|█▌ | 9713/61904 [4:48:14<18:42:14, 1.29s/it] 16%|█▌ | 9714/61904 [4:48:16<19:21:49, 1.34s/it] 16%|█▌ | 9715/61904 [4:48:17<18:55:42, 1.31s/it] 16%|█▌ | 9716/61904 [4:48:18<18:55:50, 1.31s/it] 16%|█▌ | 9717/61904 [4:48:20<19:32:37, 1.35s/it] 16%|█▌ | 9718/61904 [4:48:21<19:17:53, 1.33s/it] 16%|█▌ | 9719/61904 [4:48:23<20:38:03, 1.42s/it] 16%|█▌ | 9720/61904 [4:48:24<21:34:15, 1.49s/it] {'loss': 2.8639, 'learning_rate': 1.8457150265785038e-07, 'epoch': 2.51} 16%|█▌ | 9720/61904 [4:48:24<21:34:15, 1.49s/it] 16%|█▌ | 9721/61904 [4:48:26<20:32:09, 1.42s/it] 16%|█▌ | 9722/61904 [4:48:27<19:58:03, 1.38s/it] 16%|█▌ | 9723/61904 [4:48:28<19:55:19, 1.37s/it] 16%|█▌ | 9724/61904 [4:48:30<19:38:42, 1.36s/it] 16%|█▌ | 9725/61904 [4:48:31<19:53:26, 1.37s/it] 16%|█▌ | 9726/61904 [4:48:33<20:17:48, 1.40s/it] 16%|█▌ | 9727/61904 [4:48:34<20:07:56, 1.39s/it] 16%|█▌ | 9728/61904 [4:48:35<20:02:52, 1.38s/it] 
16%|█▌ | 9729/61904 [4:48:37<20:20:53, 1.40s/it] 16%|█▌ | 9730/61904 [4:48:38<19:47:01, 1.37s/it] 16%|█▌ | 9731/61904 [4:48:39<19:36:17, 1.35s/it] 16%|█▌ | 9732/61904 [4:48:41<19:41:05, 1.36s/it] 16%|█▌ | 9733/61904 [4:48:42<20:49:19, 1.44s/it] 16%|█▌ | 9734/61904 [4:48:44<20:41:16, 1.43s/it] 16%|█▌ | 9735/61904 [4:48:45<20:17:40, 1.40s/it] 16%|█▌ | 9736/61904 [4:48:46<20:06:03, 1.39s/it] 16%|█▌ | 9737/61904 [4:48:48<20:44:02, 1.43s/it] 16%|█▌ | 9738/61904 [4:48:49<19:58:41, 1.38s/it] 16%|█▌ | 9739/61904 [4:48:50<19:27:10, 1.34s/it] 16%|█▌ | 9740/61904 [4:48:52<20:06:49, 1.39s/it] {'loss': 2.839, 'learning_rate': 1.8453908984830804e-07, 'epoch': 2.52} 16%|█▌ | 9740/61904 [4:48:52<20:06:49, 1.39s/it] 16%|█▌ | 9741/61904 [4:48:53<19:45:40, 1.36s/it] 16%|█▌ | 9742/61904 [4:48:55<19:33:33, 1.35s/it] 16%|█▌ | 9743/61904 [4:48:56<19:03:43, 1.32s/it] 16%|█▌ | 9744/61904 [4:48:57<18:53:02, 1.30s/it] 16%|█▌ | 9745/61904 [4:48:59<19:14:50, 1.33s/it] 16%|█▌ | 9746/61904 [4:49:00<19:32:37, 1.35s/it] 16%|█▌ | 9747/61904 [4:49:01<19:24:43, 1.34s/it] 16%|█▌ | 9748/61904 [4:49:03<19:39:30, 1.36s/it] 16%|█▌ | 9749/61904 [4:49:04<20:15:08, 1.40s/it] 16%|█▌ | 9750/61904 [4:49:05<19:54:39, 1.37s/it] 16%|█▌ | 9751/61904 [4:49:07<19:55:08, 1.37s/it] 16%|█▌ | 9752/61904 [4:49:08<20:04:16, 1.39s/it] 16%|█▌ | 9753/61904 [4:49:10<19:55:22, 1.38s/it] 16%|█▌ | 9754/61904 [4:49:11<19:42:25, 1.36s/it] 16%|█▌ | 9755/61904 [4:49:12<19:43:10, 1.36s/it] 16%|█▌ | 9756/61904 [4:49:14<19:31:23, 1.35s/it] 16%|█▌ | 9757/61904 [4:49:15<19:07:09, 1.32s/it] 16%|█▌ | 9758/61904 [4:49:16<19:38:08, 1.36s/it] 16%|█▌ | 9759/61904 [4:49:18<19:30:39, 1.35s/it] 16%|█▌ | 9760/61904 [4:49:19<18:59:00, 1.31s/it] {'loss': 2.8364, 'learning_rate': 1.845066770387657e-07, 'epoch': 2.52} 16%|█▌ | 9760/61904 [4:49:19<18:59:00, 1.31s/it] 16%|█▌ | 9761/61904 [4:49:20<19:06:29, 1.32s/it] 16%|█▌ | 9762/61904 [4:49:21<18:51:28, 1.30s/it] 16%|█▌ | 9763/61904 [4:49:23<19:06:36, 1.32s/it] 16%|█▌ | 9764/61904 [4:49:24<18:53:26, 
1.30s/it] 16%|█▌ | 9765/61904 [4:49:25<19:31:15, 1.35s/it] 16%|█▌ | 9766/61904 [4:49:27<19:27:24, 1.34s/it] 16%|█▌ | 9767/61904 [4:49:28<19:43:11, 1.36s/it] 16%|█▌ | 9768/61904 [4:49:30<19:55:45, 1.38s/it] 16%|█▌ | 9769/61904 [4:49:31<19:57:03, 1.38s/it] 16%|█▌ | 9770/61904 [4:49:32<19:44:55, 1.36s/it] 16%|█▌ | 9771/61904 [4:49:34<19:49:31, 1.37s/it] 16%|█▌ | 9772/61904 [4:49:35<19:47:46, 1.37s/it] 16%|█▌ | 9773/61904 [4:49:37<19:59:35, 1.38s/it] 16%|█▌ | 9774/61904 [4:49:38<19:47:47, 1.37s/it] 16%|█▌ | 9775/61904 [4:49:39<19:10:24, 1.32s/it] 16%|█▌ | 9776/61904 [4:49:40<18:51:16, 1.30s/it] 16%|█▌ | 9777/61904 [4:49:42<19:18:17, 1.33s/it] 16%|█▌ | 9778/61904 [4:49:43<19:58:29, 1.38s/it] 16%|█▌ | 9779/61904 [4:49:45<20:09:39, 1.39s/it] 16%|█▌ | 9780/61904 [4:49:46<19:58:12, 1.38s/it] {'loss': 2.8632, 'learning_rate': 1.844742642292234e-07, 'epoch': 2.53} 16%|█▌ | 9780/61904 [4:49:46<19:58:12, 1.38s/it] 16%|█▌ | 9781/61904 [4:49:47<19:46:52, 1.37s/it] 16%|█▌ | 9782/61904 [4:49:49<19:40:12, 1.36s/it] 16%|█▌ | 9783/61904 [4:49:50<19:08:20, 1.32s/it] 16%|█▌ | 9784/61904 [4:49:51<19:59:01, 1.38s/it] 16%|█▌ | 9785/61904 [4:49:53<19:38:51, 1.36s/it] 16%|█▌ | 9786/61904 [4:49:54<19:34:21, 1.35s/it] 16%|█▌ | 9787/61904 [4:49:55<19:53:09, 1.37s/it] 16%|█▌ | 9788/61904 [4:49:57<19:40:13, 1.36s/it] 16%|█▌ | 9789/61904 [4:49:58<19:40:07, 1.36s/it] 16%|█▌ | 9790/61904 [4:49:59<19:15:59, 1.33s/it] 16%|█▌ | 9791/61904 [4:50:01<19:25:32, 1.34s/it] 16%|█▌ | 9792/61904 [4:50:02<19:05:25, 1.32s/it] 16%|█▌ | 9793/61904 [4:50:03<19:30:23, 1.35s/it] 16%|█▌ | 9794/61904 [4:50:05<20:11:03, 1.39s/it] 16%|█▌ | 9795/61904 [4:50:06<19:38:11, 1.36s/it] 16%|█▌ | 9796/61904 [4:50:08<20:10:56, 1.39s/it] 16%|█▌ | 9797/61904 [4:50:09<19:49:48, 1.37s/it] 16%|█▌ | 9798/61904 [4:50:10<20:05:20, 1.39s/it] 16%|█▌ | 9799/61904 [4:50:12<19:45:48, 1.37s/it] 16%|█▌ | 9800/61904 [4:50:13<19:56:49, 1.38s/it] {'loss': 2.8573, 'learning_rate': 1.8444185141968105e-07, 'epoch': 2.53} 16%|█▌ | 9800/61904 
[4:50:13<19:56:49, 1.38s/it] 16%|█▌ | 9801/61904 [4:50:15<20:08:50, 1.39s/it] 16%|█▌ | 9802/61904 [4:50:16<20:28:58, 1.42s/it] 16%|█▌ | 9803/61904 [4:50:17<20:05:38, 1.39s/it] 16%|█▌ | 9804/61904 [4:50:19<20:14:45, 1.40s/it] 16%|█▌ | 9805/61904 [4:50:20<19:46:22, 1.37s/it] 16%|█▌ | 9806/61904 [4:50:22<21:12:56, 1.47s/it] 16%|█▌ | 9807/61904 [4:50:23<21:14:04, 1.47s/it] 16%|█▌ | 9808/61904 [4:50:25<21:05:03, 1.46s/it] 16%|█▌ | 9809/61904 [4:50:26<20:29:38, 1.42s/it] 16%|█▌ | 9810/61904 [4:50:27<19:35:04, 1.35s/it] 16%|█▌ | 9811/61904 [4:50:29<19:42:57, 1.36s/it] 16%|█▌ | 9812/61904 [4:50:30<21:17:43, 1.47s/it] 16%|█▌ | 9813/61904 [4:50:32<20:23:42, 1.41s/it] 16%|█▌ | 9814/61904 [4:50:33<19:49:49, 1.37s/it] 16%|█▌ | 9815/61904 [4:50:34<19:40:35, 1.36s/it] 16%|█▌ | 9816/61904 [4:50:36<20:17:46, 1.40s/it] 16%|█▌ | 9817/61904 [4:50:37<20:05:06, 1.39s/it] 16%|█▌ | 9818/61904 [4:50:38<19:45:13, 1.37s/it] 16%|█▌ | 9819/61904 [4:50:40<20:29:50, 1.42s/it] 16%|█▌ | 9820/61904 [4:50:41<20:59:55, 1.45s/it] {'loss': 2.869, 'learning_rate': 1.844094386101387e-07, 'epoch': 2.54} 16%|█▌ | 9820/61904 [4:50:41<20:59:55, 1.45s/it] 16%|█▌ | 9821/61904 [4:50:43<20:34:10, 1.42s/it] 16%|█▌ | 9822/61904 [4:50:44<20:37:52, 1.43s/it] 16%|█▌ | 9823/61904 [4:50:46<20:39:06, 1.43s/it] 16%|█▌ | 9824/61904 [4:50:47<20:13:14, 1.40s/it] 16%|█▌ | 9825/61904 [4:50:48<20:11:59, 1.40s/it] 16%|█▌ | 9826/61904 [4:50:50<19:53:08, 1.37s/it] 16%|█▌ | 9827/61904 [4:50:51<19:37:47, 1.36s/it] 16%|█▌ | 9828/61904 [4:50:52<19:56:00, 1.38s/it] 16%|█▌ | 9829/61904 [4:50:54<19:38:03, 1.36s/it] 16%|█▌ | 9830/61904 [4:50:55<19:45:47, 1.37s/it] 16%|█▌ | 9831/61904 [4:50:57<19:45:45, 1.37s/it] 16%|█▌ | 9832/61904 [4:50:58<19:55:41, 1.38s/it] 16%|█▌ | 9833/61904 [4:51:00<20:38:05, 1.43s/it] 16%|█▌ | 9834/61904 [4:51:01<20:10:04, 1.39s/it] 16%|█▌ | 9835/61904 [4:51:02<19:48:39, 1.37s/it] 16%|█▌ | 9836/61904 [4:51:04<20:41:21, 1.43s/it] 16%|█▌ | 9837/61904 [4:51:05<19:49:06, 1.37s/it] 16%|█▌ | 9838/61904 
[4:51:06<19:59:18, 1.38s/it] 16%|█▌ | 9839/61904 [4:51:08<20:02:27, 1.39s/it] 16%|█▌ | 9840/61904 [4:51:09<19:21:01, 1.34s/it] {'loss': 2.8879, 'learning_rate': 1.843770258005964e-07, 'epoch': 2.54} 16%|█▌ | 9840/61904 [4:51:09<19:21:01, 1.34s/it] 16%|█▌ | 9841/61904 [4:51:10<18:47:35, 1.30s/it] 16%|█▌ | 9842/61904 [4:51:12<18:56:50, 1.31s/it] 16%|█▌ | 9843/61904 [4:51:13<19:10:27, 1.33s/it] 16%|█▌ | 9844/61904 [4:51:14<18:53:16, 1.31s/it] 16%|█▌ | 9845/61904 [4:51:16<20:44:40, 1.43s/it] 16%|█▌ | 9846/61904 [4:51:17<20:11:27, 1.40s/it] 16%|█▌ | 9847/61904 [4:51:18<19:43:35, 1.36s/it] 16%|█▌ | 9848/61904 [4:51:20<20:00:01, 1.38s/it] 16%|█▌ | 9849/61904 [4:51:21<19:42:13, 1.36s/it] 16%|█▌ | 9850/61904 [4:51:23<19:43:22, 1.36s/it] 16%|█▌ | 9851/61904 [4:51:24<19:58:32, 1.38s/it] 16%|█▌ | 9852/61904 [4:51:25<19:33:52, 1.35s/it] 16%|█▌ | 9853/61904 [4:51:27<19:35:15, 1.35s/it] 16%|█▌ | 9854/61904 [4:51:28<19:30:38, 1.35s/it] 16%|█▌ | 9855/61904 [4:51:29<19:10:25, 1.33s/it] 16%|█▌ | 9856/61904 [4:51:31<19:08:04, 1.32s/it] 16%|█▌ | 9857/61904 [4:51:32<19:29:38, 1.35s/it] 16%|█▌ | 9858/61904 [4:51:33<19:17:12, 1.33s/it] 16%|█▌ | 9859/61904 [4:51:35<19:49:14, 1.37s/it] 16%|█▌ | 9860/61904 [4:51:36<19:37:46, 1.36s/it] {'loss': 2.8887, 'learning_rate': 1.8434461299105406e-07, 'epoch': 2.55} 16%|█▌ | 9860/61904 [4:51:36<19:37:46, 1.36s/it] 16%|█▌ | 9861/61904 [4:51:37<19:26:03, 1.34s/it] 16%|█▌ | 9862/61904 [4:51:39<19:00:47, 1.32s/it] 16%|█▌ | 9863/61904 [4:51:40<18:36:32, 1.29s/it] 16%|█▌ | 9864/61904 [4:51:41<19:00:22, 1.31s/it] 16%|█▌ | 9865/61904 [4:51:43<19:22:32, 1.34s/it] 16%|█▌ | 9866/61904 [4:51:44<18:34:00, 1.28s/it] 16%|█▌ | 9867/61904 [4:51:45<19:24:37, 1.34s/it] 16%|█▌ | 9868/61904 [4:51:47<19:42:54, 1.36s/it] 16%|█▌ | 9869/61904 [4:51:48<19:48:28, 1.37s/it] 16%|█▌ | 9870/61904 [4:51:50<20:27:10, 1.42s/it] 16%|█▌ | 9871/61904 [4:51:51<19:51:39, 1.37s/it] 16%|█▌ | 9872/61904 [4:51:52<19:28:02, 1.35s/it] 16%|█▌ | 9873/61904 [4:51:54<20:10:05, 1.40s/it] 16%|█▌ | 
9874/61904 [4:51:55<19:53:24, 1.38s/it] 16%|█▌ | 9875/61904 [4:51:56<20:07:37, 1.39s/it] 16%|█▌ | 9876/61904 [4:51:58<19:53:32, 1.38s/it] 16%|█▌ | 9877/61904 [4:51:59<20:00:00, 1.38s/it] 16%|█▌ | 9878/61904 [4:52:01<20:15:37, 1.40s/it] 16%|█▌ | 9879/61904 [4:52:02<19:26:45, 1.35s/it] 16%|█▌ | 9880/61904 [4:52:03<18:59:49, 1.31s/it] {'loss': 2.817, 'learning_rate': 1.8431220018151172e-07, 'epoch': 2.55} 16%|█▌ | 9880/61904 [4:52:03<18:59:49, 1.31s/it] 16%|█▌ | 9881/61904 [4:52:04<18:49:35, 1.30s/it] 16%|█▌ | 9882/61904 [4:52:06<18:41:28, 1.29s/it] 16%|█▌ | 9883/61904 [4:52:07<19:01:05, 1.32s/it] 16%|█▌ | 9884/61904 [4:52:08<19:37:32, 1.36s/it] 16%|█▌ | 9885/61904 [4:52:10<19:49:43, 1.37s/it] 16%|█▌ | 9886/61904 [4:52:11<19:03:43, 1.32s/it] 16%|█▌ | 9887/61904 [4:52:12<19:26:05, 1.35s/it] 16%|█▌ | 9888/61904 [4:52:14<19:51:12, 1.37s/it] 16%|█▌ | 9889/61904 [4:52:15<19:23:16, 1.34s/it] 16%|█▌ | 9890/61904 [4:52:16<19:21:28, 1.34s/it] 16%|█▌ | 9891/61904 [4:52:18<19:33:33, 1.35s/it] 16%|█▌ | 9892/61904 [4:52:19<20:10:50, 1.40s/it] 16%|█▌ | 9893/61904 [4:52:21<20:15:27, 1.40s/it] 16%|█▌ | 9894/61904 [4:52:22<20:01:30, 1.39s/it] 16%|█▌ | 9895/61904 [4:52:23<19:30:48, 1.35s/it] 16%|█▌ | 9896/61904 [4:52:25<19:19:34, 1.34s/it] 16%|█▌ | 9897/61904 [4:52:26<19:27:06, 1.35s/it] 16%|█▌ | 9898/61904 [4:52:27<19:48:15, 1.37s/it] 16%|█▌ | 9899/61904 [4:52:29<19:42:41, 1.36s/it] 16%|█▌ | 9900/61904 [4:52:30<19:33:03, 1.35s/it] {'loss': 2.8758, 'learning_rate': 1.842797873719694e-07, 'epoch': 2.56} 16%|█▌ | 9900/61904 [4:52:30<19:33:03, 1.35s/it] 16%|█▌ | 9901/61904 [4:52:31<18:56:51, 1.31s/it] 16%|█▌ | 9902/61904 [4:52:33<19:25:04, 1.34s/it] 16%|█▌ | 9903/61904 [4:52:34<19:09:45, 1.33s/it] 16%|█▌ | 9904/61904 [4:52:35<19:15:00, 1.33s/it] 16%|█▌ | 9905/61904 [4:52:37<19:25:33, 1.34s/it] 16%|█▌ | 9906/61904 [4:52:38<19:08:02, 1.32s/it] 16%|█▌ | 9907/61904 [4:52:39<18:58:02, 1.31s/it] 16%|█▌ | 9908/61904 [4:52:41<19:05:59, 1.32s/it] 16%|█▌ | 9909/61904 [4:52:42<19:20:52, 1.34s/it] 
16%|█▌ | 9910/61904 [4:52:44<19:59:35, 1.38s/it] 16%|█▌ | 9911/61904 [4:52:45<19:52:00, 1.38s/it] 16%|█▌ | 9912/61904 [4:52:46<19:41:59, 1.36s/it] 16%|█▌ | 9913/61904 [4:52:47<19:07:01, 1.32s/it] 16%|█▌ | 9914/61904 [4:52:49<19:02:47, 1.32s/it] 16%|█▌ | 9915/61904 [4:52:50<19:04:44, 1.32s/it] 16%|█▌ | 9916/61904 [4:52:51<19:03:53, 1.32s/it] 16%|█▌ | 9917/61904 [4:52:53<19:40:59, 1.36s/it] 16%|█▌ | 9918/61904 [4:52:54<19:59:15, 1.38s/it] 16%|█▌ | 9919/61904 [4:52:56<19:35:55, 1.36s/it] 16%|█▌ | 9920/61904 [4:52:57<19:27:56, 1.35s/it] {'loss': 2.8466, 'learning_rate': 1.8424737456242705e-07, 'epoch': 2.56} 16%|█▌ | 9920/61904 [4:52:57<19:27:56, 1.35s/it] 16%|█▌ | 9921/61904 [4:52:58<18:33:52, 1.29s/it] 16%|█▌ | 9922/61904 [4:52:59<18:58:32, 1.31s/it] 16%|█▌ | 9923/61904 [4:53:01<19:15:57, 1.33s/it] 16%|█▌ | 9924/61904 [4:53:02<18:57:49, 1.31s/it] 16%|█▌ | 9925/61904 [4:53:03<19:00:38, 1.32s/it] 16%|█▌ | 9926/61904 [4:53:05<18:55:45, 1.31s/it] 16%|█▌ | 9927/61904 [4:53:06<19:01:02, 1.32s/it] 16%|█▌ | 9928/61904 [4:53:07<18:54:42, 1.31s/it] 16%|█▌ | 9929/61904 [4:53:09<18:57:58, 1.31s/it] 16%|█▌ | 9930/61904 [4:53:10<19:14:06, 1.33s/it] 16%|█▌ | 9931/61904 [4:53:11<19:05:05, 1.32s/it] 16%|█▌ | 9932/61904 [4:53:13<19:19:46, 1.34s/it] 16%|█▌ | 9933/61904 [4:53:14<19:23:37, 1.34s/it] 16%|█▌ | 9934/61904 [4:53:16<20:25:17, 1.41s/it] 16%|█▌ | 9935/61904 [4:53:17<20:21:55, 1.41s/it] 16%|█▌ | 9936/61904 [4:53:18<19:52:54, 1.38s/it] 16%|█▌ | 9937/61904 [4:53:20<20:08:10, 1.39s/it] 16%|█▌ | 9938/61904 [4:53:21<19:43:55, 1.37s/it] 16%|█▌ | 9939/61904 [4:53:23<20:23:07, 1.41s/it] 16%|█▌ | 9940/61904 [4:53:24<20:02:32, 1.39s/it] {'loss': 2.8203, 'learning_rate': 1.8421496175288474e-07, 'epoch': 2.57} 16%|█▌ | 9940/61904 [4:53:24<20:02:32, 1.39s/it] 16%|█▌ | 9941/61904 [4:53:25<20:07:07, 1.39s/it] 16%|█▌ | 9942/61904 [4:53:27<20:08:18, 1.40s/it] 16%|█▌ | 9943/61904 [4:53:28<19:42:06, 1.36s/it] 16%|█▌ | 9944/61904 [4:53:29<19:14:38, 1.33s/it] 16%|█▌ | 9945/61904 [4:53:31<19:42:37, 
1.37s/it] 16%|█▌ | 9946/61904 [4:53:32<19:59:39, 1.39s/it] 16%|█▌ | 9947/61904 [4:53:34<20:00:31, 1.39s/it] 16%|█▌ | 9948/61904 [4:53:35<19:49:09, 1.37s/it] 16%|█▌ | 9949/61904 [4:53:36<19:22:05, 1.34s/it] 16%|█▌ | 9950/61904 [4:53:38<19:41:51, 1.36s/it] 16%|█▌ | 9951/61904 [4:53:39<19:17:35, 1.34s/it] 16%|█▌ | 9952/61904 [4:53:40<19:25:31, 1.35s/it] 16%|█▌ | 9953/61904 [4:53:42<19:17:55, 1.34s/it] 16%|█▌ | 9954/61904 [4:53:43<19:14:38, 1.33s/it] 16%|█▌ | 9955/61904 [4:53:44<19:18:04, 1.34s/it] 16%|█▌ | 9956/61904 [4:53:46<19:10:43, 1.33s/it] 16%|█▌ | 9957/61904 [4:53:47<19:40:06, 1.36s/it] 16%|█▌ | 9958/61904 [4:53:48<19:19:27, 1.34s/it] 16%|█▌ | 9959/61904 [4:53:50<19:06:31, 1.32s/it] 16%|█▌ | 9960/61904 [4:53:51<19:37:03, 1.36s/it] {'loss': 2.8138, 'learning_rate': 1.841825489433424e-07, 'epoch': 2.57} 16%|█▌ | 9960/61904 [4:53:51<19:37:03, 1.36s/it] 16%|█▌ | 9961/61904 [4:53:52<19:33:36, 1.36s/it] 16%|█▌ | 9962/61904 [4:53:54<19:37:30, 1.36s/it] 16%|█▌ | 9963/61904 [4:53:55<19:42:13, 1.37s/it] 16%|█▌ | 9964/61904 [4:53:56<19:28:44, 1.35s/it] 16%|█▌ | 9965/61904 [4:53:58<20:21:50, 1.41s/it] 16%|█▌ | 9966/61904 [4:53:59<20:32:52, 1.42s/it] 16%|█▌ | 9967/61904 [4:54:01<20:25:04, 1.42s/it] 16%|█▌ | 9968/61904 [4:54:02<19:50:21, 1.38s/it] 16%|█▌ | 9969/61904 [4:54:04<19:54:28, 1.38s/it] 16%|█▌ | 9970/61904 [4:54:05<19:19:04, 1.34s/it] 16%|█▌ | 9971/61904 [4:54:06<18:59:58, 1.32s/it] 16%|█▌ | 9972/61904 [4:54:07<19:01:03, 1.32s/it] 16%|█▌ | 9973/61904 [4:54:09<20:07:40, 1.40s/it] 16%|█▌ | 9974/61904 [4:54:10<19:46:07, 1.37s/it] 16%|█▌ | 9975/61904 [4:54:12<19:39:39, 1.36s/it] 16%|█▌ | 9976/61904 [4:54:13<19:28:59, 1.35s/it] 16%|█▌ | 9977/61904 [4:54:14<19:21:55, 1.34s/it] 16%|█▌ | 9978/61904 [4:54:16<19:42:38, 1.37s/it] 16%|█▌ | 9979/61904 [4:54:17<20:19:09, 1.41s/it] 16%|█▌ | 9980/61904 [4:54:19<21:13:03, 1.47s/it] {'loss': 2.8538, 'learning_rate': 1.8415013613380006e-07, 'epoch': 2.58} 16%|█▌ | 9980/61904 [4:54:19<21:13:03, 1.47s/it] 16%|█▌ | 9981/61904 
[4:54:20<20:43:30, 1.44s/it] 16%|█▌ | 9982/61904 [4:54:22<21:17:53, 1.48s/it] 16%|█▌ | 9983/61904 [4:54:23<20:03:58, 1.39s/it] 16%|█▌ | 9984/61904 [4:54:24<20:03:51, 1.39s/it] 16%|█▌ | 9985/61904 [4:54:26<20:17:23, 1.41s/it] 16%|█▌ | 9986/61904 [4:54:27<20:19:53, 1.41s/it] 16%|█▌ | 9987/61904 [4:54:29<20:13:21, 1.40s/it] 16%|█▌ | 9988/61904 [4:54:30<19:58:20, 1.38s/it] 16%|█▌ | 9989/61904 [4:54:31<20:14:38, 1.40s/it] 16%|█▌ | 9990/61904 [4:54:33<19:39:53, 1.36s/it] 16%|█▌ | 9991/61904 [4:54:34<19:46:06, 1.37s/it] 16%|█▌ | 9992/61904 [4:54:35<20:05:11, 1.39s/it] 16%|█▌ | 9993/61904 [4:54:37<20:31:17, 1.42s/it] 16%|█▌ | 9994/61904 [4:54:38<20:09:48, 1.40s/it] 16%|█▌ | 9995/61904 [4:54:40<20:33:01, 1.43s/it] 16%|█▌ | 9996/61904 [4:54:41<20:19:21, 1.41s/it] 16%|█▌ | 9997/61904 [4:54:43<20:56:30, 1.45s/it] 16%|█▌ | 9998/61904 [4:54:44<20:18:52, 1.41s/it] 16%|█▌ | 9999/61904 [4:54:45<20:13:51, 1.40s/it] 16%|█▌ | 10000/61904 [4:54:47<20:09:23, 1.40s/it] {'loss': 2.779, 'learning_rate': 1.8411772332425775e-07, 'epoch': 2.58} 16%|█▌ | 10000/61904 [4:54:47<20:09:23, 1.40s/it] 16%|█▌ | 10001/61904 [4:54:48<19:55:26, 1.38s/it] 16%|█▌ | 10002/61904 [4:54:49<19:49:16, 1.37s/it] 16%|█▌ | 10003/61904 [4:54:51<18:56:52, 1.31s/it] 16%|█▌ | 10004/61904 [4:54:52<18:52:15, 1.31s/it] 16%|█▌ | 10005/61904 [4:54:53<19:32:44, 1.36s/it] 16%|█▌ | 10006/61904 [4:54:55<19:33:36, 1.36s/it] 16%|█▌ | 10007/61904 [4:54:56<20:04:12, 1.39s/it] 16%|█▌ | 10008/61904 [4:54:58<20:09:21, 1.40s/it] 16%|█▌ | 10009/61904 [4:54:59<20:46:46, 1.44s/it] 16%|█▌ | 10010/61904 [4:55:01<20:21:48, 1.41s/it] 16%|█▌ | 10011/61904 [4:55:02<20:24:27, 1.42s/it] 16%|█▌ | 10012/61904 [4:55:03<20:42:09, 1.44s/it] 16%|█▌ | 10013/61904 [4:55:05<20:22:17, 1.41s/it] 16%|█▌ | 10014/61904 [4:55:06<20:09:15, 1.40s/it] 16%|█▌ | 10015/61904 [4:55:08<20:42:30, 1.44s/it] 16%|█▌ | 10016/61904 [4:55:09<20:56:13, 1.45s/it] 16%|█▌ | 10017/61904 [4:55:10<19:57:51, 1.39s/it] 16%|█▌ | 10018/61904 [4:55:12<19:47:47, 1.37s/it] 16%|█▌ | 
10019/61904 [4:55:13<20:12:25, 1.40s/it] 16%|█▌ | 10020/61904 [4:55:15<20:02:26, 1.39s/it] {'loss': 2.8533, 'learning_rate': 1.840853105147154e-07, 'epoch': 2.59} 16%|█▌ | 10020/61904 [4:55:15<20:02:26, 1.39s/it] 16%|█▌ | 10021/61904 [4:55:16<20:20:33, 1.41s/it] 16%|█▌ | 10022/61904 [4:55:17<20:14:20, 1.40s/it] 16%|█▌ | 10023/61904 [4:55:19<19:58:14, 1.39s/it] 16%|█▌ | 10024/61904 [4:55:20<20:10:59, 1.40s/it] 16%|█▌ | 10025/61904 [4:55:22<19:58:18, 1.39s/it] 16%|█▌ | 10026/61904 [4:55:23<19:34:06, 1.36s/it] 16%|█▌ | 10027/61904 [4:55:24<19:09:40, 1.33s/it] 16%|█▌ | 10028/61904 [4:55:26<19:49:09, 1.38s/it] 16%|█▌ | 10029/61904 [4:55:27<19:42:52, 1.37s/it] 16%|█▌ | 10030/61904 [4:55:28<19:44:43, 1.37s/it] 16%|█▌ | 10031/61904 [4:55:30<19:40:10, 1.37s/it] 16%|█▌ | 10032/61904 [4:55:31<19:29:12, 1.35s/it] 16%|█▌ | 10033/61904 [4:55:33<20:11:55, 1.40s/it] 16%|█▌ | 10034/61904 [4:55:34<19:59:57, 1.39s/it] 16%|█▌ | 10035/61904 [4:55:35<19:28:31, 1.35s/it] 16%|█▌ | 10036/61904 [4:55:37<19:34:12, 1.36s/it] 16%|█▌ | 10037/61904 [4:55:38<19:31:52, 1.36s/it] 16%|█▌ | 10038/61904 [4:55:39<19:15:19, 1.34s/it] 16%|█▌ | 10039/61904 [4:55:41<19:40:58, 1.37s/it] 16%|█▌ | 10040/61904 [4:55:42<20:08:15, 1.40s/it] {'loss': 2.8389, 'learning_rate': 1.8405289770517307e-07, 'epoch': 2.59} 16%|█▌ | 10040/61904 [4:55:42<20:08:15, 1.40s/it] 16%|█▌ | 10041/61904 [4:55:44<20:46:20, 1.44s/it] 16%|█▌ | 10042/61904 [4:55:45<20:32:40, 1.43s/it] 16%|█▌ | 10043/61904 [4:55:46<20:34:37, 1.43s/it] 16%|█▌ | 10044/61904 [4:55:48<20:01:34, 1.39s/it] 16%|█▌ | 10045/61904 [4:55:49<20:13:34, 1.40s/it] 16%|█▌ | 10046/61904 [4:55:50<19:41:48, 1.37s/it] 16%|█▌ | 10047/61904 [4:55:52<18:56:21, 1.31s/it] 16%|█▌ | 10048/61904 [4:55:53<19:21:22, 1.34s/it] 16%|█▌ | 10049/61904 [4:55:54<19:22:48, 1.35s/it] 16%|█▌ | 10050/61904 [4:55:56<19:19:07, 1.34s/it] 16%|█▌ | 10051/61904 [4:55:57<19:55:09, 1.38s/it] 16%|█▌ | 10052/61904 [4:55:59<19:46:01, 1.37s/it] 16%|█▌ | 10053/61904 [4:56:00<19:54:07, 1.38s/it] 16%|█▌ | 
10054/61904 [4:56:01<19:58:52, 1.39s/it] 16%|█▌ | 10055/61904 [4:56:03<20:20:43, 1.41s/it] 16%|█▌ | 10056/61904 [4:56:04<19:52:15, 1.38s/it] 16%|█▌ | 10057/61904 [4:56:05<19:31:35, 1.36s/it] 16%|█▌ | 10058/61904 [4:56:07<19:41:31, 1.37s/it] 16%|█▌ | 10059/61904 [4:56:08<19:48:36, 1.38s/it] 16%|█▋ | 10060/61904 [4:56:09<19:20:57, 1.34s/it] {'loss': 2.9132, 'learning_rate': 1.8402048489563076e-07, 'epoch': 2.6} 16%|█▋ | 10060/61904 [4:56:09<19:20:57, 1.34s/it] 16%|█▋ | 10061/61904 [4:56:11<19:21:58, 1.34s/it] 16%|█▋ | 10062/61904 [4:56:12<19:23:43, 1.35s/it] 16%|█▋ | 10063/61904 [4:56:14<19:51:56, 1.38s/it] 16%|█▋ | 10064/61904 [4:56:15<19:37:41, 1.36s/it] 16%|█▋ | 10065/61904 [4:56:16<19:23:28, 1.35s/it] 16%|█▋ | 10066/61904 [4:56:18<20:08:30, 1.40s/it] 16%|█▋ | 10067/61904 [4:56:19<20:35:07, 1.43s/it] 16%|█▋ | 10068/61904 [4:56:21<20:25:57, 1.42s/it] 16%|█▋ | 10069/61904 [4:56:22<20:04:09, 1.39s/it] 16%|█▋ | 10070/61904 [4:56:23<19:56:02, 1.38s/it] 16%|█▋ | 10071/61904 [4:56:25<20:17:31, 1.41s/it] 16%|█▋ | 10072/61904 [4:56:26<19:25:52, 1.35s/it] 16%|█▋ | 10073/61904 [4:56:27<19:15:36, 1.34s/it] 16%|█▋ | 10074/61904 [4:56:29<19:15:18, 1.34s/it] 16%|█▋ | 10075/61904 [4:56:30<18:53:49, 1.31s/it] 16%|█▋ | 10076/61904 [4:56:31<19:08:22, 1.33s/it] 16%|█▋ | 10077/61904 [4:56:33<19:17:35, 1.34s/it] 16%|█▋ | 10078/61904 [4:56:34<19:00:44, 1.32s/it] 16%|█▋ | 10079/61904 [4:56:35<19:12:14, 1.33s/it] 16%|█▋ | 10080/61904 [4:56:37<21:18:27, 1.48s/it] {'loss': 2.8213, 'learning_rate': 1.8398807208608842e-07, 'epoch': 2.6} 16%|█▋ | 10080/61904 [4:56:37<21:18:27, 1.48s/it] 16%|█▋ | 10081/61904 [4:56:38<20:21:42, 1.41s/it] 16%|█▋ | 10082/61904 [4:56:40<20:06:05, 1.40s/it] 16%|█▋ | 10083/61904 [4:56:41<20:10:57, 1.40s/it] 16%|█▋ | 10084/61904 [4:56:43<20:17:30, 1.41s/it] 16%|█▋ | 10085/61904 [4:56:44<19:54:43, 1.38s/it] 16%|█▋ | 10086/61904 [4:56:45<19:41:53, 1.37s/it] 16%|█▋ | 10087/61904 [4:56:47<19:49:55, 1.38s/it] 16%|█▋ | 10088/61904 [4:56:48<19:55:17, 1.38s/it] 16%|█▋ | 
10089/61904 [4:56:50<20:18:46, 1.41s/it] 16%|█▋ | 10090/61904 [4:56:51<20:06:24, 1.40s/it] 16%|█▋ | 10091/61904 [4:56:52<20:49:30, 1.45s/it] 16%|█▋ | 10092/61904 [4:56:54<20:37:16, 1.43s/it] 16%|█▋ | 10093/61904 [4:56:55<20:10:35, 1.40s/it] 16%|█▋ | 10094/61904 [4:56:57<20:35:30, 1.43s/it] 16%|█▋ | 10095/61904 [4:56:58<20:13:56, 1.41s/it] 16%|█▋ | 10096/61904 [4:56:59<20:19:21, 1.41s/it] 16%|█▋ | 10097/61904 [4:57:01<20:09:33, 1.40s/it] 16%|█▋ | 10098/61904 [4:57:02<20:25:49, 1.42s/it] 16%|█▋ | 10099/61904 [4:57:04<20:40:53, 1.44s/it] 16%|█▋ | 10100/61904 [4:57:05<20:20:32, 1.41s/it] {'loss': 2.7963, 'learning_rate': 1.8395565927654608e-07, 'epoch': 2.61} 16%|█▋ | 10100/61904 [4:57:05<20:20:32, 1.41s/it] 16%|█▋ | 10101/61904 [4:57:07<20:56:48, 1.46s/it] 16%|█▋ | 10102/61904 [4:57:08<21:13:03, 1.47s/it] 16%|█▋ | 10103/61904 [4:57:10<20:56:19, 1.46s/it] 16%|█▋ | 10104/61904 [4:57:11<21:30:29, 1.49s/it] 16%|█▋ | 10105/61904 [4:57:13<21:42:04, 1.51s/it] 16%|█▋ | 10106/61904 [4:57:14<21:26:21, 1.49s/it] 16%|█▋ | 10107/61904 [4:57:15<20:34:27, 1.43s/it] 16%|█▋ | 10108/61904 [4:57:17<20:13:27, 1.41s/it] 16%|█▋ | 10109/61904 [4:57:18<20:29:43, 1.42s/it] 16%|█▋ | 10110/61904 [4:57:20<20:04:36, 1.40s/it] 16%|█▋ | 10111/61904 [4:57:21<19:44:20, 1.37s/it] 16%|█▋ | 10112/61904 [4:57:22<19:33:42, 1.36s/it] 16%|█▋ | 10113/61904 [4:57:24<19:10:53, 1.33s/it] 16%|█▋ | 10114/61904 [4:57:25<19:19:17, 1.34s/it] 16%|█▋ | 10115/61904 [4:57:26<19:24:44, 1.35s/it] 16%|█▋ | 10116/61904 [4:57:28<19:21:03, 1.35s/it] 16%|█▋ | 10117/61904 [4:57:29<18:50:32, 1.31s/it] 16%|█▋ | 10118/61904 [4:57:30<18:59:03, 1.32s/it] 16%|█▋ | 10119/61904 [4:57:32<20:00:05, 1.39s/it] 16%|█▋ | 10120/61904 [4:57:33<19:35:53, 1.36s/it] {'loss': 2.7789, 'learning_rate': 1.8392324646700374e-07, 'epoch': 2.62} 16%|█▋ | 10120/61904 [4:57:33<19:35:53, 1.36s/it] 16%|█▋ | 10121/61904 [4:57:34<19:22:00, 1.35s/it] 16%|█▋ | 10122/61904 [4:57:36<19:38:32, 1.37s/it] 16%|█▋ | 10123/61904 [4:57:37<18:56:31, 1.32s/it] 16%|█▋ | 
10124/61904 [4:57:38<19:29:41, 1.36s/it] 16%|█▋ | 10125/61904 [4:57:40<20:41:15, 1.44s/it] 16%|█▋ | 10126/61904 [4:57:41<20:21:16, 1.42s/it] 16%|█▋ | 10127/61904 [4:57:43<20:16:23, 1.41s/it] 16%|█▋ | 10128/61904 [4:57:44<19:27:10, 1.35s/it] 16%|█▋ | 10129/61904 [4:57:46<20:07:57, 1.40s/it] 16%|█▋ | 10130/61904 [4:57:47<19:21:19, 1.35s/it] 16%|█▋ | 10131/61904 [4:57:48<19:37:01, 1.36s/it] 16%|█▋ | 10132/61904 [4:57:50<19:48:10, 1.38s/it] 16%|█▋ | 10133/61904 [4:57:51<19:37:17, 1.36s/it] 16%|█▋ | 10134/61904 [4:57:52<19:30:32, 1.36s/it] 16%|█▋ | 10135/61904 [4:57:53<19:01:15, 1.32s/it] 16%|█▋ | 10136/61904 [4:57:55<19:05:27, 1.33s/it] 16%|█▋ | 10137/61904 [4:57:56<18:48:56, 1.31s/it] 16%|█▋ | 10138/61904 [4:57:57<18:56:47, 1.32s/it] 16%|█▋ | 10139/61904 [4:57:59<19:31:04, 1.36s/it] 16%|█▋ | 10140/61904 [4:58:00<19:22:49, 1.35s/it] {'loss': 2.8891, 'learning_rate': 1.838908336574614e-07, 'epoch': 2.62} 16%|█▋ | 10140/61904 [4:58:00<19:22:49, 1.35s/it] 16%|█▋ | 10141/61904 [4:58:01<18:58:55, 1.32s/it] 16%|█▋ | 10142/61904 [4:58:03<18:49:43, 1.31s/it] 16%|█▋ | 10143/61904 [4:58:04<18:52:24, 1.31s/it] 16%|█▋ | 10144/61904 [4:58:06<19:26:29, 1.35s/it] 16%|█▋ | 10145/61904 [4:58:07<19:26:50, 1.35s/it] 16%|█▋ | 10146/61904 [4:58:08<19:28:35, 1.35s/it] 16%|█▋ | 10147/61904 [4:58:09<19:01:06, 1.32s/it] 16%|█▋ | 10148/61904 [4:58:11<19:03:16, 1.33s/it] 16%|█▋ | 10149/61904 [4:58:12<18:34:24, 1.29s/it] 16%|█▋ | 10150/61904 [4:58:13<18:46:37, 1.31s/it] 16%|█▋ | 10151/61904 [4:58:15<19:16:29, 1.34s/it] 16%|█▋ | 10152/61904 [4:58:16<19:16:26, 1.34s/it] 16%|█▋ | 10153/61904 [4:58:18<19:48:58, 1.38s/it] 16%|█▋ | 10154/61904 [4:58:19<19:21:46, 1.35s/it] 16%|█▋ | 10155/61904 [4:58:20<19:56:09, 1.39s/it] 16%|█▋ | 10156/61904 [4:58:22<20:13:15, 1.41s/it] 16%|█▋ | 10157/61904 [4:58:23<19:42:02, 1.37s/it] 16%|█▋ | 10158/61904 [4:58:24<19:20:10, 1.35s/it] 16%|█▋ | 10159/61904 [4:58:26<19:15:44, 1.34s/it] 16%|█▋ | 10160/61904 [4:58:27<20:14:03, 1.41s/it] {'loss': 2.8305, 'learning_rate': 
1.838584208479191e-07, 'epoch': 2.63} 16%|█▋ | 10160/61904 [4:58:27<20:14:03, 1.41s/it] 16%|█▋ | 10161/61904 [4:58:29<19:44:20, 1.37s/it] 16%|█▋ | 10162/61904 [4:58:30<19:10:44, 1.33s/it] 16%|█▋ | 10163/61904 [4:58:31<19:15:32, 1.34s/it] 16%|█▋ | 10164/61904 [4:58:32<18:55:31, 1.32s/it] 16%|█▋ | 10165/61904 [4:58:34<19:10:09, 1.33s/it] 16%|█▋ | 10166/61904 [4:58:35<19:43:19, 1.37s/it] 16%|█▋ | 10167/61904 [4:58:37<19:46:46, 1.38s/it] 16%|█▋ | 10168/61904 [4:58:38<19:38:41, 1.37s/it] 16%|█▋ | 10169/61904 [4:58:39<19:09:23, 1.33s/it] 16%|█▋ | 10170/61904 [4:58:40<18:50:31, 1.31s/it] 16%|█▋ | 10171/61904 [4:58:42<19:16:45, 1.34s/it] 16%|█▋ | 10172/61904 [4:58:43<19:06:44, 1.33s/it] 16%|█▋ | 10173/61904 [4:58:44<18:36:48, 1.30s/it] 16%|█▋ | 10174/61904 [4:58:46<18:53:11, 1.31s/it] 16%|█▋ | 10175/61904 [4:58:47<19:07:05, 1.33s/it] 16%|█▋ | 10176/61904 [4:58:48<18:35:52, 1.29s/it] 16%|█▋ | 10177/61904 [4:58:50<20:24:44, 1.42s/it] 16%|█▋ | 10178/61904 [4:58:51<19:32:07, 1.36s/it] 16%|█▋ | 10179/61904 [4:58:53<19:42:50, 1.37s/it] 16%|█▋ | 10180/61904 [4:58:54<19:27:58, 1.35s/it] {'loss': 2.8333, 'learning_rate': 1.8382600803837676e-07, 'epoch': 2.63} 16%|█▋ | 10180/61904 [4:58:54<19:27:58, 1.35s/it] 16%|█▋ | 10181/61904 [4:58:55<19:15:27, 1.34s/it] 16%|█▋ | 10182/61904 [4:58:57<19:34:00, 1.36s/it] 16%|█▋ | 10183/61904 [4:58:58<19:40:37, 1.37s/it] 16%|█▋ | 10184/61904 [4:59:00<20:14:39, 1.41s/it] 16%|█▋ | 10185/61904 [4:59:01<20:06:34, 1.40s/it] 16%|█▋ | 10186/61904 [4:59:02<19:41:43, 1.37s/it] 16%|█▋ | 10187/61904 [4:59:04<19:02:10, 1.33s/it] 16%|█▋ | 10188/61904 [4:59:05<19:19:47, 1.35s/it] 16%|█▋ | 10189/61904 [4:59:06<19:26:31, 1.35s/it] 16%|█▋ | 10190/61904 [4:59:08<19:13:34, 1.34s/it] 16%|█▋ | 10191/61904 [4:59:09<20:06:57, 1.40s/it] 16%|█▋ | 10192/61904 [4:59:10<19:40:22, 1.37s/it] 16%|█▋ | 10193/61904 [4:59:12<18:51:42, 1.31s/it] 16%|█▋ | 10194/61904 [4:59:13<18:44:29, 1.30s/it] 16%|█▋ | 10195/61904 [4:59:14<18:49:49, 1.31s/it] 16%|█▋ | 10196/61904 
[4:59:16<20:15:06, 1.41s/it] 16%|█▋ | 10197/61904 [4:59:17<20:15:54, 1.41s/it] 16%|█▋ | 10198/61904 [4:59:19<19:58:23, 1.39s/it] 16%|█▋ | 10199/61904 [4:59:20<19:48:02, 1.38s/it] 16%|█▋ | 10200/61904 [4:59:21<19:35:46, 1.36s/it] {'loss': 2.8771, 'learning_rate': 1.8379359522883442e-07, 'epoch': 2.64} 16%|█▋ | 10200/61904 [4:59:21<19:35:46, 1.36s/it] 16%|█▋ | 10201/61904 [4:59:23<19:42:39, 1.37s/it] 16%|█▋ | 10202/61904 [4:59:24<20:00:45, 1.39s/it] 16%|█▋ | 10203/61904 [4:59:25<19:52:55, 1.38s/it] 16%|█▋ | 10204/61904 [4:59:27<20:17:56, 1.41s/it] 16%|█▋ | 10205/61904 [4:59:28<19:37:20, 1.37s/it] 16%|█▋ | 10206/61904 [4:59:30<19:28:04, 1.36s/it] 16%|█▋ | 10207/61904 [4:59:31<19:44:16, 1.37s/it] 16%|█▋ | 10208/61904 [4:59:32<19:07:48, 1.33s/it] 16%|█▋ | 10209/61904 [4:59:33<18:54:57, 1.32s/it] 16%|█▋ | 10210/61904 [4:59:35<18:41:27, 1.30s/it] 16%|█▋ | 10211/61904 [4:59:36<18:38:14, 1.30s/it] 16%|█▋ | 10212/61904 [4:59:37<18:50:39, 1.31s/it] 16%|█▋ | 10213/61904 [4:59:39<19:25:42, 1.35s/it] 16%|█▋ | 10214/61904 [4:59:40<19:31:41, 1.36s/it] 17%|█▋ | 10215/61904 [4:59:42<19:24:22, 1.35s/it] 17%|█▋ | 10216/61904 [4:59:43<19:30:35, 1.36s/it] 17%|█▋ | 10217/61904 [4:59:44<19:12:13, 1.34s/it] 17%|█▋ | 10218/61904 [4:59:46<19:06:05, 1.33s/it] 17%|█▋ | 10219/61904 [4:59:47<20:06:07, 1.40s/it] 17%|█▋ | 10220/61904 [4:59:48<19:50:33, 1.38s/it] {'loss': 2.8126, 'learning_rate': 1.837611824192921e-07, 'epoch': 2.64} 17%|█▋ | 10220/61904 [4:59:48<19:50:33, 1.38s/it] 17%|█▋ | 10221/61904 [4:59:50<19:05:09, 1.33s/it] 17%|█▋ | 10222/61904 [4:59:51<19:24:13, 1.35s/it] 17%|█▋ | 10223/61904 [4:59:52<19:31:56, 1.36s/it] 17%|█▋ | 10224/61904 [4:59:54<20:33:05, 1.43s/it] 17%|█▋ | 10225/61904 [4:59:55<20:42:25, 1.44s/it] 17%|█▋ | 10226/61904 [4:59:57<20:14:53, 1.41s/it] 17%|█▋ | 10227/61904 [4:59:58<20:12:23, 1.41s/it] 17%|█▋ | 10228/61904 [5:00:00<19:47:09, 1.38s/it] 17%|█▋ | 10229/61904 [5:00:01<19:31:16, 1.36s/it] 17%|█▋ | 10230/61904 [5:00:02<19:14:35, 1.34s/it] 17%|█▋ | 10231/61904 
[5:00:04<19:24:08, 1.35s/it] 17%|█▋ | 10232/61904 [5:00:05<19:50:12, 1.38s/it] 17%|█▋ | 10233/61904 [5:00:06<19:44:08, 1.38s/it] 17%|█▋ | 10234/61904 [5:00:08<19:18:04, 1.34s/it] 17%|█▋ | 10235/61904 [5:00:09<19:25:38, 1.35s/it] 17%|█▋ | 10236/61904 [5:00:10<19:21:47, 1.35s/it] 17%|█▋ | 10237/61904 [5:00:12<19:55:42, 1.39s/it] 17%|█▋ | 10238/61904 [5:00:13<20:14:43, 1.41s/it] 17%|█▋ | 10239/61904 [5:00:15<19:52:37, 1.39s/it] 17%|█▋ | 10240/61904 [5:00:16<20:10:24, 1.41s/it] {'loss': 2.8727, 'learning_rate': 1.8372876960974977e-07, 'epoch': 2.65} 17%|█▋ | 10240/61904 [5:00:16<20:10:24, 1.41s/it] 17%|█▋ | 10241/61904 [5:00:17<20:10:36, 1.41s/it] 17%|█▋ | 10242/61904 [5:00:19<19:51:03, 1.38s/it] 17%|█▋ | 10243/61904 [5:00:20<19:41:15, 1.37s/it] 17%|█▋ | 10244/61904 [5:00:21<19:36:28, 1.37s/it] 17%|█▋ | 10245/61904 [5:00:23<19:52:18, 1.38s/it] 17%|█▋ | 10246/61904 [5:00:24<19:37:12, 1.37s/it] 17%|█▋ | 10247/61904 [5:00:26<20:11:16, 1.41s/it] 17%|█▋ | 10248/61904 [5:00:27<19:46:28, 1.38s/it] 17%|█▋ | 10249/61904 [5:00:28<19:45:04, 1.38s/it] 17%|█▋ | 10250/61904 [5:00:30<19:26:05, 1.35s/it] 17%|█▋ | 10251/61904 [5:00:31<19:17:23, 1.34s/it] 17%|█▋ | 10252/61904 [5:00:32<19:25:10, 1.35s/it] 17%|█▋ | 10253/61904 [5:00:34<18:52:29, 1.32s/it] 17%|█▋ | 10254/61904 [5:00:35<19:15:54, 1.34s/it] 17%|█▋ | 10255/61904 [5:00:36<19:26:09, 1.35s/it] 17%|█▋ | 10256/61904 [5:00:38<19:34:04, 1.36s/it] 17%|█▋ | 10257/61904 [5:00:39<19:18:28, 1.35s/it] 17%|█▋ | 10258/61904 [5:00:40<19:23:11, 1.35s/it] 17%|█▋ | 10259/61904 [5:00:42<19:49:54, 1.38s/it] 17%|█▋ | 10260/61904 [5:00:43<19:39:32, 1.37s/it] {'loss': 2.8323, 'learning_rate': 1.8369635680020743e-07, 'epoch': 2.65} 17%|█▋ | 10260/61904 [5:00:43<19:39:32, 1.37s/it] 17%|█▋ | 10261/61904 [5:00:45<19:41:25, 1.37s/it] 17%|█▋ | 10262/61904 [5:00:46<19:15:00, 1.34s/it] 17%|█▋ | 10263/61904 [5:00:47<19:57:37, 1.39s/it] 17%|█▋ | 10264/61904 [5:00:49<19:55:25, 1.39s/it] 17%|█▋ | 10265/61904 [5:00:50<19:43:27, 1.38s/it] 17%|█▋ | 10266/61904 
[5:00:52<20:22:02, 1.42s/it] 17%|█▋ | 10267/61904 [5:00:53<19:56:00, 1.39s/it] 17%|█▋ | 10268/61904 [5:00:54<19:58:14, 1.39s/it] 17%|█▋ | 10269/61904 [5:00:56<19:22:42, 1.35s/it] 17%|█▋ | 10270/61904 [5:00:57<19:18:12, 1.35s/it] 17%|█▋ | 10271/61904 [5:00:58<19:21:09, 1.35s/it] 17%|█▋ | 10272/61904 [5:01:00<19:36:53, 1.37s/it] 17%|█▋ | 10273/61904 [5:01:01<19:38:11, 1.37s/it] 17%|█▋ | 10274/61904 [5:01:03<20:01:32, 1.40s/it] 17%|█▋ | 10275/61904 [5:01:04<19:54:49, 1.39s/it] 17%|█▋ | 10276/61904 [5:01:05<19:51:02, 1.38s/it] 17%|█▋ | 10277/61904 [5:01:07<19:38:14, 1.37s/it] 17%|█▋ | 10278/61904 [5:01:08<19:23:58, 1.35s/it] 17%|█▋ | 10279/61904 [5:01:09<19:21:49, 1.35s/it] 17%|█▋ | 10280/61904 [5:01:11<19:01:40, 1.33s/it] {'loss': 2.7787, 'learning_rate': 1.8366394399066512e-07, 'epoch': 2.66} 17%|█▋ | 10280/61904 [5:01:11<19:01:40, 1.33s/it] 17%|█▋ | 10281/61904 [5:01:12<19:40:15, 1.37s/it] 17%|█▋ | 10282/61904 [5:01:13<19:15:39, 1.34s/it] 17%|█▋ | 10283/61904 [5:01:15<19:05:04, 1.33s/it] 17%|█▋ | 10284/61904 [5:01:16<20:07:18, 1.40s/it] 17%|█▋ | 10285/61904 [5:01:18<19:33:47, 1.36s/it] 17%|█▋ | 10286/61904 [5:01:19<19:12:04, 1.34s/it] 17%|█▋ | 10287/61904 [5:01:20<18:44:04, 1.31s/it] 17%|█▋ | 10288/61904 [5:01:21<18:47:48, 1.31s/it] 17%|█▋ | 10289/61904 [5:01:23<19:24:44, 1.35s/it] 17%|█▋ | 10290/61904 [5:01:24<19:01:55, 1.33s/it] 17%|█▋ | 10291/61904 [5:01:25<19:09:58, 1.34s/it] 17%|█▋ | 10292/61904 [5:01:27<18:52:12, 1.32s/it] 17%|█▋ | 10293/61904 [5:01:28<18:54:07, 1.32s/it] 17%|█▋ | 10294/61904 [5:01:30<20:02:36, 1.40s/it] 17%|█▋ | 10295/61904 [5:01:31<20:28:00, 1.43s/it] 17%|█▋ | 10296/61904 [5:01:33<20:24:27, 1.42s/it] 17%|█▋ | 10297/61904 [5:01:34<20:21:04, 1.42s/it] 17%|█▋ | 10298/61904 [5:01:35<19:42:20, 1.37s/it] 17%|█▋ | 10299/61904 [5:01:36<19:19:43, 1.35s/it] 17%|█▋ | 10300/61904 [5:01:38<19:01:38, 1.33s/it] {'loss': 2.8147, 'learning_rate': 1.8363153118112275e-07, 'epoch': 2.66} 17%|█▋ | 10300/61904 [5:01:38<19:01:38, 1.33s/it] 17%|█▋ | 10301/61904 
[5:01:39<18:49:10, 1.31s/it] 17%|█▋ | 10302/61904 [5:01:40<19:07:35, 1.33s/it] 17%|█▋ | 10303/61904 [5:01:42<20:03:10, 1.40s/it] 17%|█▋ | 10304/61904 [5:01:43<20:06:05, 1.40s/it] 17%|█▋ | 10305/61904 [5:01:45<20:31:36, 1.43s/it] 17%|█▋ | 10306/61904 [5:01:46<20:05:26, 1.40s/it] 17%|█▋ | 10307/61904 [5:01:48<19:40:40, 1.37s/it] 17%|█▋ | 10308/61904 [5:01:49<19:51:45, 1.39s/it] 17%|█▋ | 10309/61904 [5:01:50<20:29:07, 1.43s/it] 17%|█▋ | 10310/61904 [5:01:52<19:49:00, 1.38s/it] 17%|█▋ | 10311/61904 [5:01:53<19:59:55, 1.40s/it] 17%|█▋ | 10312/61904 [5:01:55<20:14:36, 1.41s/it] 17%|█▋ | 10313/61904 [5:01:56<20:33:42, 1.43s/it] 17%|█▋ | 10314/61904 [5:01:57<20:21:53, 1.42s/it] 17%|█▋ | 10315/61904 [5:01:59<19:39:07, 1.37s/it] 17%|█▋ | 10316/61904 [5:02:00<19:40:47, 1.37s/it] 17%|█▋ | 10317/61904 [5:02:02<20:11:54, 1.41s/it] 17%|█▋ | 10318/61904 [5:02:03<19:37:15, 1.37s/it] 17%|█▋ | 10319/61904 [5:02:04<19:23:38, 1.35s/it] 17%|█▋ | 10320/61904 [5:02:06<19:13:46, 1.34s/it] {'loss': 2.7373, 'learning_rate': 1.8359911837158044e-07, 'epoch': 2.67} 17%|█▋ | 10320/61904 [5:02:06<19:13:46, 1.34s/it] 17%|█▋ | 10321/61904 [5:02:07<19:14:55, 1.34s/it] 17%|█▋ | 10322/61904 [5:02:08<18:44:42, 1.31s/it] 17%|█▋ | 10323/61904 [5:02:09<19:03:39, 1.33s/it] 17%|█▋ | 10324/61904 [5:02:11<19:38:33, 1.37s/it] 17%|█▋ | 10325/61904 [5:02:12<19:24:30, 1.35s/it] 17%|█▋ | 10326/61904 [5:02:14<19:40:00, 1.37s/it] 17%|█▋ | 10327/61904 [5:02:15<19:50:35, 1.39s/it] 17%|█▋ | 10328/61904 [5:02:17<20:18:44, 1.42s/it] 17%|█▋ | 10329/61904 [5:02:18<20:09:15, 1.41s/it] 17%|█▋ | 10330/61904 [5:02:19<19:58:07, 1.39s/it] 17%|█▋ | 10331/61904 [5:02:21<20:10:36, 1.41s/it] 17%|█▋ | 10332/61904 [5:02:22<20:32:30, 1.43s/it] 17%|█▋ | 10333/61904 [5:02:24<20:22:10, 1.42s/it] 17%|█▋ | 10334/61904 [5:02:25<19:57:28, 1.39s/it] 17%|█▋ | 10335/61904 [5:02:26<19:23:34, 1.35s/it] 17%|█▋ | 10336/61904 [5:02:28<19:15:38, 1.34s/it] 17%|█▋ | 10337/61904 [5:02:29<19:10:37, 1.34s/it] 17%|█▋ | 10338/61904 [5:02:30<19:57:37, 
1.39s/it] 17%|█▋ | 10339/61904 [5:02:32<20:16:44, 1.42s/it] 17%|█▋ | 10340/61904 [5:02:33<19:59:21, 1.40s/it] {'loss': 2.8365, 'learning_rate': 1.835667055620381e-07, 'epoch': 2.67} 17%|█▋ | 10340/61904 [5:02:33<19:59:21, 1.40s/it] 17%|█▋ | 10341/61904 [5:02:35<20:23:33, 1.42s/it] 17%|█▋ | 10342/61904 [5:02:36<20:25:25, 1.43s/it] 17%|█▋ | 10343/61904 [5:02:37<20:02:53, 1.40s/it] 17%|█▋ | 10344/61904 [5:02:39<19:53:20, 1.39s/it] 17%|█▋ | 10345/61904 [5:02:40<19:58:50, 1.40s/it] 17%|█▋ | 10346/61904 [5:02:42<20:17:08, 1.42s/it] 17%|█▋ | 10347/61904 [5:02:43<20:27:00, 1.43s/it] 17%|█▋ | 10348/61904 [5:02:45<20:17:30, 1.42s/it] 17%|█▋ | 10349/61904 [5:02:46<20:09:34, 1.41s/it] 17%|█▋ | 10350/61904 [5:02:47<19:45:14, 1.38s/it] 17%|█▋ | 10351/61904 [5:02:49<19:30:49, 1.36s/it] 17%|█▋ | 10352/61904 [5:02:50<19:20:07, 1.35s/it] 17%|█▋ | 10353/61904 [5:02:51<19:35:36, 1.37s/it] 17%|█▋ | 10354/61904 [5:02:53<19:22:17, 1.35s/it] 17%|█▋ | 10355/61904 [5:02:54<19:43:02, 1.38s/it] 17%|█▋ | 10356/61904 [5:02:55<19:25:09, 1.36s/it] 17%|█▋ | 10357/61904 [5:02:57<20:07:29, 1.41s/it] 17%|█▋ | 10358/61904 [5:02:58<20:40:09, 1.44s/it] 17%|█▋ | 10359/61904 [5:03:00<20:11:09, 1.41s/it] 17%|█▋ | 10360/61904 [5:03:01<19:28:36, 1.36s/it] {'loss': 2.7914, 'learning_rate': 1.8353429275249577e-07, 'epoch': 2.68} 17%|█▋ | 10360/61904 [5:03:01<19:28:36, 1.36s/it] 17%|█▋ | 10361/61904 [5:03:02<19:27:48, 1.36s/it] 17%|█▋ | 10362/61904 [5:03:04<19:45:54, 1.38s/it] 17%|█▋ | 10363/61904 [5:03:05<19:42:30, 1.38s/it] 17%|█▋ | 10364/61904 [5:03:07<19:37:43, 1.37s/it] 17%|█▋ | 10365/61904 [5:03:08<19:32:47, 1.37s/it] 17%|█▋ | 10366/61904 [5:03:09<19:09:32, 1.34s/it] 17%|█▋ | 10367/61904 [5:03:11<19:27:34, 1.36s/it] 17%|█▋ | 10368/61904 [5:03:12<19:14:22, 1.34s/it] 17%|█▋ | 10369/61904 [5:03:13<19:08:44, 1.34s/it] 17%|█▋ | 10370/61904 [5:03:15<19:09:33, 1.34s/it] 17%|█▋ | 10371/61904 [5:03:16<19:07:43, 1.34s/it] 17%|█▋ | 10372/61904 [5:03:17<18:51:57, 1.32s/it] 17%|█▋ | 10373/61904 [5:03:18<18:59:46, 
1.33s/it] 17%|█▋ | 10374/61904 [5:03:20<19:04:35, 1.33s/it] 17%|█▋ | 10375/61904 [5:03:21<19:22:40, 1.35s/it] 17%|█▋ | 10376/61904 [5:03:23<19:15:22, 1.35s/it] 17%|█▋ | 10377/61904 [5:03:24<20:29:12, 1.43s/it] 17%|█▋ | 10378/61904 [5:03:26<20:32:09, 1.43s/it] 17%|█▋ | 10379/61904 [5:03:27<20:30:59, 1.43s/it] 17%|█▋ | 10380/61904 [5:03:28<19:36:32, 1.37s/it] {'loss': 2.8569, 'learning_rate': 1.8350187994295346e-07, 'epoch': 2.68} 17%|█▋ | 10380/61904 [5:03:28<19:36:32, 1.37s/it] 17%|█▋ | 10381/61904 [5:03:30<19:45:54, 1.38s/it] 17%|█▋ | 10382/61904 [5:03:31<19:32:47, 1.37s/it] 17%|█▋ | 10383/61904 [5:03:32<19:36:06, 1.37s/it] 17%|█▋ | 10384/61904 [5:03:34<19:07:46, 1.34s/it] 17%|█▋ | 10385/61904 [5:03:35<19:25:40, 1.36s/it] 17%|█▋ | 10386/61904 [5:03:37<20:10:31, 1.41s/it] 17%|█▋ | 10387/61904 [5:03:38<20:17:26, 1.42s/it] 17%|█▋ | 10388/61904 [5:03:39<19:48:55, 1.38s/it] 17%|█▋ | 10389/61904 [5:03:41<20:22:07, 1.42s/it] 17%|█▋ | 10390/61904 [5:03:42<20:56:42, 1.46s/it] 17%|█▋ | 10391/61904 [5:03:44<20:48:24, 1.45s/it] 17%|█▋ | 10392/61904 [5:03:45<20:31:16, 1.43s/it] 17%|█▋ | 10393/61904 [5:03:47<20:21:50, 1.42s/it] 17%|█▋ | 10394/61904 [5:03:48<20:06:35, 1.41s/it] 17%|█▋ | 10395/61904 [5:03:49<20:00:01, 1.40s/it] 17%|█▋ | 10396/61904 [5:03:51<19:54:35, 1.39s/it] 17%|█▋ | 10397/61904 [5:03:52<20:04:31, 1.40s/it] 17%|█▋ | 10398/61904 [5:03:54<19:53:45, 1.39s/it] 17%|█▋ | 10399/61904 [5:03:55<19:48:30, 1.38s/it] 17%|█▋ | 10400/61904 [5:03:56<20:18:17, 1.42s/it] {'loss': 2.8851, 'learning_rate': 1.8346946713341112e-07, 'epoch': 2.69} 17%|█▋ | 10400/61904 [5:03:56<20:18:17, 1.42s/it] 17%|█▋ | 10401/61904 [5:03:58<19:32:53, 1.37s/it] 17%|█▋ | 10402/61904 [5:03:59<19:14:17, 1.34s/it] 17%|█▋ | 10403/61904 [5:04:00<19:03:44, 1.33s/it] 17%|█▋ | 10404/61904 [5:04:02<19:14:23, 1.34s/it] 17%|█▋ | 10405/61904 [5:04:03<19:55:55, 1.39s/it] 17%|█▋ | 10406/61904 [5:04:05<20:17:51, 1.42s/it] 17%|█▋ | 10407/61904 [5:04:06<20:13:09, 1.41s/it] 17%|█▋ | 10408/61904 [5:04:07<19:44:03, 
1.38s/it] 17%|█▋ | 10409/61904 [5:04:09<19:47:03, 1.38s/it] 17%|█▋ | 10410/61904 [5:04:10<19:51:21, 1.39s/it] 17%|█▋ | 10411/61904 [5:04:11<19:13:22, 1.34s/it] 17%|█▋ | 10412/61904 [5:04:13<19:09:19, 1.34s/it] 17%|█▋ | 10413/61904 [5:04:14<19:43:33, 1.38s/it] 17%|█▋ | 10414/61904 [5:04:15<19:27:11, 1.36s/it] 17%|█▋ | 10415/61904 [5:04:17<20:19:34, 1.42s/it] 17%|█▋ | 10416/61904 [5:04:18<19:39:11, 1.37s/it] 17%|█▋ | 10417/61904 [5:04:20<19:05:24, 1.33s/it] 17%|█▋ | 10418/61904 [5:04:21<18:57:17, 1.33s/it] 17%|█▋ | 10419/61904 [5:04:22<19:06:47, 1.34s/it] 17%|█▋ | 10420/61904 [5:04:24<19:29:15, 1.36s/it] {'loss': 2.8061, 'learning_rate': 1.8343705432386878e-07, 'epoch': 2.69} 17%|█▋ | 10420/61904 [5:04:24<19:29:15, 1.36s/it] 17%|█▋ | 10421/61904 [5:04:25<18:55:01, 1.32s/it] 17%|█▋ | 10422/61904 [5:04:26<19:12:17, 1.34s/it] 17%|█▋ | 10423/61904 [5:04:28<19:25:13, 1.36s/it] 17%|█▋ | 10424/61904 [5:04:29<19:38:00, 1.37s/it] 17%|█▋ | 10425/61904 [5:04:30<19:30:20, 1.36s/it] 17%|█▋ | 10426/61904 [5:04:32<19:35:44, 1.37s/it] 17%|█▋ | 10427/61904 [5:04:33<19:28:35, 1.36s/it] 17%|█▋ | 10428/61904 [5:04:34<19:13:06, 1.34s/it] 17%|█▋ | 10429/61904 [5:04:36<19:03:58, 1.33s/it] 17%|█▋ | 10430/61904 [5:04:37<19:28:32, 1.36s/it] 17%|█▋ | 10431/61904 [5:04:38<19:18:32, 1.35s/it] 17%|█▋ | 10432/61904 [5:04:40<19:16:42, 1.35s/it] 17%|█▋ | 10433/61904 [5:04:41<19:05:30, 1.34s/it] 17%|█▋ | 10434/61904 [5:04:42<18:59:07, 1.33s/it] 17%|█▋ | 10435/61904 [5:04:44<19:06:23, 1.34s/it] 17%|█▋ | 10436/61904 [5:04:45<19:09:01, 1.34s/it] 17%|█▋ | 10437/61904 [5:04:46<18:34:50, 1.30s/it] 17%|█▋ | 10438/61904 [5:04:48<19:19:15, 1.35s/it] 17%|█▋ | 10439/61904 [5:04:49<19:26:57, 1.36s/it] 17%|█▋ | 10440/61904 [5:04:51<19:50:53, 1.39s/it] {'loss': 2.8493, 'learning_rate': 1.8340464151432647e-07, 'epoch': 2.7} 17%|█▋ | 10440/61904 [5:04:51<19:50:53, 1.39s/it] 17%|█▋ | 10441/61904 [5:04:52<19:57:55, 1.40s/it] 17%|█▋ | 10442/61904 [5:04:53<19:45:06, 1.38s/it] 17%|█▋ | 10443/61904 [5:04:55<19:51:54, 
1.39s/it] 17%|█▋ | 10444/61904 [5:04:56<19:45:24, 1.38s/it] 17%|█▋ | 10445/61904 [5:04:58<19:32:13, 1.37s/it] 17%|█▋ | 10446/61904 [5:04:59<19:14:06, 1.35s/it] 17%|█▋ | 10447/61904 [5:05:00<19:36:01, 1.37s/it] 17%|█▋ | 10448/61904 [5:05:02<19:46:43, 1.38s/it] 17%|█▋ | 10449/61904 [5:05:03<19:20:38, 1.35s/it] 17%|█▋ | 10450/61904 [5:05:04<19:45:13, 1.38s/it] 17%|█▋ | 10451/61904 [5:05:06<20:35:23, 1.44s/it] 17%|█▋ | 10452/61904 [5:05:07<20:15:10, 1.42s/it] 17%|█▋ | 10453/61904 [5:05:09<19:51:42, 1.39s/it] 17%|█▋ | 10454/61904 [5:05:10<19:06:22, 1.34s/it] 17%|█▋ | 10455/61904 [5:05:11<18:44:02, 1.31s/it] 17%|█▋ | 10456/61904 [5:05:13<19:01:23, 1.33s/it] 17%|█▋ | 10457/61904 [5:05:14<19:08:47, 1.34s/it] 17%|█▋ | 10458/61904 [5:05:15<18:44:44, 1.31s/it] 17%|█▋ | 10459/61904 [5:05:16<18:49:05, 1.32s/it] 17%|█▋ | 10460/61904 [5:05:18<19:35:04, 1.37s/it] {'loss': 2.7935, 'learning_rate': 1.8337222870478413e-07, 'epoch': 2.7} 17%|█▋ | 10460/61904 [5:05:18<19:35:04, 1.37s/it] 17%|█▋ | 10461/61904 [5:05:19<19:39:37, 1.38s/it] 17%|█▋ | 10462/61904 [5:05:21<20:00:24, 1.40s/it] 17%|█▋ | 10463/61904 [5:05:22<20:04:44, 1.41s/it] 17%|█▋ | 10464/61904 [5:05:24<20:08:38, 1.41s/it] 17%|█▋ | 10465/61904 [5:05:25<20:02:59, 1.40s/it] 17%|█▋ | 10466/61904 [5:05:27<21:01:42, 1.47s/it] 17%|█▋ | 10467/61904 [5:05:28<20:50:20, 1.46s/it] 17%|█▋ | 10468/61904 [5:05:29<20:21:10, 1.42s/it] 17%|█▋ | 10469/61904 [5:05:31<20:49:03, 1.46s/it] 17%|█▋ | 10470/61904 [5:05:32<20:40:18, 1.45s/it] 17%|█▋ | 10471/61904 [5:05:34<20:06:58, 1.41s/it] 17%|█▋ | 10472/61904 [5:05:35<19:51:17, 1.39s/it] 17%|█▋ | 10473/61904 [5:05:37<20:12:58, 1.42s/it] 17%|█▋ | 10474/61904 [5:05:38<19:54:05, 1.39s/it] 17%|█▋ | 10475/61904 [5:05:39<19:23:39, 1.36s/it] 17%|█▋ | 10476/61904 [5:05:40<18:58:31, 1.33s/it] 17%|█▋ | 10477/61904 [5:05:42<19:35:38, 1.37s/it] 17%|█▋ | 10478/61904 [5:05:43<19:35:31, 1.37s/it] 17%|█▋ | 10479/61904 [5:05:45<19:59:55, 1.40s/it] 17%|█▋ | 10480/61904 [5:05:46<20:05:38, 1.41s/it] {'loss': 2.8418, 
'learning_rate': 1.833398158952418e-07, 'epoch': 2.71} 17%|█▋ | 10480/61904 [5:05:46<20:05:38, 1.41s/it] 17%|█▋ | 10481/61904 [5:05:48<20:18:53, 1.42s/it] 17%|█▋ | 10482/61904 [5:05:49<20:26:30, 1.43s/it] 17%|█▋ | 10483/61904 [5:05:50<19:38:37, 1.38s/it] 17%|█▋ | 10484/61904 [5:05:52<19:23:41, 1.36s/it] 17%|█▋ | 10485/61904 [5:05:53<19:14:46, 1.35s/it] 17%|█▋ | 10486/61904 [5:05:54<19:50:23, 1.39s/it] 17%|█▋ | 10487/61904 [5:05:56<19:51:15, 1.39s/it] 17%|█▋ | 10488/61904 [5:05:57<19:42:01, 1.38s/it] 17%|█▋ | 10489/61904 [5:05:59<19:35:02, 1.37s/it] 17%|█▋ | 10490/61904 [5:06:00<19:34:38, 1.37s/it] 17%|█▋ | 10491/61904 [5:06:01<19:21:34, 1.36s/it] 17%|█▋ | 10492/61904 [5:06:03<19:15:50, 1.35s/it] 17%|█▋ | 10493/61904 [5:06:04<20:00:19, 1.40s/it] 17%|█▋ | 10494/61904 [5:06:05<19:16:32, 1.35s/it] 17%|█▋ | 10495/61904 [5:06:07<19:51:56, 1.39s/it] 17%|█▋ | 10496/61904 [5:06:08<20:19:12, 1.42s/it] 17%|█▋ | 10497/61904 [5:06:10<20:06:34, 1.41s/it] 17%|█▋ | 10498/61904 [5:06:11<19:56:16, 1.40s/it] 17%|█▋ | 10499/61904 [5:06:12<19:17:47, 1.35s/it] 17%|█▋ | 10500/61904 [5:06:14<19:43:23, 1.38s/it] {'loss': 2.7704, 'learning_rate': 1.8330740308569948e-07, 'epoch': 2.71} 17%|█▋ | 10500/61904 [5:06:14<19:43:23, 1.38s/it] 17%|█▋ | 10501/61904 [5:06:15<20:20:32, 1.42s/it] 17%|█▋ | 10502/61904 [5:06:17<20:29:14, 1.43s/it] 17%|█▋ | 10503/61904 [5:06:18<21:05:21, 1.48s/it] 17%|█▋ | 10504/61904 [5:06:20<21:07:03, 1.48s/it] 17%|█▋ | 10505/61904 [5:06:21<20:38:40, 1.45s/it] 17%|█▋ | 10506/61904 [5:06:22<20:04:09, 1.41s/it] 17%|█▋ | 10507/61904 [5:06:24<20:04:29, 1.41s/it] 17%|█▋ | 10508/61904 [5:06:25<19:31:26, 1.37s/it] 17%|█▋ | 10509/61904 [5:06:27<20:07:19, 1.41s/it] 17%|█▋ | 10510/61904 [5:06:28<20:05:39, 1.41s/it] 17%|█▋ | 10511/61904 [5:06:29<19:46:30, 1.39s/it] 17%|█▋ | 10512/61904 [5:06:31<20:20:27, 1.42s/it] 17%|█▋ | 10513/61904 [5:06:32<20:05:43, 1.41s/it] 17%|█▋ | 10514/61904 [5:06:34<19:44:58, 1.38s/it] 17%|█▋ | 10515/61904 [5:06:35<20:05:29, 1.41s/it] 17%|█▋ | 10516/61904 
[5:06:36<20:12:59, 1.42s/it] 17%|█▋ | 10517/61904 [5:06:38<20:14:54, 1.42s/it] 17%|█▋ | 10518/61904 [5:06:39<19:15:06, 1.35s/it] 17%|█▋ | 10519/61904 [5:06:40<19:01:41, 1.33s/it] 17%|█▋ | 10520/61904 [5:06:42<19:55:05, 1.40s/it] {'loss': 2.87, 'learning_rate': 1.8327499027615711e-07, 'epoch': 2.72} 17%|█▋ | 10520/61904 [5:06:42<19:55:05, 1.40s/it] 17%|█▋ | 10521/61904 [5:06:43<19:55:33, 1.40s/it] 17%|█▋ | 10522/61904 [5:06:45<20:13:17, 1.42s/it] 17%|█▋ | 10523/61904 [5:06:46<20:12:48, 1.42s/it] 17%|█▋ | 10524/61904 [5:06:47<19:44:04, 1.38s/it] 17%|█▋ | 10525/61904 [5:06:49<19:05:44, 1.34s/it] 17%|█▋ | 10526/61904 [5:06:50<19:18:00, 1.35s/it] 17%|█▋ | 10527/61904 [5:06:51<18:51:37, 1.32s/it] 17%|█▋ | 10528/61904 [5:06:53<19:31:33, 1.37s/it] 17%|█▋ | 10529/61904 [5:06:54<19:11:07, 1.34s/it] 17%|█▋ | 10530/61904 [5:06:56<19:24:05, 1.36s/it] 17%|█▋ | 10531/61904 [5:06:57<20:00:16, 1.40s/it] 17%|█▋ | 10532/61904 [5:06:58<19:41:24, 1.38s/it] 17%|█▋ | 10533/61904 [5:07:00<20:08:50, 1.41s/it] 17%|█▋ | 10534/61904 [5:07:01<19:43:49, 1.38s/it] 17%|█▋ | 10535/61904 [5:07:03<19:44:53, 1.38s/it] 17%|█▋ | 10536/61904 [5:07:04<19:49:09, 1.39s/it] 17%|█▋ | 10537/61904 [5:07:05<19:28:02, 1.36s/it] 17%|█▋ | 10538/61904 [5:07:07<19:27:54, 1.36s/it] 17%|█▋ | 10539/61904 [5:07:08<19:30:35, 1.37s/it] 17%|█▋ | 10540/61904 [5:07:09<19:33:42, 1.37s/it] {'loss': 2.8444, 'learning_rate': 1.832425774666148e-07, 'epoch': 2.72} 17%|█▋ | 10540/61904 [5:07:09<19:33:42, 1.37s/it] 17%|█▋ | 10541/61904 [5:07:11<19:53:03, 1.39s/it] 17%|█▋ | 10542/61904 [5:07:12<19:24:37, 1.36s/it] 17%|█▋ | 10543/61904 [5:07:13<19:34:11, 1.37s/it] 17%|█▋ | 10544/61904 [5:07:15<19:16:58, 1.35s/it] 17%|█▋ | 10545/61904 [5:07:16<18:54:31, 1.33s/it] 17%|█▋ | 10546/61904 [5:07:17<19:09:42, 1.34s/it] 17%|█▋ | 10547/61904 [5:07:19<18:57:50, 1.33s/it] 17%|█▋ | 10548/61904 [5:07:20<19:11:57, 1.35s/it] 17%|█▋ | 10549/61904 [5:07:21<19:16:42, 1.35s/it] 17%|█▋ | 10550/61904 [5:07:23<18:59:18, 1.33s/it] 17%|█▋ | 10551/61904 
[5:07:24<18:58:56, 1.33s/it] 17%|█▋ | 10552/61904 [5:07:25<18:48:39, 1.32s/it] 17%|█▋ | 10553/61904 [5:07:27<19:09:58, 1.34s/it] 17%|█▋ | 10554/61904 [5:07:28<20:15:29, 1.42s/it] 17%|█▋ | 10555/61904 [5:07:30<20:49:34, 1.46s/it] 17%|█▋ | 10556/61904 [5:07:31<20:18:56, 1.42s/it] 17%|█▋ | 10557/61904 [5:07:33<20:36:17, 1.44s/it] 17%|█▋ | 10558/61904 [5:07:34<19:53:39, 1.39s/it] 17%|█▋ | 10559/61904 [5:07:35<19:31:26, 1.37s/it] 17%|█▋ | 10560/61904 [5:07:37<19:55:14, 1.40s/it] {'loss': 2.7719, 'learning_rate': 1.8321016465707246e-07, 'epoch': 2.73} 17%|█▋ | 10560/61904 [5:07:37<19:55:14, 1.40s/it] 17%|█▋ | 10561/61904 [5:07:38<19:32:57, 1.37s/it] 17%|█▋ | 10562/61904 [5:07:39<18:58:29, 1.33s/it] 17%|█▋ | 10563/61904 [5:07:41<18:53:01, 1.32s/it] 17%|█▋ | 10564/61904 [5:07:42<19:25:02, 1.36s/it] 17%|█▋ | 10565/61904 [5:07:43<19:09:33, 1.34s/it] 17%|█▋ | 10566/61904 [5:07:45<19:24:47, 1.36s/it] 17%|█▋ | 10567/61904 [5:07:46<19:38:51, 1.38s/it] 17%|█▋ | 10568/61904 [5:07:48<20:01:34, 1.40s/it] 17%|█▋ | 10569/61904 [5:07:49<19:56:41, 1.40s/it] 17%|█▋ | 10570/61904 [5:07:50<19:29:47, 1.37s/it] 17%|█▋ | 10571/61904 [5:07:52<19:21:24, 1.36s/it] 17%|█▋ | 10572/61904 [5:07:53<19:39:02, 1.38s/it] 17%|█▋ | 10573/61904 [5:07:54<19:24:45, 1.36s/it] 17%|█▋ | 10574/61904 [5:07:56<19:17:05, 1.35s/it] 17%|█▋ | 10575/61904 [5:07:57<19:01:11, 1.33s/it] 17%|█▋ | 10576/61904 [5:07:58<19:06:38, 1.34s/it] 17%|█▋ | 10577/61904 [5:08:00<19:02:05, 1.34s/it] 17%|█▋ | 10578/61904 [5:08:01<19:22:05, 1.36s/it] 17%|█▋ | 10579/61904 [5:08:02<19:04:39, 1.34s/it] 17%|█▋ | 10580/61904 [5:08:04<19:23:08, 1.36s/it] {'loss': 2.823, 'learning_rate': 1.8317775184753013e-07, 'epoch': 2.73} 17%|█▋ | 10580/61904 [5:08:04<19:23:08, 1.36s/it] 17%|█▋ | 10581/61904 [5:08:05<19:24:36, 1.36s/it] 17%|█▋ | 10582/61904 [5:08:07<19:18:40, 1.35s/it] 17%|█▋ | 10583/61904 [5:08:08<18:43:43, 1.31s/it] 17%|█▋ | 10584/61904 [5:08:09<19:03:31, 1.34s/it] 17%|█▋ | 10585/61904 [5:08:11<19:26:20, 1.36s/it] 17%|█▋ | 10586/61904 
[5:08:12<19:00:11, 1.33s/it] 17%|█▋ | 10587/61904 [5:08:13<19:22:13, 1.36s/it] 17%|█▋ | 10588/61904 [5:08:15<19:50:01, 1.39s/it] 17%|█▋ | 10589/61904 [5:08:16<19:32:53, 1.37s/it] 17%|█▋ | 10590/61904 [5:08:17<19:32:18, 1.37s/it] 17%|█▋ | 10591/61904 [5:08:19<19:10:58, 1.35s/it] 17%|█▋ | 10592/61904 [5:08:20<19:31:03, 1.37s/it] 17%|█▋ | 10593/61904 [5:08:22<19:14:18, 1.35s/it] 17%|█▋ | 10594/61904 [5:08:23<18:59:35, 1.33s/it] 17%|█▋ | 10595/61904 [5:08:24<19:26:13, 1.36s/it] 17%|█▋ | 10596/61904 [5:08:26<19:22:54, 1.36s/it] 17%|█▋ | 10597/61904 [5:08:27<19:27:31, 1.37s/it] 17%|█▋ | 10598/61904 [5:08:28<19:38:33, 1.38s/it] 17%|█▋ | 10599/61904 [5:08:30<19:27:34, 1.37s/it] 17%|█▋ | 10600/61904 [5:08:31<19:20:19, 1.36s/it] {'loss': 2.8562, 'learning_rate': 1.8314533903798782e-07, 'epoch': 2.74} 17%|█▋ | 10600/61904 [5:08:31<19:20:19, 1.36s/it] 17%|█▋ | 10601/61904 [5:08:32<18:55:48, 1.33s/it] 17%|█▋ | 10602/61904 [5:08:34<18:55:27, 1.33s/it] 17%|█▋ | 10603/61904 [5:08:35<19:06:07, 1.34s/it] 17%|█▋ | 10604/61904 [5:08:36<19:17:26, 1.35s/it] 17%|█▋ | 10605/61904 [5:08:38<18:54:20, 1.33s/it] 17%|█▋ | 10606/61904 [5:08:39<19:36:50, 1.38s/it] 17%|█▋ | 10607/61904 [5:08:40<19:06:25, 1.34s/it] 17%|█▋ | 10608/61904 [5:08:42<18:54:39, 1.33s/it] 17%|█▋ | 10609/61904 [5:08:43<19:12:45, 1.35s/it] 17%|█▋ | 10610/61904 [5:08:44<18:42:43, 1.31s/it] 17%|█▋ | 10611/61904 [5:08:46<18:59:24, 1.33s/it] 17%|█▋ | 10612/61904 [5:08:47<19:10:45, 1.35s/it] 17%|█▋ | 10613/61904 [5:08:48<19:23:54, 1.36s/it] 17%|█▋ | 10614/61904 [5:08:50<18:59:30, 1.33s/it] 17%|█▋ | 10615/61904 [5:08:51<19:24:17, 1.36s/it] 17%|█▋ | 10616/61904 [5:08:53<19:18:02, 1.35s/it] 17%|█▋ | 10617/61904 [5:08:54<18:53:02, 1.33s/it] 17%|█▋ | 10618/61904 [5:08:55<18:44:39, 1.32s/it] 17%|█▋ | 10619/61904 [5:08:56<18:58:22, 1.33s/it] 17%|█▋ | 10620/61904 [5:08:58<18:36:06, 1.31s/it] {'loss': 2.8903, 'learning_rate': 1.8311292622844548e-07, 'epoch': 2.74} 17%|█▋ | 10620/61904 [5:08:58<18:36:06, 1.31s/it] 17%|█▋ | 10621/61904 
[5:08:59<18:51:40, 1.32s/it] 17%|█▋ | 10622/61904 [5:09:00<19:11:49, 1.35s/it] 17%|█▋ | 10623/61904 [5:09:02<19:20:57, 1.36s/it] 17%|█▋ | 10624/61904 [5:09:03<19:40:47, 1.38s/it] 17%|█▋ | 10625/61904 [5:09:05<20:02:12, 1.41s/it] 17%|█▋ | 10626/61904 [5:09:06<19:29:03, 1.37s/it] 17%|█▋ | 10627/61904 [5:09:07<19:24:16, 1.36s/it] 17%|█▋ | 10628/61904 [5:09:09<19:05:00, 1.34s/it] 17%|█▋ | 10629/61904 [5:09:10<19:16:21, 1.35s/it] 17%|█▋ | 10630/61904 [5:09:11<19:20:06, 1.36s/it] 17%|█▋ | 10631/61904 [5:09:13<18:51:21, 1.32s/it] 17%|█▋ | 10632/61904 [5:09:14<19:11:53, 1.35s/it] 17%|█▋ | 10633/61904 [5:09:15<19:04:39, 1.34s/it] 17%|█▋ | 10634/61904 [5:09:17<18:42:51, 1.31s/it] 17%|█▋ | 10635/61904 [5:09:18<20:04:30, 1.41s/it] 17%|█▋ | 10636/61904 [5:09:20<20:13:50, 1.42s/it] 17%|█▋ | 10637/61904 [5:09:21<19:43:13, 1.38s/it] 17%|█▋ | 10638/61904 [5:09:23<20:19:01, 1.43s/it] 17%|█▋ | 10639/61904 [5:09:24<19:57:37, 1.40s/it] 17%|█▋ | 10640/61904 [5:09:25<19:19:12, 1.36s/it] {'loss': 2.8069, 'learning_rate': 1.8308051341890314e-07, 'epoch': 2.75} 17%|█▋ | 10640/61904 [5:09:25<19:19:12, 1.36s/it] 17%|█▋ | 10641/61904 [5:09:26<19:12:56, 1.35s/it] 17%|█▋ | 10642/61904 [5:09:28<19:29:49, 1.37s/it] 17%|█▋ | 10643/61904 [5:09:29<20:02:09, 1.41s/it] 17%|█▋ | 10644/61904 [5:09:31<19:54:50, 1.40s/it] 17%|█▋ | 10645/61904 [5:09:32<20:45:57, 1.46s/it] 17%|█▋ | 10646/61904 [5:09:34<20:16:25, 1.42s/it] 17%|█▋ | 10647/61904 [5:09:35<19:46:28, 1.39s/it] 17%|█▋ | 10648/61904 [5:09:37<20:19:29, 1.43s/it] 17%|█▋ | 10649/61904 [5:09:38<20:17:05, 1.42s/it] 17%|█▋ | 10650/61904 [5:09:39<20:27:27, 1.44s/it] 17%|█▋ | 10651/61904 [5:09:41<20:06:07, 1.41s/it] 17%|█▋ | 10652/61904 [5:09:42<19:37:14, 1.38s/it] 17%|█▋ | 10653/61904 [5:09:43<19:16:27, 1.35s/it] 17%|█▋ | 10654/61904 [5:09:45<18:46:22, 1.32s/it] 17%|█▋ | 10655/61904 [5:09:46<18:57:07, 1.33s/it] 17%|█▋ | 10656/61904 [5:09:47<18:53:42, 1.33s/it] 17%|█▋ | 10657/61904 [5:09:49<19:13:56, 1.35s/it] 17%|█▋ | 10658/61904 [5:09:50<19:32:06, 
1.37s/it] 17%|█▋ | 10659/61904 [5:09:51<19:08:46, 1.35s/it] 17%|█▋ | 10660/61904 [5:09:53<19:05:51, 1.34s/it] {'loss': 2.7696, 'learning_rate': 1.8304810060936083e-07, 'epoch': 2.75} 17%|█▋ | 10660/61904 [5:09:53<19:05:51, 1.34s/it] 17%|█▋ | 10661/61904 [5:09:54<18:46:39, 1.32s/it] 17%|█▋ | 10662/61904 [5:09:55<18:47:15, 1.32s/it] 17%|█▋ | 10663/61904 [5:09:57<18:49:33, 1.32s/it] 17%|█▋ | 10664/61904 [5:09:58<19:01:08, 1.34s/it] 17%|█▋ | 10665/61904 [5:09:59<19:08:22, 1.34s/it] 17%|█▋ | 10666/61904 [5:10:01<19:03:38, 1.34s/it] 17%|█▋ | 10667/61904 [5:10:02<18:58:15, 1.33s/it] 17%|█▋ | 10668/61904 [5:10:03<19:27:01, 1.37s/it] 17%|█▋ | 10669/61904 [5:10:05<19:14:35, 1.35s/it] 17%|█▋ | 10670/61904 [5:10:06<19:04:13, 1.34s/it] 17%|█▋ | 10671/61904 [5:10:07<19:13:38, 1.35s/it] 17%|█▋ | 10672/61904 [5:10:09<19:24:06, 1.36s/it] 17%|█▋ | 10673/61904 [5:10:10<20:08:33, 1.42s/it] 17%|█▋ | 10674/61904 [5:10:12<19:47:34, 1.39s/it] 17%|█▋ | 10675/61904 [5:10:13<20:24:48, 1.43s/it] 17%|█▋ | 10676/61904 [5:10:15<20:06:22, 1.41s/it] 17%|█▋ | 10677/61904 [5:10:16<19:31:17, 1.37s/it] 17%|█▋ | 10678/61904 [5:10:17<19:47:10, 1.39s/it] 17%|█▋ | 10679/61904 [5:10:19<19:28:00, 1.37s/it] 17%|█▋ | 10680/61904 [5:10:20<19:12:00, 1.35s/it] {'loss': 2.7616, 'learning_rate': 1.830156877998185e-07, 'epoch': 2.76} 17%|█▋ | 10680/61904 [5:10:20<19:12:00, 1.35s/it] 17%|█▋ | 10681/61904 [5:10:21<19:12:17, 1.35s/it] 17%|█▋ | 10682/61904 [5:10:23<19:11:55, 1.35s/it] 17%|█▋ | 10683/61904 [5:10:24<18:39:16, 1.31s/it] 17%|█▋ | 10684/61904 [5:10:25<19:07:56, 1.34s/it] 17%|█▋ | 10685/61904 [5:10:27<18:39:01, 1.31s/it] 17%|█▋ | 10686/61904 [5:10:28<18:19:33, 1.29s/it] 17%|█▋ | 10687/61904 [5:10:29<18:14:27, 1.28s/it] 17%|█▋ | 10688/61904 [5:10:30<18:29:38, 1.30s/it] 17%|█▋ | 10689/61904 [5:10:32<19:32:39, 1.37s/it] 17%|█▋ | 10690/61904 [5:10:33<19:49:03, 1.39s/it] 17%|█▋ | 10691/61904 [5:10:35<20:01:25, 1.41s/it] 17%|█▋ | 10692/61904 [5:10:36<20:06:51, 1.41s/it] 17%|█▋ | 10693/61904 [5:10:38<20:11:18, 
1.42s/it] 17%|█▋ | 10694/61904 [5:10:39<19:55:00, 1.40s/it] 17%|█▋ | 10695/61904 [5:10:40<19:33:52, 1.38s/it] 17%|█▋ | 10696/61904 [5:10:42<19:27:46, 1.37s/it] 17%|█▋ | 10697/61904 [5:10:43<18:54:13, 1.33s/it] 17%|█▋ | 10698/61904 [5:10:44<18:23:12, 1.29s/it] 17%|█▋ | 10699/61904 [5:10:46<20:11:05, 1.42s/it] 17%|█▋ | 10700/61904 [5:10:47<20:02:42, 1.41s/it] {'loss': 2.7976, 'learning_rate': 1.8298327499027615e-07, 'epoch': 2.77} 17%|█▋ | 10700/61904 [5:10:47<20:02:42, 1.41s/it] 17%|█▋ | 10701/61904 [5:10:49<19:35:00, 1.38s/it] 17%|█▋ | 10702/61904 [5:10:50<19:20:21, 1.36s/it] 17%|█▋ | 10703/61904 [5:10:51<19:11:33, 1.35s/it] 17%|█▋ | 10704/61904 [5:10:52<18:52:49, 1.33s/it] 17%|█▋ | 10705/61904 [5:10:54<18:42:38, 1.32s/it] 17%|█▋ | 10706/61904 [5:10:55<19:06:42, 1.34s/it] 17%|█▋ | 10707/61904 [5:10:57<19:30:39, 1.37s/it] 17%|█▋ | 10708/61904 [5:10:58<19:57:49, 1.40s/it] 17%|█▋ | 10709/61904 [5:10:59<19:49:36, 1.39s/it] 17%|█▋ | 10710/61904 [5:11:01<19:34:32, 1.38s/it] 17%|█▋ | 10711/61904 [5:11:02<20:16:47, 1.43s/it] 17%|█▋ | 10712/61904 [5:11:04<19:45:51, 1.39s/it] 17%|█▋ | 10713/61904 [5:11:05<19:41:09, 1.38s/it] 17%|█▋ | 10714/61904 [5:11:06<19:40:21, 1.38s/it] 17%|█▋ | 10715/61904 [5:11:08<19:32:24, 1.37s/it] 17%|█▋ | 10716/61904 [5:11:09<19:24:51, 1.37s/it] 17%|█▋ | 10717/61904 [5:11:10<19:24:55, 1.37s/it] 17%|█▋ | 10718/61904 [5:11:12<18:59:46, 1.34s/it] 17%|█▋ | 10719/61904 [5:11:13<18:52:40, 1.33s/it] 17%|█▋ | 10720/61904 [5:11:14<19:04:12, 1.34s/it] {'loss': 2.8024, 'learning_rate': 1.829508621807338e-07, 'epoch': 2.77} 17%|█▋ | 10720/61904 [5:11:14<19:04:12, 1.34s/it] 17%|█▋ | 10721/61904 [5:11:16<18:53:00, 1.33s/it] 17%|█▋ | 10722/61904 [5:11:17<19:01:44, 1.34s/it] 17%|█▋ | 10723/61904 [5:11:18<19:31:29, 1.37s/it] 17%|█▋ | 10724/61904 [5:11:20<19:00:19, 1.34s/it] 17%|█▋ | 10725/61904 [5:11:21<18:51:12, 1.33s/it] 17%|█▋ | 10726/61904 [5:11:22<18:50:41, 1.33s/it] 17%|█▋ | 10727/61904 [5:11:24<19:06:52, 1.34s/it] 17%|█▋ | 10728/61904 [5:11:25<18:50:19, 
1.33s/it] 17%|█▋ | 10729/61904 [5:11:27<19:39:35, 1.38s/it] 17%|█▋ | 10730/61904 [5:11:28<20:00:53, 1.41s/it] 17%|█▋ | 10731/61904 [5:11:30<20:25:22, 1.44s/it] 17%|█▋ | 10732/61904 [5:11:31<20:04:08, 1.41s/it] 17%|█▋ | 10733/61904 [5:11:32<19:50:06, 1.40s/it] 17%|█▋ | 10734/61904 [5:11:34<19:37:54, 1.38s/it] 17%|█▋ | 10735/61904 [5:11:35<20:17:28, 1.43s/it] 17%|█▋ | 10736/61904 [5:11:36<20:02:13, 1.41s/it] 17%|█▋ | 10737/61904 [5:11:38<19:58:12, 1.41s/it] 17%|█▋ | 10738/61904 [5:11:39<19:37:48, 1.38s/it] 17%|█▋ | 10739/61904 [5:11:41<19:56:54, 1.40s/it] 17%|█▋ | 10740/61904 [5:11:42<19:48:27, 1.39s/it] {'loss': 2.8195, 'learning_rate': 1.8291844937119147e-07, 'epoch': 2.78} 17%|█▋ | 10740/61904 [5:11:42<19:48:27, 1.39s/it] 17%|█▋ | 10741/61904 [5:11:43<19:23:42, 1.36s/it] 17%|█▋ | 10742/61904 [5:11:45<19:35:48, 1.38s/it] 17%|█▋ | 10743/61904 [5:11:46<20:29:57, 1.44s/it] 17%|█▋ | 10744/61904 [5:11:48<19:25:10, 1.37s/it] 17%|█▋ | 10745/61904 [5:11:49<20:12:32, 1.42s/it] 17%|█▋ | 10746/61904 [5:11:50<19:55:27, 1.40s/it] 17%|█▋ | 10747/61904 [5:11:52<19:28:52, 1.37s/it] 17%|█▋ | 10748/61904 [5:11:53<19:41:54, 1.39s/it] 17%|█▋ | 10749/61904 [5:11:55<19:42:19, 1.39s/it] 17%|█▋ | 10750/61904 [5:11:56<19:35:12, 1.38s/it] 17%|█▋ | 10751/61904 [5:11:57<19:23:01, 1.36s/it] 17%|█▋ | 10752/61904 [5:11:59<19:24:07, 1.37s/it] 17%|█▋ | 10753/61904 [5:12:00<19:17:52, 1.36s/it] 17%|█▋ | 10754/61904 [5:12:01<19:14:17, 1.35s/it] 17%|█▋ | 10755/61904 [5:12:02<18:37:30, 1.31s/it] 17%|█▋ | 10756/61904 [5:12:04<18:27:46, 1.30s/it] 17%|█▋ | 10757/61904 [5:12:05<18:32:21, 1.30s/it] 17%|█▋ | 10758/61904 [5:12:07<19:11:31, 1.35s/it] 17%|█▋ | 10759/61904 [5:12:08<21:14:15, 1.49s/it] 17%|█▋ | 10760/61904 [5:12:10<20:36:16, 1.45s/it] {'loss': 2.8792, 'learning_rate': 1.8288603656164916e-07, 'epoch': 2.78} 17%|█▋ | 10760/61904 [5:12:10<20:36:16, 1.45s/it] 17%|█▋ | 10761/61904 [5:12:11<20:23:12, 1.44s/it] 17%|█▋ | 10762/61904 [5:12:13<20:25:19, 1.44s/it] 17%|█▋ | 10763/61904 [5:12:14<20:47:39, 
1.46s/it] 17%|█▋ | 10764/61904 [5:12:15<19:56:21, 1.40s/it] 17%|█▋ | 10765/61904 [5:12:17<20:03:46, 1.41s/it] 17%|█▋ | 10766/61904 [5:12:18<19:55:55, 1.40s/it] 17%|█▋ | 10767/61904 [5:12:20<19:49:53, 1.40s/it] 17%|█▋ | 10768/61904 [5:12:21<19:13:37, 1.35s/it] 17%|█▋ | 10769/61904 [5:12:22<18:50:17, 1.33s/it] 17%|█▋ | 10770/61904 [5:12:23<18:24:01, 1.30s/it] 17%|█▋ | 10771/61904 [5:12:25<19:08:09, 1.35s/it] 17%|█▋ | 10772/61904 [5:12:26<19:01:29, 1.34s/it] 17%|█▋ | 10773/61904 [5:12:27<18:49:06, 1.32s/it] 17%|█▋ | 10774/61904 [5:12:29<18:49:58, 1.33s/it] 17%|█▋ | 10775/61904 [5:12:30<19:09:30, 1.35s/it] 17%|█▋ | 10776/61904 [5:12:32<19:36:14, 1.38s/it] 17%|█▋ | 10777/61904 [5:12:33<19:34:54, 1.38s/it] 17%|█▋ | 10778/61904 [5:12:34<19:09:34, 1.35s/it] 17%|█▋ | 10779/61904 [5:12:36<19:03:50, 1.34s/it] 17%|█▋ | 10780/61904 [5:12:37<19:28:10, 1.37s/it] {'loss': 2.8563, 'learning_rate': 1.8285362375210682e-07, 'epoch': 2.79} 17%|█▋ | 10780/61904 [5:12:37<19:28:10, 1.37s/it] 17%|█▋ | 10781/61904 [5:12:38<19:23:59, 1.37s/it] 17%|█▋ | 10782/61904 [5:12:40<18:56:41, 1.33s/it] 17%|█▋ | 10783/61904 [5:12:41<18:38:45, 1.31s/it] 17%|█▋ | 10784/61904 [5:12:42<18:43:37, 1.32s/it] 17%|█▋ | 10785/61904 [5:12:44<19:20:51, 1.36s/it] 17%|█▋ | 10786/61904 [5:12:45<18:58:05, 1.34s/it] 17%|█▋ | 10787/61904 [5:12:46<18:51:11, 1.33s/it] 17%|█▋ | 10788/61904 [5:12:48<19:01:23, 1.34s/it] 17%|█▋ | 10789/61904 [5:12:49<19:23:16, 1.37s/it] 17%|█▋ | 10790/61904 [5:12:50<19:28:27, 1.37s/it] 17%|█▋ | 10791/61904 [5:12:52<19:13:31, 1.35s/it] 17%|█▋ | 10792/61904 [5:12:53<19:29:49, 1.37s/it] 17%|█▋ | 10793/61904 [5:12:54<19:16:53, 1.36s/it] 17%|█▋ | 10794/61904 [5:12:56<18:41:58, 1.32s/it] 17%|█▋ | 10795/61904 [5:12:57<18:37:55, 1.31s/it] 17%|█▋ | 10796/61904 [5:12:58<18:34:14, 1.31s/it] 17%|█▋ | 10797/61904 [5:13:00<18:55:52, 1.33s/it] 17%|█▋ | 10798/61904 [5:13:01<18:49:55, 1.33s/it] 17%|█▋ | 10799/61904 [5:13:02<18:25:36, 1.30s/it] 17%|█▋ | 10800/61904 [5:13:04<19:25:24, 1.37s/it] {'loss': 2.7792, 
'learning_rate': 1.8282121094256449e-07, 'epoch': 2.79} 17%|█▋ | 10800/61904 [5:13:04<19:25:24, 1.37s/it] 17%|█▋ | 10801/61904 [5:13:05<19:32:40, 1.38s/it] 17%|█▋ | 10802/61904 [5:13:06<19:17:40, 1.36s/it] 17%|█▋ | 10803/61904 [5:13:08<19:15:53, 1.36s/it] 17%|█▋ | 10804/61904 [5:13:09<19:25:36, 1.37s/it] 17%|█▋ | 10805/61904 [5:13:11<19:18:50, 1.36s/it] 17%|█▋ | 10806/61904 [5:13:12<19:11:15, 1.35s/it] 17%|█▋ | 10807/61904 [5:13:13<19:54:31, 1.40s/it] 17%|█▋ | 10808/61904 [5:13:15<19:21:44, 1.36s/it] 17%|█▋ | 10809/61904 [5:13:16<19:43:51, 1.39s/it] 17%|█▋ | 10810/61904 [5:13:18<19:46:02, 1.39s/it] 17%|█▋ | 10811/61904 [5:13:19<19:42:09, 1.39s/it] 17%|█▋ | 10812/61904 [5:13:20<19:29:26, 1.37s/it] 17%|█▋ | 10813/61904 [5:13:21<18:57:26, 1.34s/it] 17%|█▋ | 10814/61904 [5:13:23<19:00:12, 1.34s/it] 17%|█▋ | 10815/61904 [5:13:24<18:41:39, 1.32s/it] 17%|█▋ | 10816/61904 [5:13:25<18:31:37, 1.31s/it] 17%|█▋ | 10817/61904 [5:13:27<18:48:27, 1.33s/it] 17%|█▋ | 10818/61904 [5:13:28<19:04:32, 1.34s/it] 17%|█▋ | 10819/61904 [5:13:29<18:37:09, 1.31s/it] 17%|█▋ | 10820/61904 [5:13:31<19:25:24, 1.37s/it] {'loss': 2.8382, 'learning_rate': 1.8278879813302218e-07, 'epoch': 2.8} 17%|█▋ | 10820/61904 [5:13:31<19:25:24, 1.37s/it] 17%|█▋ | 10821/61904 [5:13:32<19:26:58, 1.37s/it] 17%|█▋ | 10822/61904 [5:13:34<19:03:10, 1.34s/it] 17%|█▋ | 10823/61904 [5:13:35<18:48:09, 1.33s/it] 17%|█▋ | 10824/61904 [5:13:36<19:21:25, 1.36s/it] 17%|█▋ | 10825/61904 [5:13:38<19:40:49, 1.39s/it] 17%|█▋ | 10826/61904 [5:13:39<19:15:51, 1.36s/it] 17%|█▋ | 10827/61904 [5:13:40<19:25:33, 1.37s/it] 17%|█▋ | 10828/61904 [5:13:42<19:25:07, 1.37s/it] 17%|█▋ | 10829/61904 [5:13:43<20:01:16, 1.41s/it] 17%|█▋ | 10830/61904 [5:13:45<19:27:43, 1.37s/it] 17%|█▋ | 10831/61904 [5:13:46<19:46:43, 1.39s/it] 17%|█▋ | 10832/61904 [5:13:47<19:39:13, 1.39s/it] 17%|█▋ | 10833/61904 [5:13:49<20:14:31, 1.43s/it] 18%|█▊ | 10834/61904 [5:13:50<19:33:49, 1.38s/it] 18%|█▊ | 10835/61904 [5:13:51<19:03:47, 1.34s/it] 18%|█▊ | 10836/61904 
[5:13:53<18:49:05, 1.33s/it] 18%|█▊ | 10837/61904 [5:13:54<19:27:44, 1.37s/it] 18%|█▊ | 10838/61904 [5:13:56<19:16:46, 1.36s/it] 18%|█▊ | 10839/61904 [5:13:57<18:59:52, 1.34s/it] 18%|█▊ | 10840/61904 [5:13:58<18:53:10, 1.33s/it] {'loss': 2.7697, 'learning_rate': 1.8275638532347984e-07, 'epoch': 2.8} 18%|█▊ | 10840/61904 [5:13:58<18:53:10, 1.33s/it] 18%|█▊ | 10841/61904 [5:13:59<19:04:34, 1.34s/it] 18%|█▊ | 10842/61904 [5:14:01<19:01:44, 1.34s/it] 18%|█▊ | 10843/61904 [5:14:02<18:33:39, 1.31s/it] 18%|█▊ | 10844/61904 [5:14:03<19:01:28, 1.34s/it] 18%|█▊ | 10845/61904 [5:14:05<18:47:06, 1.32s/it] 18%|█▊ | 10846/61904 [5:14:06<18:30:02, 1.30s/it] 18%|█▊ | 10847/61904 [5:14:07<19:00:58, 1.34s/it] 18%|█▊ | 10848/61904 [5:14:09<18:58:18, 1.34s/it] 18%|█▊ | 10849/61904 [5:14:10<18:49:34, 1.33s/it] 18%|█▊ | 10850/61904 [5:14:11<18:53:30, 1.33s/it] 18%|█▊ | 10851/61904 [5:14:13<19:33:05, 1.38s/it] 18%|█▊ | 10852/61904 [5:14:14<19:09:31, 1.35s/it] 18%|█▊ | 10853/61904 [5:14:16<19:47:03, 1.40s/it] 18%|█▊ | 10854/61904 [5:14:17<19:30:42, 1.38s/it] 18%|█▊ | 10855/61904 [5:14:18<19:03:07, 1.34s/it] 18%|█▊ | 10856/61904 [5:14:20<18:51:06, 1.33s/it] 18%|█▊ | 10857/61904 [5:14:21<18:39:12, 1.32s/it] 18%|█▊ | 10858/61904 [5:14:22<18:17:45, 1.29s/it] 18%|█▊ | 10859/61904 [5:14:23<18:24:56, 1.30s/it] 18%|█▊ | 10860/61904 [5:14:25<18:55:16, 1.33s/it] {'loss': 2.8915, 'learning_rate': 1.827239725139375e-07, 'epoch': 2.81} 18%|█▊ | 10860/61904 [5:14:25<18:55:16, 1.33s/it] 18%|█▊ | 10861/61904 [5:14:26<18:55:23, 1.33s/it] 18%|█▊ | 10862/61904 [5:14:28<19:13:21, 1.36s/it] 18%|█▊ | 10863/61904 [5:14:29<19:22:36, 1.37s/it] 18%|█▊ | 10864/61904 [5:14:30<19:56:19, 1.41s/it] 18%|█▊ | 10865/61904 [5:14:32<19:54:45, 1.40s/it] 18%|█▊ | 10866/61904 [5:14:33<19:27:46, 1.37s/it] 18%|█▊ | 10867/61904 [5:14:34<19:10:30, 1.35s/it] 18%|█▊ | 10868/61904 [5:14:36<19:08:36, 1.35s/it] 18%|█▊ | 10869/61904 [5:14:37<19:23:04, 1.37s/it] 18%|█▊ | 10870/61904 [5:14:39<19:16:26, 1.36s/it] 18%|█▊ | 10871/61904 
[5:14:40<19:04:26, 1.35s/it] 18%|█▊ | 10872/61904 [5:14:41<19:54:54, 1.40s/it] 18%|█▊ | 10873/61904 [5:14:43<20:28:39, 1.44s/it] 18%|█▊ | 10874/61904 [5:14:44<20:21:29, 1.44s/it] 18%|█▊ | 10875/61904 [5:14:46<19:55:51, 1.41s/it] 18%|█▊ | 10876/61904 [5:14:47<19:50:36, 1.40s/it] 18%|█▊ | 10877/61904 [5:14:48<19:38:32, 1.39s/it] 18%|█▊ | 10878/61904 [5:14:50<19:42:42, 1.39s/it] 18%|█▊ | 10879/61904 [5:14:51<19:24:29, 1.37s/it] 18%|█▊ | 10880/61904 [5:14:53<19:24:04, 1.37s/it] {'loss': 2.8202, 'learning_rate': 1.826915597043952e-07, 'epoch': 2.81} 18%|█▊ | 10880/61904 [5:14:53<19:24:04, 1.37s/it] 18%|█▊ | 10881/61904 [5:14:54<20:05:20, 1.42s/it] 18%|█▊ | 10882/61904 [5:14:55<20:04:08, 1.42s/it] 18%|█▊ | 10883/61904 [5:14:57<19:52:27, 1.40s/it] 18%|█▊ | 10884/61904 [5:14:58<19:24:03, 1.37s/it] 18%|█▊ | 10885/61904 [5:15:00<19:20:31, 1.36s/it] 18%|█▊ | 10886/61904 [5:15:01<19:05:56, 1.35s/it] 18%|█▊ | 10887/61904 [5:15:02<19:18:20, 1.36s/it] 18%|█▊ | 10888/61904 [5:15:04<19:13:22, 1.36s/it] 18%|█▊ | 10889/61904 [5:15:05<19:02:56, 1.34s/it] 18%|█▊ | 10890/61904 [5:15:06<19:06:56, 1.35s/it] 18%|█▊ | 10891/61904 [5:15:08<20:24:33, 1.44s/it] 18%|█▊ | 10892/61904 [5:15:09<20:04:48, 1.42s/it] 18%|█▊ | 10893/61904 [5:15:11<19:28:48, 1.37s/it] 18%|█▊ | 10894/61904 [5:15:12<19:20:17, 1.36s/it] 18%|█▊ | 10895/61904 [5:15:13<20:02:26, 1.41s/it] 18%|█▊ | 10896/61904 [5:15:15<19:19:18, 1.36s/it] 18%|█▊ | 10897/61904 [5:15:16<19:26:01, 1.37s/it] 18%|█▊ | 10898/61904 [5:15:17<19:27:40, 1.37s/it] 18%|█▊ | 10899/61904 [5:15:19<19:24:16, 1.37s/it] 18%|█▊ | 10900/61904 [5:15:20<19:40:06, 1.39s/it] {'loss': 2.7699, 'learning_rate': 1.8265914689485282e-07, 'epoch': 2.82} 18%|█▊ | 10900/61904 [5:15:20<19:40:06, 1.39s/it] 18%|█▊ | 10901/61904 [5:15:22<19:41:09, 1.39s/it] 18%|█▊ | 10902/61904 [5:15:23<20:08:39, 1.42s/it] 18%|█▊ | 10903/61904 [5:15:24<20:03:51, 1.42s/it] 18%|█▊ | 10904/61904 [5:15:26<19:47:07, 1.40s/it] 18%|█▊ | 10905/61904 [5:15:27<19:44:00, 1.39s/it] 18%|█▊ | 10906/61904 
[5:15:29<19:37:59, 1.39s/it] 18%|█▊ | 10907/61904 [5:15:30<19:56:59, 1.41s/it] 18%|█▊ | 10908/61904 [5:15:31<19:42:43, 1.39s/it] 18%|█▊ | 10909/61904 [5:15:33<20:06:42, 1.42s/it] 18%|█▊ | 10910/61904 [5:15:35<20:56:08, 1.48s/it] 18%|█▊ | 10911/61904 [5:15:36<20:06:25, 1.42s/it] 18%|█▊ | 10912/61904 [5:15:37<19:53:57, 1.40s/it] 18%|█▊ | 10913/61904 [5:15:39<19:59:34, 1.41s/it] 18%|█▊ | 10914/61904 [5:15:40<19:27:34, 1.37s/it] 18%|█▊ | 10915/61904 [5:15:41<18:59:32, 1.34s/it] 18%|█▊ | 10916/61904 [5:15:43<19:11:52, 1.36s/it] 18%|█▊ | 10917/61904 [5:15:44<19:44:48, 1.39s/it] 18%|█▊ | 10918/61904 [5:15:45<19:15:20, 1.36s/it] 18%|█▊ | 10919/61904 [5:15:47<19:25:37, 1.37s/it] 18%|█▊ | 10920/61904 [5:15:48<19:07:06, 1.35s/it] {'loss': 2.8795, 'learning_rate': 1.826267340853105e-07, 'epoch': 2.82} 18%|█▊ | 10920/61904 [5:15:48<19:07:06, 1.35s/it] 18%|█▊ | 10921/61904 [5:15:49<18:58:03, 1.34s/it] 18%|█▊ | 10922/61904 [5:15:51<19:26:16, 1.37s/it] 18%|█▊ | 10923/61904 [5:15:52<19:30:32, 1.38s/it] 18%|█▊ | 10924/61904 [5:15:54<19:51:07, 1.40s/it] 18%|█▊ | 10925/61904 [5:15:55<20:17:13, 1.43s/it] 18%|█▊ | 10926/61904 [5:15:57<20:24:35, 1.44s/it] 18%|█▊ | 10927/61904 [5:15:58<20:59:56, 1.48s/it] 18%|█▊ | 10928/61904 [5:15:59<20:20:39, 1.44s/it] 18%|█▊ | 10929/61904 [5:16:01<20:26:27, 1.44s/it] 18%|█▊ | 10930/61904 [5:16:02<19:49:01, 1.40s/it] 18%|█▊ | 10931/61904 [5:16:04<19:52:04, 1.40s/it] 18%|█▊ | 10932/61904 [5:16:05<20:20:57, 1.44s/it] 18%|█▊ | 10933/61904 [5:16:07<19:58:49, 1.41s/it] 18%|█▊ | 10934/61904 [5:16:08<19:42:26, 1.39s/it] 18%|█▊ | 10935/61904 [5:16:09<19:45:20, 1.40s/it] 18%|█▊ | 10936/61904 [5:16:11<19:19:08, 1.36s/it] 18%|█▊ | 10937/61904 [5:16:12<19:12:57, 1.36s/it] 18%|█▊ | 10938/61904 [5:16:13<18:41:01, 1.32s/it] 18%|█▊ | 10939/61904 [5:16:15<18:58:36, 1.34s/it] 18%|█▊ | 10940/61904 [5:16:16<18:59:50, 1.34s/it] {'loss': 2.7777, 'learning_rate': 1.8259432127576817e-07, 'epoch': 2.83} 18%|█▊ | 10940/61904 [5:16:16<18:59:50, 1.34s/it] 18%|█▊ | 10941/61904 
[5:16:17<18:58:57, 1.34s/it] 18%|█▊ | 10942/61904 [5:16:18<18:43:01, 1.32s/it] 18%|█▊ | 10943/61904 [5:16:20<18:45:21, 1.32s/it] 18%|█▊ | 10944/61904 [5:16:21<18:36:26, 1.31s/it] 18%|█▊ | 10945/61904 [5:16:22<18:55:17, 1.34s/it] 18%|█▊ | 10946/61904 [5:16:24<19:55:18, 1.41s/it] 18%|█▊ | 10947/61904 [5:16:25<19:35:24, 1.38s/it] 18%|█▊ | 10948/61904 [5:16:27<19:43:09, 1.39s/it] 18%|█▊ | 10949/61904 [5:16:28<19:25:31, 1.37s/it] 18%|█▊ | 10950/61904 [5:16:29<19:14:23, 1.36s/it] 18%|█▊ | 10951/61904 [5:16:31<19:58:38, 1.41s/it] 18%|█▊ | 10952/61904 [5:16:32<20:10:17, 1.43s/it] 18%|█▊ | 10953/61904 [5:16:34<19:39:05, 1.39s/it] 18%|█▊ | 10954/61904 [5:16:35<19:00:51, 1.34s/it] 18%|█▊ | 10955/61904 [5:16:36<19:11:47, 1.36s/it] 18%|█▊ | 10956/61904 [5:16:38<19:12:47, 1.36s/it] 18%|█▊ | 10957/61904 [5:16:39<19:15:31, 1.36s/it] 18%|█▊ | 10958/61904 [5:16:40<19:09:35, 1.35s/it] 18%|█▊ | 10959/61904 [5:16:42<18:41:06, 1.32s/it] 18%|█▊ | 10960/61904 [5:16:43<19:10:35, 1.36s/it] {'loss': 2.8015, 'learning_rate': 1.8256190846622583e-07, 'epoch': 2.83} 18%|█▊ | 10960/61904 [5:16:43<19:10:35, 1.36s/it] 18%|█▊ | 10961/61904 [5:16:45<19:59:54, 1.41s/it] 18%|█▊ | 10962/61904 [5:16:46<20:18:39, 1.44s/it] 18%|█▊ | 10963/61904 [5:16:47<19:55:21, 1.41s/it] 18%|█▊ | 10964/61904 [5:16:49<20:01:33, 1.42s/it] 18%|█▊ | 10965/61904 [5:16:50<19:24:31, 1.37s/it] 18%|█▊ | 10966/61904 [5:16:52<19:27:38, 1.38s/it] 18%|█▊ | 10967/61904 [5:16:53<19:43:26, 1.39s/it] 18%|█▊ | 10968/61904 [5:16:54<19:09:25, 1.35s/it] 18%|█▊ | 10969/61904 [5:16:56<19:36:05, 1.39s/it] 18%|█▊ | 10970/61904 [5:16:57<19:13:49, 1.36s/it] 18%|█▊ | 10971/61904 [5:16:58<19:20:17, 1.37s/it] 18%|█▊ | 10972/61904 [5:17:00<19:13:08, 1.36s/it] 18%|█▊ | 10973/61904 [5:17:01<19:35:08, 1.38s/it] 18%|█▊ | 10974/61904 [5:17:02<19:01:15, 1.34s/it] 18%|█▊ | 10975/61904 [5:17:04<18:43:13, 1.32s/it] 18%|█▊ | 10976/61904 [5:17:05<19:18:14, 1.36s/it] 18%|█▊ | 10977/61904 [5:17:07<19:35:23, 1.38s/it] 18%|█▊ | 10978/61904 [5:17:08<20:10:17, 
1.43s/it] 18%|█▊ | 10979/61904 [5:17:10<20:16:20, 1.43s/it] 18%|█▊ | 10980/61904 [5:17:11<19:15:51, 1.36s/it] {'loss': 2.8306, 'learning_rate': 1.8252949565668352e-07, 'epoch': 2.84} 18%|█▊ | 10980/61904 [5:17:11<19:15:51, 1.36s/it] 18%|█▊ | 10981/61904 [5:17:12<19:07:51, 1.35s/it] 18%|█▊ | 10982/61904 [5:17:13<19:02:23, 1.35s/it] 18%|█▊ | 10983/61904 [5:17:15<19:37:03, 1.39s/it] 18%|█▊ | 10984/61904 [5:17:16<19:26:33, 1.37s/it] 18%|█▊ | 10985/61904 [5:17:18<18:56:11, 1.34s/it] 18%|█▊ | 10986/61904 [5:17:19<19:11:00, 1.36s/it] 18%|█▊ | 10987/61904 [5:17:20<18:44:10, 1.32s/it] 18%|█▊ | 10988/61904 [5:17:22<18:51:11, 1.33s/it] 18%|█▊ | 10989/61904 [5:17:23<19:00:19, 1.34s/it] 18%|█▊ | 10990/61904 [5:17:24<18:47:41, 1.33s/it] 18%|█▊ | 10991/61904 [5:17:26<19:28:04, 1.38s/it] 18%|█▊ | 10992/61904 [5:17:27<19:57:00, 1.41s/it] 18%|█▊ | 10993/61904 [5:17:28<19:21:53, 1.37s/it] 18%|█▊ | 10994/61904 [5:17:30<19:05:21, 1.35s/it] 18%|█▊ | 10995/61904 [5:17:31<19:19:39, 1.37s/it] 18%|█▊ | 10996/61904 [5:17:32<18:52:49, 1.34s/it] 18%|█▊ | 10997/61904 [5:17:34<18:30:27, 1.31s/it] 18%|█▊ | 10998/61904 [5:17:35<18:34:43, 1.31s/it] 18%|█▊ | 10999/61904 [5:17:36<18:26:25, 1.30s/it] 18%|█▊ | 11000/61904 [5:17:38<18:55:49, 1.34s/it] {'loss': 2.986, 'learning_rate': 1.8249708284714118e-07, 'epoch': 2.84} 18%|█▊ | 11000/61904 [5:17:38<18:55:49, 1.34s/it] 18%|█▊ | 11001/61904 [5:17:39<18:56:41, 1.34s/it] 18%|█▊ | 11002/61904 [5:17:40<19:01:44, 1.35s/it] 18%|█▊ | 11003/61904 [5:17:42<18:46:15, 1.33s/it] 18%|█▊ | 11004/61904 [5:17:43<18:42:54, 1.32s/it] 18%|█▊ | 11005/61904 [5:17:44<19:04:48, 1.35s/it] 18%|█▊ | 11006/61904 [5:17:46<19:36:27, 1.39s/it] 18%|█▊ | 11007/61904 [5:17:47<19:26:10, 1.37s/it] 18%|█▊ | 11008/61904 [5:17:48<18:50:00, 1.33s/it] 18%|█▊ | 11009/61904 [5:17:50<19:47:30, 1.40s/it] 18%|█▊ | 11010/61904 [5:17:51<19:25:56, 1.37s/it] 18%|█▊ | 11011/61904 [5:17:53<19:35:57, 1.39s/it] 18%|█▊ | 11012/61904 [5:17:54<19:08:29, 1.35s/it] 18%|█▊ | 11013/61904 [5:17:56<19:42:45, 
1.39s/it] 18%|█▊ | 11014/61904 [5:17:57<20:08:48, 1.43s/it] 18%|█▊ | 11015/61904 [5:17:59<20:26:49, 1.45s/it] 18%|█▊ | 11016/61904 [5:18:00<21:06:34, 1.49s/it] 18%|█▊ | 11017/61904 [5:18:01<20:32:57, 1.45s/it] 18%|█▊ | 11018/61904 [5:18:03<19:43:10, 1.40s/it] 18%|█▊ | 11019/61904 [5:18:04<19:37:12, 1.39s/it] 18%|█▊ | 11020/61904 [5:18:06<20:01:22, 1.42s/it] {'loss': 2.785, 'learning_rate': 1.8246467003759885e-07, 'epoch': 2.85} 18%|█▊ | 11020/61904 [5:18:06<20:01:22, 1.42s/it] 18%|█▊ | 11021/61904 [5:18:07<19:32:16, 1.38s/it] 18%|█▊ | 11022/61904 [5:18:08<18:53:35, 1.34s/it] 18%|█▊ | 11023/61904 [5:18:09<19:00:04, 1.34s/it] 18%|█▊ | 11024/61904 [5:18:11<18:57:23, 1.34s/it] 18%|█▊ | 11025/61904 [5:18:12<18:41:41, 1.32s/it] 18%|█▊ | 11026/61904 [5:18:14<19:19:00, 1.37s/it] 18%|█▊ | 11027/61904 [5:18:15<18:53:45, 1.34s/it] 18%|█▊ | 11028/61904 [5:18:16<18:40:41, 1.32s/it] 18%|█▊ | 11029/61904 [5:18:17<18:26:29, 1.30s/it] 18%|█▊ | 11030/61904 [5:18:19<19:06:12, 1.35s/it] 18%|█▊ | 11031/61904 [5:18:20<18:37:55, 1.32s/it] 18%|█▊ | 11032/61904 [5:18:21<18:49:31, 1.33s/it] 18%|█▊ | 11033/61904 [5:18:23<19:19:29, 1.37s/it] 18%|█▊ | 11034/61904 [5:18:24<19:13:08, 1.36s/it] 18%|█▊ | 11035/61904 [5:18:26<18:51:27, 1.33s/it] 18%|█▊ | 11036/61904 [5:18:27<19:06:02, 1.35s/it] 18%|█▊ | 11037/61904 [5:18:28<19:17:37, 1.37s/it] 18%|█▊ | 11038/61904 [5:18:30<19:40:42, 1.39s/it] 18%|█▊ | 11039/61904 [5:18:31<19:03:23, 1.35s/it] 18%|█▊ | 11040/61904 [5:18:32<19:01:47, 1.35s/it] {'loss': 2.7796, 'learning_rate': 1.8243225722805653e-07, 'epoch': 2.85} 18%|█▊ | 11040/61904 [5:18:32<19:01:47, 1.35s/it] 18%|█▊ | 11041/61904 [5:18:34<19:15:39, 1.36s/it] 18%|█▊ | 11042/61904 [5:18:35<19:31:57, 1.38s/it] 18%|█▊ | 11043/61904 [5:18:37<19:29:37, 1.38s/it] 18%|█▊ | 11044/61904 [5:18:38<19:13:57, 1.36s/it] 18%|█▊ | 11045/61904 [5:18:39<19:21:36, 1.37s/it] 18%|█▊ | 11046/61904 [5:18:41<18:58:42, 1.34s/it] 18%|█▊ | 11047/61904 [5:18:42<18:30:53, 1.31s/it] 18%|█▊ | 11048/61904 [5:18:43<19:30:15, 
1.38s/it] 18%|█▊ | 11049/61904 [5:18:45<19:57:56, 1.41s/it] 18%|█▊ | 11050/61904 [5:18:46<19:50:46, 1.40s/it] 18%|█▊ | 11051/61904 [5:18:48<20:21:42, 1.44s/it] 18%|█▊ | 11052/61904 [5:18:49<20:13:02, 1.43s/it] 18%|█▊ | 11053/61904 [5:18:50<19:43:04, 1.40s/it] 18%|█▊ | 11054/61904 [5:18:52<19:09:59, 1.36s/it] 18%|█▊ | 11055/61904 [5:18:53<19:20:16, 1.37s/it] 18%|█▊ | 11056/61904 [5:18:55<19:54:27, 1.41s/it] 18%|█▊ | 11057/61904 [5:18:56<19:56:48, 1.41s/it] 18%|█▊ | 11058/61904 [5:18:57<19:36:42, 1.39s/it] 18%|█▊ | 11059/61904 [5:18:59<19:43:14, 1.40s/it] 18%|█▊ | 11060/61904 [5:19:00<20:02:14, 1.42s/it] {'loss': 2.7851, 'learning_rate': 1.823998444185142e-07, 'epoch': 2.86} 18%|█▊ | 11060/61904 [5:19:00<20:02:14, 1.42s/it] 18%|█▊ | 11061/61904 [5:19:02<19:59:21, 1.42s/it] 18%|█▊ | 11062/61904 [5:19:03<19:44:09, 1.40s/it] 18%|█▊ | 11063/61904 [5:19:04<19:22:13, 1.37s/it] 18%|█▊ | 11064/61904 [5:19:06<19:20:57, 1.37s/it] 18%|█▊ | 11065/61904 [5:19:07<18:53:15, 1.34s/it] 18%|█▊ | 11066/61904 [5:19:08<19:39:21, 1.39s/it] 18%|█▊ | 11067/61904 [5:19:10<20:07:56, 1.43s/it] 18%|█▊ | 11068/61904 [5:19:11<19:15:15, 1.36s/it] 18%|█▊ | 11069/61904 [5:19:13<19:10:51, 1.36s/it] 18%|█▊ | 11070/61904 [5:19:14<19:00:22, 1.35s/it] 18%|█▊ | 11071/61904 [5:19:15<19:41:17, 1.39s/it] 18%|█▊ | 11072/61904 [5:19:17<19:26:30, 1.38s/it] 18%|█▊ | 11073/61904 [5:19:18<19:39:12, 1.39s/it] 18%|█▊ | 11074/61904 [5:19:19<19:12:46, 1.36s/it] 18%|█▊ | 11075/61904 [5:19:21<19:19:34, 1.37s/it] 18%|█▊ | 11076/61904 [5:19:22<18:46:21, 1.33s/it] 18%|█▊ | 11077/61904 [5:19:23<19:00:18, 1.35s/it] 18%|█▊ | 11078/61904 [5:19:25<18:41:02, 1.32s/it] 18%|█▊ | 11079/61904 [5:19:26<18:58:41, 1.34s/it] 18%|█▊ | 11080/61904 [5:19:27<18:49:05, 1.33s/it] {'loss': 2.8133, 'learning_rate': 1.8236743160897186e-07, 'epoch': 2.86} 18%|█▊ | 11080/61904 [5:19:27<18:49:05, 1.33s/it] 18%|█▊ | 11081/61904 [5:19:29<18:58:45, 1.34s/it] 18%|█▊ | 11082/61904 [5:19:30<18:49:45, 1.33s/it] 18%|█▊ | 11083/61904 [5:19:31<19:00:39, 
1.35s/it] 18%|█▊ | 11084/61904 [5:19:33<19:24:07, 1.37s/it] 18%|█▊ | 11085/61904 [5:19:34<19:23:24, 1.37s/it] 18%|█▊ | 11086/61904 [5:19:36<18:56:50, 1.34s/it] 18%|█▊ | 11087/61904 [5:19:37<18:39:49, 1.32s/it] 18%|█▊ | 11088/61904 [5:19:38<18:52:00, 1.34s/it] 18%|█▊ | 11089/61904 [5:19:40<19:15:44, 1.36s/it] 18%|█▊ | 11090/61904 [5:19:41<18:56:16, 1.34s/it] 18%|█▊ | 11091/61904 [5:19:42<18:55:30, 1.34s/it] 18%|█▊ | 11092/61904 [5:19:44<19:12:30, 1.36s/it] 18%|█▊ | 11093/61904 [5:19:45<20:41:16, 1.47s/it] 18%|█▊ | 11094/61904 [5:19:47<20:13:49, 1.43s/it] 18%|█▊ | 11095/61904 [5:19:48<19:54:42, 1.41s/it] 18%|█▊ | 11096/61904 [5:19:49<19:51:15, 1.41s/it] 18%|█▊ | 11097/61904 [5:19:51<19:31:33, 1.38s/it] 18%|█▊ | 11098/61904 [5:19:52<19:33:52, 1.39s/it] 18%|█▊ | 11099/61904 [5:19:54<19:46:19, 1.40s/it] 18%|█▊ | 11100/61904 [5:19:55<20:08:16, 1.43s/it] {'loss': 2.7767, 'learning_rate': 1.8233501879942955e-07, 'epoch': 2.87} 18%|█▊ | 11100/61904 [5:19:55<20:08:16, 1.43s/it] 18%|█▊ | 11101/61904 [5:19:56<19:31:04, 1.38s/it] 18%|█▊ | 11102/61904 [5:19:58<19:48:29, 1.40s/it] 18%|█▊ | 11103/61904 [5:19:59<19:40:32, 1.39s/it] 18%|█▊ | 11104/61904 [5:20:01<19:23:46, 1.37s/it] 18%|█▊ | 11105/61904 [5:20:02<19:10:11, 1.36s/it] 18%|█▊ | 11106/61904 [5:20:03<19:04:58, 1.35s/it] 18%|█▊ | 11107/61904 [5:20:05<19:23:00, 1.37s/it] 18%|█▊ | 11108/61904 [5:20:06<19:05:58, 1.35s/it] 18%|█▊ | 11109/61904 [5:20:07<18:44:51, 1.33s/it] 18%|█▊ | 11110/61904 [5:20:09<19:19:34, 1.37s/it] 18%|█▊ | 11111/61904 [5:20:10<18:48:41, 1.33s/it] 18%|█▊ | 11112/61904 [5:20:11<18:54:28, 1.34s/it] 18%|█▊ | 11113/61904 [5:20:13<18:46:44, 1.33s/it] 18%|█▊ | 11114/61904 [5:20:14<18:24:41, 1.31s/it] 18%|█▊ | 11115/61904 [5:20:15<18:48:28, 1.33s/it] 18%|█▊ | 11116/61904 [5:20:17<18:52:35, 1.34s/it] 18%|█▊ | 11117/61904 [5:20:18<18:39:15, 1.32s/it] 18%|█▊ | 11118/61904 [5:20:19<18:29:08, 1.31s/it] 18%|█▊ | 11119/61904 [5:20:20<18:29:12, 1.31s/it] 18%|█▊ | 11120/61904 [5:20:22<18:29:09, 1.31s/it] {'loss': 2.8456, 
'learning_rate': 1.8230260598988718e-07, 'epoch': 2.87} 18%|█▊ | 11120/61904 [5:20:22<18:29:09, 1.31s/it] 18%|█▊ | 11121/61904 [5:20:23<19:05:54, 1.35s/it] 18%|█▊ | 11122/61904 [5:20:24<18:23:27, 1.30s/it] 18%|█▊ | 11123/61904 [5:20:26<19:01:44, 1.35s/it] 18%|█▊ | 11124/61904 [5:20:27<19:04:04, 1.35s/it] 18%|█▊ | 11125/61904 [5:20:29<19:04:22, 1.35s/it] 18%|█▊ | 11126/61904 [5:20:30<19:03:39, 1.35s/it] 18%|█▊ | 11127/61904 [5:20:31<18:40:22, 1.32s/it] 18%|█▊ | 11128/61904 [5:20:32<18:09:04, 1.29s/it] 18%|█▊ | 11129/61904 [5:20:34<18:24:33, 1.31s/it] 18%|█▊ | 11130/61904 [5:20:35<19:19:52, 1.37s/it] 18%|█▊ | 11131/61904 [5:20:37<19:06:47, 1.36s/it] 18%|█▊ | 11132/61904 [5:20:38<18:53:37, 1.34s/it] 18%|█▊ | 11133/61904 [5:20:39<19:41:22, 1.40s/it] 18%|█▊ | 11134/61904 [5:20:41<19:57:45, 1.42s/it] 18%|█▊ | 11135/61904 [5:20:42<19:16:07, 1.37s/it] 18%|█▊ | 11136/61904 [5:20:43<18:48:36, 1.33s/it] 18%|█▊ | 11137/61904 [5:20:45<18:36:47, 1.32s/it] 18%|█▊ | 11138/61904 [5:20:46<18:36:55, 1.32s/it] 18%|█▊ | 11139/61904 [5:20:47<18:39:36, 1.32s/it] 18%|█▊ | 11140/61904 [5:20:49<18:26:15, 1.31s/it] {'loss': 2.7697, 'learning_rate': 1.8227019318034487e-07, 'epoch': 2.88} 18%|█▊ | 11140/61904 [5:20:49<18:26:15, 1.31s/it] 18%|█▊ | 11141/61904 [5:20:50<19:25:09, 1.38s/it] 18%|█▊ | 11142/61904 [5:20:52<19:34:08, 1.39s/it] 18%|█▊ | 11143/61904 [5:20:53<19:17:57, 1.37s/it] 18%|█▊ | 11144/61904 [5:20:54<18:46:29, 1.33s/it] 18%|█▊ | 11145/61904 [5:20:55<18:21:27, 1.30s/it] 18%|█▊ | 11146/61904 [5:20:57<18:31:19, 1.31s/it] 18%|█▊ | 11147/61904 [5:20:58<18:49:09, 1.33s/it] 18%|█▊ | 11148/61904 [5:20:59<19:08:14, 1.36s/it] 18%|█▊ | 11149/61904 [5:21:01<19:17:40, 1.37s/it] 18%|█▊ | 11150/61904 [5:21:02<19:28:44, 1.38s/it] 18%|█▊ | 11151/61904 [5:21:04<19:11:29, 1.36s/it] 18%|█▊ | 11152/61904 [5:21:05<19:38:37, 1.39s/it] 18%|█▊ | 11153/61904 [5:21:06<19:41:05, 1.40s/it] 18%|█▊ | 11154/61904 [5:21:08<20:19:57, 1.44s/it] 18%|█▊ | 11155/61904 [5:21:09<19:33:50, 1.39s/it] 18%|█▊ | 11156/61904 
[5:21:11<19:20:47, 1.37s/it] 18%|█▊ | 11157/61904 [5:21:12<19:23:53, 1.38s/it] 18%|█▊ | 11158/61904 [5:21:13<19:30:08, 1.38s/it] 18%|█▊ | 11159/61904 [5:21:15<19:16:55, 1.37s/it] 18%|█▊ | 11160/61904 [5:21:16<19:19:14, 1.37s/it] {'loss': 2.7501, 'learning_rate': 1.8223778037080253e-07, 'epoch': 2.88} 18%|█▊ | 11160/61904 [5:21:16<19:19:14, 1.37s/it] 18%|█▊ | 11161/61904 [5:21:17<19:18:25, 1.37s/it] 18%|█▊ | 11162/61904 [5:21:19<19:37:57, 1.39s/it] 18%|█▊ | 11163/61904 [5:21:20<19:06:16, 1.36s/it] 18%|█▊ | 11164/61904 [5:21:22<19:12:44, 1.36s/it] 18%|█▊ | 11165/61904 [5:21:23<19:05:06, 1.35s/it] 18%|█▊ | 11166/61904 [5:21:24<19:08:18, 1.36s/it] 18%|█▊ | 11167/61904 [5:21:26<19:32:12, 1.39s/it] 18%|█▊ | 11168/61904 [5:21:27<19:19:00, 1.37s/it] 18%|█▊ | 11169/61904 [5:21:28<19:31:29, 1.39s/it] 18%|█▊ | 11170/61904 [5:21:30<21:25:30, 1.52s/it] 18%|█▊ | 11171/61904 [5:21:32<20:28:12, 1.45s/it] 18%|█▊ | 11172/61904 [5:21:33<20:20:04, 1.44s/it] 18%|█▊ | 11173/61904 [5:21:34<20:10:23, 1.43s/it] 18%|█▊ | 11174/61904 [5:21:36<19:50:51, 1.41s/it] 18%|█▊ | 11175/61904 [5:21:37<19:37:16, 1.39s/it] 18%|█▊ | 11176/61904 [5:21:39<20:13:25, 1.44s/it] 18%|█▊ | 11177/61904 [5:21:40<19:59:25, 1.42s/it] 18%|█▊ | 11178/61904 [5:21:41<19:39:48, 1.40s/it] 18%|█▊ | 11179/61904 [5:21:43<19:15:18, 1.37s/it] 18%|█▊ | 11180/61904 [5:21:44<18:46:21, 1.33s/it] {'loss': 2.8102, 'learning_rate': 1.822053675612602e-07, 'epoch': 2.89} 18%|█▊ | 11180/61904 [5:21:44<18:46:21, 1.33s/it] 18%|█▊ | 11181/61904 [5:21:46<19:42:14, 1.40s/it] 18%|█▊ | 11182/61904 [5:21:47<19:32:35, 1.39s/it] 18%|█▊ | 11183/61904 [5:21:48<18:49:44, 1.34s/it] 18%|█▊ | 11184/61904 [5:21:50<19:21:37, 1.37s/it] 18%|█▊ | 11185/61904 [5:21:51<19:49:09, 1.41s/it] 18%|█▊ | 11186/61904 [5:21:52<19:17:47, 1.37s/it] 18%|█▊ | 11187/61904 [5:21:54<19:11:10, 1.36s/it] 18%|█▊ | 11188/61904 [5:21:55<19:30:16, 1.38s/it] 18%|█▊ | 11189/61904 [5:21:56<19:28:02, 1.38s/it] 18%|█▊ | 11190/61904 [5:21:58<19:38:43, 1.39s/it] 18%|█▊ | 11191/61904 
[5:21:59<19:16:42, 1.37s/it] 18%|█▊ | 11192/61904 [5:22:01<19:18:25, 1.37s/it] 18%|█▊ | 11193/61904 [5:22:02<19:54:45, 1.41s/it] 18%|█▊ | 11194/61904 [5:22:03<19:49:13, 1.41s/it] 18%|█▊ | 11195/61904 [5:22:05<19:52:50, 1.41s/it] 18%|█▊ | 11196/61904 [5:22:06<20:31:53, 1.46s/it] 18%|█▊ | 11197/61904 [5:22:08<20:00:47, 1.42s/it] 18%|█▊ | 11198/61904 [5:22:09<20:38:16, 1.47s/it] 18%|█▊ | 11199/61904 [5:22:11<20:24:09, 1.45s/it] 18%|█▊ | 11200/61904 [5:22:12<20:23:54, 1.45s/it] {'loss': 2.8303, 'learning_rate': 1.8217295475171788e-07, 'epoch': 2.89} 18%|█▊ | 11200/61904 [5:22:12<20:23:54, 1.45s/it] 18%|█▊ | 11201/61904 [5:22:13<19:32:00, 1.39s/it] 18%|█▊ | 11202/61904 [5:22:15<19:04:07, 1.35s/it] 18%|█▊ | 11203/61904 [5:22:16<18:32:00, 1.32s/it] 18%|█▊ | 11204/61904 [5:22:17<18:48:08, 1.34s/it] 18%|█▊ | 11205/61904 [5:22:19<18:48:58, 1.34s/it] 18%|█▊ | 11206/61904 [5:22:20<19:45:21, 1.40s/it] 18%|█▊ | 11207/61904 [5:22:22<20:10:23, 1.43s/it] 18%|█▊ | 11208/61904 [5:22:23<20:15:07, 1.44s/it] 18%|█▊ | 11209/61904 [5:22:24<19:31:23, 1.39s/it] 18%|█▊ | 11210/61904 [5:22:26<19:30:28, 1.39s/it] 18%|█▊ | 11211/61904 [5:22:27<19:11:05, 1.36s/it] 18%|█▊ | 11212/61904 [5:22:29<19:25:28, 1.38s/it] 18%|█▊ | 11213/61904 [5:22:30<19:45:09, 1.40s/it] 18%|█▊ | 11214/61904 [5:22:31<19:06:06, 1.36s/it] 18%|█▊ | 11215/61904 [5:22:33<19:42:04, 1.40s/it] 18%|█▊ | 11216/61904 [5:22:34<19:17:59, 1.37s/it] 18%|█▊ | 11217/61904 [5:22:36<20:14:53, 1.44s/it] 18%|█▊ | 11218/61904 [5:22:37<19:39:42, 1.40s/it] 18%|█▊ | 11219/61904 [5:22:38<19:14:13, 1.37s/it] 18%|█▊ | 11220/61904 [5:22:40<19:19:09, 1.37s/it] {'loss': 2.8061, 'learning_rate': 1.8214054194217554e-07, 'epoch': 2.9} 18%|█▊ | 11220/61904 [5:22:40<19:19:09, 1.37s/it] 18%|█▊ | 11221/61904 [5:22:41<19:44:58, 1.40s/it] 18%|█▊ | 11222/61904 [5:22:42<19:30:08, 1.39s/it] 18%|█▊ | 11223/61904 [5:22:44<20:08:53, 1.43s/it] 18%|█▊ | 11224/61904 [5:22:45<19:43:19, 1.40s/it] 18%|█▊ | 11225/61904 [5:22:47<19:49:40, 1.41s/it] 18%|█▊ | 11226/61904 
[5:22:48<19:55:31, 1.42s/it] 18%|█▊ | 11227/61904 [5:22:50<19:39:30, 1.40s/it] 18%|█▊ | 11228/61904 [5:22:51<19:15:58, 1.37s/it] 18%|█▊ | 11229/61904 [5:22:52<19:25:21, 1.38s/it] 18%|█▊ | 11230/61904 [5:22:54<20:11:20, 1.43s/it] 18%|█▊ | 11231/61904 [5:22:55<19:18:27, 1.37s/it] 18%|█▊ | 11232/61904 [5:22:56<18:46:06, 1.33s/it] 18%|█▊ | 11233/61904 [5:22:58<18:31:34, 1.32s/it] 18%|█▊ | 11234/61904 [5:22:59<18:33:06, 1.32s/it] 18%|█▊ | 11235/61904 [5:23:00<18:46:26, 1.33s/it] 18%|█▊ | 11236/61904 [5:23:02<19:19:47, 1.37s/it] 18%|█▊ | 11237/61904 [5:23:03<19:08:10, 1.36s/it] 18%|█▊ | 11238/61904 [5:23:05<19:35:09, 1.39s/it] 18%|█▊ | 11239/61904 [5:23:06<19:23:37, 1.38s/it] 18%|█▊ | 11240/61904 [5:23:07<18:49:20, 1.34s/it] {'loss': 2.853, 'learning_rate': 1.821081291326332e-07, 'epoch': 2.9} 18%|█▊ | 11240/61904 [5:23:07<18:49:20, 1.34s/it] 18%|█▊ | 11241/61904 [5:23:08<18:26:21, 1.31s/it] 18%|█▊ | 11242/61904 [5:23:10<18:22:07, 1.31s/it] 18%|█▊ | 11243/61904 [5:23:11<19:26:31, 1.38s/it] 18%|█▊ | 11244/61904 [5:23:13<19:30:10, 1.39s/it] 18%|█▊ | 11245/61904 [5:23:14<18:50:01, 1.34s/it] 18%|█▊ | 11246/61904 [5:23:15<19:25:16, 1.38s/it] 18%|█▊ | 11247/61904 [5:23:17<19:27:15, 1.38s/it] 18%|█▊ | 11248/61904 [5:23:18<19:38:31, 1.40s/it] 18%|█▊ | 11249/61904 [5:23:20<19:50:21, 1.41s/it] 18%|█▊ | 11250/61904 [5:23:21<19:40:50, 1.40s/it] 18%|█▊ | 11251/61904 [5:23:23<20:37:09, 1.47s/it] 18%|█▊ | 11252/61904 [5:23:24<21:04:32, 1.50s/it] 18%|█▊ | 11253/61904 [5:23:26<21:08:10, 1.50s/it] 18%|█▊ | 11254/61904 [5:23:27<20:19:45, 1.44s/it] 18%|█▊ | 11255/61904 [5:23:28<20:18:11, 1.44s/it] 18%|█▊ | 11256/61904 [5:23:30<19:17:59, 1.37s/it] 18%|█▊ | 11257/61904 [5:23:31<18:46:09, 1.33s/it] 18%|█▊ | 11258/61904 [5:23:32<18:42:46, 1.33s/it] 18%|█▊ | 11259/61904 [5:23:33<18:37:24, 1.32s/it] 18%|█▊ | 11260/61904 [5:23:35<18:54:13, 1.34s/it] {'loss': 2.8419, 'learning_rate': 1.820757163230909e-07, 'epoch': 2.91} 18%|█▊ | 11260/61904 [5:23:35<18:54:13, 1.34s/it] 18%|█▊ | 11261/61904 
[5:23:36<19:11:53, 1.36s/it] 18%|█▊ | 11262/61904 [5:23:38<19:11:39, 1.36s/it] 18%|█▊ | 11263/61904 [5:23:39<19:12:10, 1.37s/it] 18%|█▊ | 11264/61904 [5:23:41<19:47:00, 1.41s/it] 18%|█▊ | 11265/61904 [5:23:42<19:21:26, 1.38s/it] 18%|█▊ | 11266/61904 [5:23:43<19:35:45, 1.39s/it] 18%|█▊ | 11267/61904 [5:23:45<19:35:47, 1.39s/it] 18%|█▊ | 11268/61904 [5:23:46<19:12:17, 1.37s/it] 18%|█▊ | 11269/61904 [5:23:47<18:39:40, 1.33s/it] 18%|█▊ | 11270/61904 [5:23:49<18:45:58, 1.33s/it] 18%|█▊ | 11271/61904 [5:23:50<19:23:57, 1.38s/it] 18%|█▊ | 11272/61904 [5:23:52<20:06:24, 1.43s/it] 18%|█▊ | 11273/61904 [5:23:53<20:38:29, 1.47s/it] 18%|█▊ | 11274/61904 [5:23:55<20:43:45, 1.47s/it] 18%|█▊ | 11275/61904 [5:23:56<20:14:52, 1.44s/it] 18%|█▊ | 11276/61904 [5:23:57<20:10:16, 1.43s/it] 18%|█▊ | 11277/61904 [5:23:59<19:37:28, 1.40s/it] 18%|█▊ | 11278/61904 [5:24:00<19:20:58, 1.38s/it] 18%|█▊ | 11279/61904 [5:24:02<19:44:29, 1.40s/it] 18%|█▊ | 11280/61904 [5:24:03<19:37:04, 1.40s/it] {'loss': 2.8818, 'learning_rate': 1.8204330351354856e-07, 'epoch': 2.92} 18%|█▊ | 11280/61904 [5:24:03<19:37:04, 1.40s/it] 18%|█▊ | 11281/61904 [5:24:04<19:40:14, 1.40s/it] 18%|█▊ | 11282/61904 [5:24:06<19:35:44, 1.39s/it] 18%|█▊ | 11283/61904 [5:24:07<19:50:53, 1.41s/it] 18%|█▊ | 11284/61904 [5:24:08<19:08:37, 1.36s/it] 18%|█▊ | 11285/61904 [5:24:10<19:56:38, 1.42s/it] 18%|█▊ | 11286/61904 [5:24:11<19:40:47, 1.40s/it] 18%|█▊ | 11287/61904 [5:24:13<19:17:38, 1.37s/it] 18%|█▊ | 11288/61904 [5:24:14<19:39:51, 1.40s/it] 18%|█▊ | 11289/61904 [5:24:16<19:58:07, 1.42s/it] 18%|█▊ | 11290/61904 [5:24:17<19:35:59, 1.39s/it] 18%|█▊ | 11291/61904 [5:24:18<20:27:31, 1.46s/it] 18%|█▊ | 11292/61904 [5:24:20<20:39:44, 1.47s/it] 18%|█▊ | 11293/61904 [5:24:21<20:34:59, 1.46s/it] 18%|█▊ | 11294/61904 [5:24:23<20:25:03, 1.45s/it] 18%|█▊ | 11295/61904 [5:24:24<20:07:28, 1.43s/it] 18%|█▊ | 11296/61904 [5:24:26<19:58:26, 1.42s/it] 18%|█▊ | 11297/61904 [5:24:27<20:18:22, 1.44s/it] 18%|█▊ | 11298/61904 [5:24:29<20:04:41, 
1.43s/it] 18%|█▊ | 11299/61904 [5:24:30<19:35:01, 1.39s/it] 18%|█▊ | 11300/61904 [5:24:31<19:54:03, 1.42s/it] {'loss': 2.7992, 'learning_rate': 1.8201089070400622e-07, 'epoch': 2.92} 18%|█▊ | 11300/61904 [5:24:31<19:54:03, 1.42s/it] 18%|█▊ | 11301/61904 [5:24:33<19:39:54, 1.40s/it] 18%|█▊ | 11302/61904 [5:24:34<19:32:52, 1.39s/it] 18%|█▊ | 11303/61904 [5:24:35<19:49:47, 1.41s/it] 18%|█▊ | 11304/61904 [5:24:37<19:46:16, 1.41s/it] 18%|█▊ | 11305/61904 [5:24:38<19:38:53, 1.40s/it] 18%|█▊ | 11306/61904 [5:24:40<20:35:09, 1.46s/it] 18%|█▊ | 11307/61904 [5:24:41<20:13:12, 1.44s/it] 18%|█▊ | 11308/61904 [5:24:43<20:03:59, 1.43s/it] 18%|█▊ | 11309/61904 [5:24:44<19:30:13, 1.39s/it] 18%|█▊ | 11310/61904 [5:24:45<20:00:58, 1.42s/it] 18%|█▊ | 11311/61904 [5:24:47<19:55:41, 1.42s/it] 18%|█▊ | 11312/61904 [5:24:48<19:32:43, 1.39s/it] 18%|█▊ | 11313/61904 [5:24:50<19:16:49, 1.37s/it] 18%|█▊ | 11314/61904 [5:24:51<19:10:40, 1.36s/it] 18%|█▊ | 11315/61904 [5:24:52<19:29:52, 1.39s/it] 18%|█▊ | 11316/61904 [5:24:54<19:40:09, 1.40s/it] 18%|█▊ | 11317/61904 [5:24:55<19:21:51, 1.38s/it] 18%|█▊ | 11318/61904 [5:24:56<19:35:36, 1.39s/it] 18%|█▊ | 11319/61904 [5:24:58<19:43:53, 1.40s/it] 18%|█▊ | 11320/61904 [5:24:59<19:30:26, 1.39s/it] {'loss': 2.8553, 'learning_rate': 1.8197847789446388e-07, 'epoch': 2.93} 18%|█▊ | 11320/61904 [5:24:59<19:30:26, 1.39s/it] 18%|█▊ | 11321/61904 [5:25:01<19:22:06, 1.38s/it] 18%|█▊ | 11322/61904 [5:25:02<19:32:07, 1.39s/it] 18%|█▊ | 11323/61904 [5:25:04<19:57:15, 1.42s/it] 18%|█▊ | 11324/61904 [5:25:05<19:49:26, 1.41s/it] 18%|█▊ | 11325/61904 [5:25:06<20:06:24, 1.43s/it] 18%|█▊ | 11326/61904 [5:25:08<19:12:38, 1.37s/it] 18%|█▊ | 11327/61904 [5:25:09<19:26:51, 1.38s/it] 18%|█▊ | 11328/61904 [5:25:10<19:20:05, 1.38s/it] 18%|█▊ | 11329/61904 [5:25:12<19:25:51, 1.38s/it] 18%|█▊ | 11330/61904 [5:25:13<19:44:44, 1.41s/it] 18%|█▊ | 11331/61904 [5:25:15<19:05:06, 1.36s/it] 18%|█▊ | 11332/61904 [5:25:16<19:31:06, 1.39s/it] 18%|█▊ | 11333/61904 [5:25:17<19:19:38, 
1.38s/it] 18%|█▊ | 11334/61904 [5:25:19<19:15:43, 1.37s/it] 18%|█▊ | 11335/61904 [5:25:20<18:55:34, 1.35s/it] 18%|█▊ | 11336/61904 [5:25:21<19:23:16, 1.38s/it] 18%|█▊ | 11337/61904 [5:25:23<20:06:48, 1.43s/it] 18%|█▊ | 11338/61904 [5:25:24<19:51:15, 1.41s/it] 18%|█▊ | 11339/61904 [5:25:26<20:10:05, 1.44s/it] 18%|█▊ | 11340/61904 [5:25:27<19:48:47, 1.41s/it] {'loss': 2.7901, 'learning_rate': 1.8194606508492154e-07, 'epoch': 2.93} 18%|█▊ | 11340/61904 [5:25:27<19:48:47, 1.41s/it] 18%|█▊ | 11341/61904 [5:25:29<19:32:06, 1.39s/it] 18%|█▊ | 11342/61904 [5:25:30<19:12:02, 1.37s/it] 18%|█▊ | 11343/61904 [5:25:31<19:03:05, 1.36s/it] 18%|█▊ | 11344/61904 [5:25:32<18:44:05, 1.33s/it] 18%|█▊ | 11345/61904 [5:25:34<18:46:14, 1.34s/it] 18%|█▊ | 11346/61904 [5:25:35<19:16:53, 1.37s/it] 18%|█▊ | 11347/61904 [5:25:37<19:03:11, 1.36s/it] 18%|█▊ | 11348/61904 [5:25:38<19:07:12, 1.36s/it] 18%|█▊ | 11349/61904 [5:25:39<18:46:47, 1.34s/it] 18%|█▊ | 11350/61904 [5:25:40<18:20:58, 1.31s/it] 18%|█▊ | 11351/61904 [5:25:42<18:17:26, 1.30s/it] 18%|█▊ | 11352/61904 [5:25:43<18:47:35, 1.34s/it] 18%|█▊ | 11353/61904 [5:25:45<19:06:26, 1.36s/it] 18%|█▊ | 11354/61904 [5:25:46<18:57:48, 1.35s/it] 18%|█▊ | 11355/61904 [5:25:47<19:03:18, 1.36s/it] 18%|█▊ | 11356/61904 [5:25:49<18:53:19, 1.35s/it] 18%|█▊ | 11357/61904 [5:25:50<19:19:59, 1.38s/it] 18%|█▊ | 11358/61904 [5:25:52<20:02:15, 1.43s/it] 18%|█▊ | 11359/61904 [5:25:53<19:46:59, 1.41s/it] 18%|█▊ | 11360/61904 [5:25:54<20:04:14, 1.43s/it] {'loss': 2.7795, 'learning_rate': 1.8191365227537923e-07, 'epoch': 2.94} 18%|█▊ | 11360/61904 [5:25:54<20:04:14, 1.43s/it] 18%|█▊ | 11361/61904 [5:25:56<19:59:28, 1.42s/it] 18%|█▊ | 11362/61904 [5:25:57<20:00:39, 1.43s/it] 18%|█▊ | 11363/61904 [5:25:59<20:35:31, 1.47s/it] 18%|█▊ | 11364/61904 [5:26:00<19:38:06, 1.40s/it] 18%|█▊ | 11365/61904 [5:26:01<19:26:09, 1.38s/it] 18%|█▊ | 11366/61904 [5:26:03<19:35:32, 1.40s/it] 18%|█▊ | 11367/61904 [5:26:04<19:07:27, 1.36s/it] 18%|█▊ | 11368/61904 [5:26:06<19:25:45, 
1.38s/it] 18%|█▊ | 11369/61904 [5:26:07<19:55:58, 1.42s/it] 18%|█▊ | 11370/61904 [5:26:09<20:22:29, 1.45s/it] 18%|█▊ | 11371/61904 [5:26:10<19:57:58, 1.42s/it] 18%|█▊ | 11372/61904 [5:26:11<19:48:00, 1.41s/it] 18%|█▊ | 11373/61904 [5:26:13<19:33:44, 1.39s/it] 18%|█▊ | 11374/61904 [5:26:14<19:35:45, 1.40s/it] 18%|█▊ | 11375/61904 [5:26:15<19:16:45, 1.37s/it] 18%|█▊ | 11376/61904 [5:26:17<19:31:38, 1.39s/it] 18%|█▊ | 11377/61904 [5:26:18<19:19:04, 1.38s/it] 18%|█▊ | 11378/61904 [5:26:20<19:39:33, 1.40s/it] 18%|█▊ | 11379/61904 [5:26:21<19:46:31, 1.41s/it] 18%|█▊ | 11380/61904 [5:26:22<19:44:20, 1.41s/it] {'loss': 2.7977, 'learning_rate': 1.818812394658369e-07, 'epoch': 2.94} 18%|█▊ | 11380/61904 [5:26:22<19:44:20, 1.41s/it] 18%|█▊ | 11381/61904 [5:26:24<19:45:56, 1.41s/it] 18%|█▊ | 11382/61904 [5:26:25<19:35:45, 1.40s/it] 18%|█▊ | 11383/61904 [5:26:27<20:00:12, 1.43s/it] 18%|█▊ | 11384/61904 [5:26:28<19:54:24, 1.42s/it] 18%|█▊ | 11385/61904 [5:26:29<19:28:25, 1.39s/it] 18%|█▊ | 11386/61904 [5:26:31<19:38:53, 1.40s/it] 18%|█▊ | 11387/61904 [5:26:32<19:32:21, 1.39s/it] 18%|█▊ | 11388/61904 [5:26:34<19:37:57, 1.40s/it] 18%|█▊ | 11389/61904 [5:26:35<19:48:10, 1.41s/it] 18%|█▊ | 11390/61904 [5:26:37<19:52:41, 1.42s/it] 18%|█▊ | 11391/61904 [5:26:38<19:54:09, 1.42s/it] 18%|█▊ | 11392/61904 [5:26:40<20:22:33, 1.45s/it] 18%|█▊ | 11393/61904 [5:26:41<20:34:22, 1.47s/it] 18%|█▊ | 11394/61904 [5:26:42<19:54:52, 1.42s/it] 18%|█▊ | 11395/61904 [5:26:44<19:35:02, 1.40s/it] 18%|█▊ | 11396/61904 [5:26:45<19:36:58, 1.40s/it] 18%|█▊ | 11397/61904 [5:26:46<19:26:29, 1.39s/it] 18%|█▊ | 11398/61904 [5:26:48<19:30:31, 1.39s/it] 18%|█▊ | 11399/61904 [5:26:49<18:56:29, 1.35s/it] 18%|█▊ | 11400/61904 [5:26:50<19:05:53, 1.36s/it] {'loss': 2.7824, 'learning_rate': 1.8184882665629455e-07, 'epoch': 2.95} 18%|█▊ | 11400/61904 [5:26:50<19:05:53, 1.36s/it] 18%|█▊ | 11401/61904 [5:26:52<19:15:30, 1.37s/it] 18%|█▊ | 11402/61904 [5:26:53<19:19:38, 1.38s/it] 18%|█▊ | 11403/61904 [5:26:55<20:00:35, 
1.43s/it] 18%|█▊ | 11404/61904 [5:26:56<20:03:44, 1.43s/it] 18%|█▊ | 11405/61904 [5:26:58<19:40:24, 1.40s/it] 18%|█▊ | 11406/61904 [5:26:59<19:27:25, 1.39s/it] 18%|█▊ | 11407/61904 [5:27:00<19:32:56, 1.39s/it] 18%|█▊ | 11408/61904 [5:27:02<19:28:20, 1.39s/it] 18%|█▊ | 11409/61904 [5:27:03<19:41:42, 1.40s/it] 18%|█▊ | 11410/61904 [5:27:04<18:46:29, 1.34s/it] 18%|█▊ | 11411/61904 [5:27:06<18:41:09, 1.33s/it] 18%|█▊ | 11412/61904 [5:27:07<19:04:00, 1.36s/it] 18%|█▊ | 11413/61904 [5:27:09<19:24:10, 1.38s/it] 18%|█▊ | 11414/61904 [5:27:10<19:33:26, 1.39s/it] 18%|█▊ | 11415/61904 [5:27:11<19:34:05, 1.40s/it] 18%|█▊ | 11416/61904 [5:27:13<19:22:02, 1.38s/it] 18%|█▊ | 11417/61904 [5:27:14<18:44:50, 1.34s/it] 18%|█▊ | 11418/61904 [5:27:15<18:50:09, 1.34s/it] 18%|█▊ | 11419/61904 [5:27:17<20:02:44, 1.43s/it] 18%|█▊ | 11420/61904 [5:27:18<19:33:21, 1.39s/it] {'loss': 2.7671, 'learning_rate': 1.8181641384675224e-07, 'epoch': 2.95} 18%|█▊ | 11420/61904 [5:27:18<19:33:21, 1.39s/it] 18%|█▊ | 11421/61904 [5:27:20<19:10:18, 1.37s/it] 18%|█▊ | 11422/61904 [5:27:21<18:57:02, 1.35s/it] 18%|█▊ | 11423/61904 [5:27:22<19:13:15, 1.37s/it] 18%|█▊ | 11424/61904 [5:27:24<19:46:48, 1.41s/it] 18%|█▊ | 11425/61904 [5:27:25<19:19:56, 1.38s/it] 18%|█▊ | 11426/61904 [5:27:26<18:47:20, 1.34s/it] 18%|█▊ | 11427/61904 [5:27:28<18:55:58, 1.35s/it] 18%|█▊ | 11428/61904 [5:27:29<19:35:06, 1.40s/it] 18%|█▊ | 11429/61904 [5:27:31<19:36:00, 1.40s/it] 18%|█▊ | 11430/61904 [5:27:32<19:46:52, 1.41s/it] 18%|█▊ | 11431/61904 [5:27:33<19:07:25, 1.36s/it] 18%|█▊ | 11432/61904 [5:27:35<18:57:07, 1.35s/it] 18%|█▊ | 11433/61904 [5:27:36<19:25:29, 1.39s/it] 18%|█▊ | 11434/61904 [5:27:37<19:08:33, 1.37s/it] 18%|█▊ | 11435/61904 [5:27:39<18:58:46, 1.35s/it] 18%|█▊ | 11436/61904 [5:27:40<18:57:55, 1.35s/it] 18%|█▊ | 11437/61904 [5:27:41<18:59:02, 1.35s/it] 18%|█▊ | 11438/61904 [5:27:43<19:04:56, 1.36s/it] 18%|█▊ | 11439/61904 [5:27:44<20:09:59, 1.44s/it] 18%|█▊ | 11440/61904 [5:27:46<20:02:05, 1.43s/it] {'loss': 2.8553, 
'learning_rate': 1.817840010372099e-07, 'epoch': 2.96} 18%|█▊ | 11440/61904 [5:27:46<20:02:05, 1.43s/it] 18%|█▊ | 11441/61904 [5:27:47<20:22:59, 1.45s/it] 18%|█▊ | 11442/61904 [5:27:49<19:37:48, 1.40s/it] 18%|█▊ | 11443/61904 [5:27:50<19:20:34, 1.38s/it] 18%|█▊ | 11444/61904 [5:27:51<19:46:39, 1.41s/it] 18%|█▊ | 11445/61904 [5:27:53<19:47:10, 1.41s/it] 18%|█▊ | 11446/61904 [5:27:54<20:15:43, 1.45s/it] 18%|█▊ | 11447/61904 [5:27:56<19:56:56, 1.42s/it] 18%|█▊ | 11448/61904 [5:27:57<19:44:40, 1.41s/it] 18%|█▊ | 11449/61904 [5:27:58<18:54:42, 1.35s/it] 18%|█▊ | 11450/61904 [5:28:00<19:41:29, 1.41s/it] 18%|█▊ | 11451/61904 [5:28:01<19:17:15, 1.38s/it] 18%|█▊ | 11452/61904 [5:28:03<19:23:17, 1.38s/it] 19%|█▊ | 11453/61904 [5:28:04<19:35:39, 1.40s/it] 19%|█▊ | 11454/61904 [5:28:06<19:59:33, 1.43s/it] 19%|█▊ | 11455/61904 [5:28:07<19:29:01, 1.39s/it] 19%|█▊ | 11456/61904 [5:28:08<19:03:34, 1.36s/it] 19%|█▊ | 11457/61904 [5:28:09<18:30:51, 1.32s/it] 19%|█▊ | 11458/61904 [5:28:11<18:39:12, 1.33s/it] 19%|█▊ | 11459/61904 [5:28:12<18:48:07, 1.34s/it] 19%|█▊ | 11460/61904 [5:28:14<19:38:48, 1.40s/it] {'loss': 2.7253, 'learning_rate': 1.8175158822766757e-07, 'epoch': 2.96} 19%|█▊ | 11460/61904 [5:28:14<19:38:48, 1.40s/it] 19%|█▊ | 11461/61904 [5:28:15<19:05:58, 1.36s/it] 19%|█▊ | 11462/61904 [5:28:16<19:01:07, 1.36s/it] 19%|█▊ | 11463/61904 [5:28:17<18:35:45, 1.33s/it] 19%|█▊ | 11464/61904 [5:28:19<18:32:28, 1.32s/it] 19%|█▊ | 11465/61904 [5:28:20<18:23:50, 1.31s/it] 19%|█▊ | 11466/61904 [5:28:21<18:49:43, 1.34s/it] 19%|█▊ | 11467/61904 [5:28:23<18:45:20, 1.34s/it] 19%|█▊ | 11468/61904 [5:28:24<19:00:06, 1.36s/it] 19%|█▊ | 11469/61904 [5:28:26<19:01:49, 1.36s/it] 19%|█▊ | 11470/61904 [5:28:27<19:19:12, 1.38s/it] 19%|█▊ | 11471/61904 [5:28:28<18:53:57, 1.35s/it] 19%|█▊ | 11472/61904 [5:28:30<19:28:42, 1.39s/it] 19%|█▊ | 11473/61904 [5:28:31<19:13:51, 1.37s/it] 19%|█▊ | 11474/61904 [5:28:33<19:30:36, 1.39s/it] 19%|█▊ | 11475/61904 [5:28:34<19:18:39, 1.38s/it] 19%|█▊ | 11476/61904 
[5:28:35<20:01:46, 1.43s/it] 19%|█▊ | 11477/61904 [5:28:37<19:37:02, 1.40s/it] 19%|█▊ | 11478/61904 [5:28:38<18:45:35, 1.34s/it] 19%|█▊ | 11479/61904 [5:28:39<18:58:21, 1.35s/it] 19%|█▊ | 11480/61904 [5:28:41<18:58:13, 1.35s/it] {'loss': 2.8338, 'learning_rate': 1.8171917541812525e-07, 'epoch': 2.97} 19%|█▊ | 11480/61904 [5:28:41<18:58:13, 1.35s/it] 19%|█▊ | 11481/61904 [5:28:42<19:08:13, 1.37s/it] 19%|█▊ | 11482/61904 [5:28:43<18:57:01, 1.35s/it] 19%|█▊ | 11483/61904 [5:28:45<19:05:33, 1.36s/it] 19%|█▊ | 11484/61904 [5:28:46<18:44:07, 1.34s/it] 19%|█▊ | 11485/61904 [5:28:47<18:26:35, 1.32s/it] 19%|█▊ | 11486/61904 [5:28:49<18:36:40, 1.33s/it] 19%|█▊ | 11487/61904 [5:28:50<18:26:11, 1.32s/it] 19%|█▊ | 11488/61904 [5:28:51<18:39:38, 1.33s/it] 19%|█▊ | 11489/61904 [5:28:53<18:57:02, 1.35s/it] 19%|█▊ | 11490/61904 [5:28:54<18:35:46, 1.33s/it] 19%|█▊ | 11491/61904 [5:28:55<18:43:33, 1.34s/it] 19%|█▊ | 11492/61904 [5:28:57<18:55:36, 1.35s/it] 19%|█▊ | 11493/61904 [5:28:58<18:46:17, 1.34s/it] 19%|█▊ | 11494/61904 [5:29:00<19:03:32, 1.36s/it] 19%|█▊ | 11495/61904 [5:29:01<18:38:37, 1.33s/it] 19%|█▊ | 11496/61904 [5:29:02<18:55:56, 1.35s/it] 19%|█▊ | 11497/61904 [5:29:03<18:08:49, 1.30s/it] 19%|█▊ | 11498/61904 [5:29:05<19:21:11, 1.38s/it] 19%|█▊ | 11499/61904 [5:29:06<19:31:01, 1.39s/it] 19%|█▊ | 11500/61904 [5:29:08<19:56:06, 1.42s/it] {'loss': 2.7843, 'learning_rate': 1.816867626085829e-07, 'epoch': 2.97} 19%|█▊ | 11500/61904 [5:29:08<19:56:06, 1.42s/it] 19%|█▊ | 11501/61904 [5:29:09<19:51:25, 1.42s/it] 19%|█▊ | 11502/61904 [5:29:11<19:41:03, 1.41s/it] 19%|█▊ | 11503/61904 [5:29:12<20:18:35, 1.45s/it] 19%|█▊ | 11504/61904 [5:29:14<19:49:26, 1.42s/it] 19%|█▊ | 11505/61904 [5:29:15<19:11:33, 1.37s/it] 19%|█▊ | 11506/61904 [5:29:16<19:16:20, 1.38s/it] 19%|█▊ | 11507/61904 [5:29:18<19:14:38, 1.37s/it] 19%|█▊ | 11508/61904 [5:29:19<18:49:02, 1.34s/it] 19%|█▊ | 11509/61904 [5:29:20<19:24:25, 1.39s/it] 19%|█▊ | 11510/61904 [5:29:22<18:46:29, 1.34s/it] 19%|█▊ | 11511/61904 
[5:29:23<18:57:36, 1.35s/it] 19%|█▊ | 11512/61904 [5:29:24<18:42:40, 1.34s/it] 19%|█▊ | 11513/61904 [5:29:26<19:44:54, 1.41s/it] 19%|█▊ | 11514/61904 [5:29:27<19:27:42, 1.39s/it] 19%|█▊ | 11515/61904 [5:29:29<19:23:00, 1.38s/it] 19%|█▊ | 11516/61904 [5:29:30<19:19:12, 1.38s/it] 19%|█▊ | 11517/61904 [5:29:31<18:26:03, 1.32s/it] 19%|█▊ | 11518/61904 [5:29:32<18:23:35, 1.31s/it] 19%|█▊ | 11519/61904 [5:29:34<18:42:16, 1.34s/it] 19%|█▊ | 11520/61904 [5:29:35<18:49:49, 1.35s/it] {'loss': 2.8181, 'learning_rate': 1.8165434979904058e-07, 'epoch': 2.98} 19%|█▊ | 11520/61904 [5:29:35<18:49:49, 1.35s/it] 19%|█▊ | 11521/61904 [5:29:37<19:07:12, 1.37s/it] 19%|█▊ | 11522/61904 [5:29:38<19:25:31, 1.39s/it] 19%|█▊ | 11523/61904 [5:29:39<19:01:35, 1.36s/it] 19%|█▊ | 11524/61904 [5:29:40<18:29:12, 1.32s/it] 19%|█▊ | 11525/61904 [5:29:42<19:20:38, 1.38s/it] 19%|█▊ | 11526/61904 [5:29:43<19:01:36, 1.36s/it] 19%|█▊ | 11527/61904 [5:29:45<19:39:32, 1.40s/it] 19%|█▊ | 11528/61904 [5:29:46<19:38:15, 1.40s/it] 19%|█▊ | 11529/61904 [5:29:48<19:50:35, 1.42s/it] 19%|█▊ | 11530/61904 [5:29:49<19:15:23, 1.38s/it] 19%|█▊ | 11531/61904 [5:29:50<18:54:28, 1.35s/it] 19%|█▊ | 11532/61904 [5:29:52<19:09:57, 1.37s/it] 19%|█▊ | 11533/61904 [5:29:53<19:09:38, 1.37s/it] 19%|█▊ | 11534/61904 [5:29:54<19:11:20, 1.37s/it] 19%|█▊ | 11535/61904 [5:29:56<18:42:55, 1.34s/it] 19%|█▊ | 11536/61904 [5:29:57<19:21:56, 1.38s/it] 19%|█▊ | 11537/61904 [5:29:59<19:36:38, 1.40s/it] 19%|█▊ | 11538/61904 [5:30:00<19:42:05, 1.41s/it] 19%|█▊ | 11539/61904 [5:30:01<19:31:29, 1.40s/it] 19%|█▊ | 11540/61904 [5:30:03<19:16:06, 1.38s/it] {'loss': 2.7918, 'learning_rate': 1.8162193698949824e-07, 'epoch': 2.98} 19%|█▊ | 11540/61904 [5:30:03<19:16:06, 1.38s/it] 19%|█▊ | 11541/61904 [5:30:04<19:07:45, 1.37s/it] 19%|█▊ | 11542/61904 [5:30:06<19:59:29, 1.43s/it] 19%|█▊ | 11543/61904 [5:30:07<19:44:25, 1.41s/it] 19%|█▊ | 11544/61904 [5:30:08<19:12:44, 1.37s/it] 19%|█▊ | 11545/61904 [5:30:10<18:49:06, 1.35s/it] 19%|█▊ | 11546/61904 
[5:30:11<18:34:09, 1.33s/it] 19%|█▊ | 11547/61904 [5:30:12<18:40:43, 1.34s/it] 19%|█▊ | 11548/61904 [5:30:14<19:06:24, 1.37s/it] 19%|█▊ | 11549/61904 [5:30:15<18:57:15, 1.36s/it] 19%|█▊ | 11550/61904 [5:30:16<19:14:27, 1.38s/it] 19%|█▊ | 11551/61904 [5:30:18<19:19:56, 1.38s/it] 19%|█▊ | 11552/61904 [5:30:19<19:14:51, 1.38s/it] 19%|█▊ | 11553/61904 [5:30:21<19:58:11, 1.43s/it] 19%|█▊ | 11554/61904 [5:30:22<18:51:14, 1.35s/it] 19%|█▊ | 11555/61904 [5:30:23<18:53:18, 1.35s/it] 19%|█▊ | 11556/61904 [5:30:24<18:20:27, 1.31s/it] 19%|█▊ | 11557/61904 [5:30:26<18:18:46, 1.31s/it] 19%|█▊ | 11558/61904 [5:30:27<18:30:29, 1.32s/it] 19%|█▊ | 11559/61904 [5:30:29<19:05:18, 1.36s/it] 19%|█▊ | 11560/61904 [5:30:30<18:49:57, 1.35s/it] {'loss': 2.8051, 'learning_rate': 1.815895241799559e-07, 'epoch': 2.99} 19%|█▊ | 11560/61904 [5:30:30<18:49:57, 1.35s/it] 19%|█▊ | 11561/61904 [5:30:31<19:06:56, 1.37s/it] 19%|█▊ | 11562/61904 [5:30:33<19:17:42, 1.38s/it] 19%|█▊ | 11563/61904 [5:30:34<19:34:08, 1.40s/it] 19%|█▊ | 11564/61904 [5:30:36<20:11:21, 1.44s/it] 19%|█▊ | 11565/61904 [5:30:37<21:04:19, 1.51s/it] 19%|█▊ | 11566/61904 [5:30:39<20:11:10, 1.44s/it] 19%|█▊ | 11567/61904 [5:30:40<19:50:05, 1.42s/it] 19%|█▊ | 11568/61904 [5:30:41<19:13:13, 1.37s/it] 19%|█▊ | 11569/61904 [5:30:43<19:40:07, 1.41s/it] 19%|█▊ | 11570/61904 [5:30:44<19:08:28, 1.37s/it] 19%|█▊ | 11571/61904 [5:30:45<18:51:07, 1.35s/it] 19%|█▊ | 11572/61904 [5:30:47<18:42:00, 1.34s/it] 19%|█▊ | 11573/61904 [5:30:48<18:16:24, 1.31s/it] 19%|█▊ | 11574/61904 [5:30:49<18:29:35, 1.32s/it] 19%|█▊ | 11575/61904 [5:30:51<18:39:35, 1.33s/it] 19%|█▊ | 11576/61904 [5:30:52<18:56:09, 1.35s/it] 19%|█▊ | 11577/61904 [5:30:53<18:25:58, 1.32s/it] 19%|█▊ | 11578/61904 [5:30:55<18:28:08, 1.32s/it] 19%|█▊ | 11579/61904 [5:30:56<18:29:51, 1.32s/it] 19%|█▊ | 11580/61904 [5:30:57<18:31:42, 1.33s/it] {'loss': 2.8276, 'learning_rate': 1.815571113704136e-07, 'epoch': 2.99} 19%|█▊ | 11580/61904 [5:30:57<18:31:42, 1.33s/it] 19%|█▊ | 11581/61904 
[5:30:59<18:39:59, 1.34s/it] 19%|█▊ | 11582/61904 [5:31:00<19:21:04, 1.38s/it] 19%|█▊ | 11583/61904 [5:31:02<21:30:21, 1.54s/it] 19%|█▊ | 11584/61904 [5:31:03<20:38:07, 1.48s/it] 19%|█▊ | 11585/61904 [5:31:05<20:54:35, 1.50s/it] 19%|█▊ | 11586/61904 [5:31:06<19:56:48, 1.43s/it] 19%|█▊ | 11587/61904 [5:31:07<19:14:43, 1.38s/it] 19%|█▊ | 11588/61904 [5:31:09<18:52:41, 1.35s/it] 19%|█▊ | 11589/61904 [5:31:10<18:26:49, 1.32s/it] 19%|█▊ | 11590/61904 [5:31:11<18:33:11, 1.33s/it] 19%|█▊ | 11591/61904 [5:31:13<18:34:34, 1.33s/it] 19%|█▊ | 11592/61904 [5:31:14<20:02:51, 1.43s/it] 19%|█▊ | 11593/61904 [5:31:16<19:50:18, 1.42s/it] 19%|█▊ | 11594/61904 [5:31:17<20:07:18, 1.44s/it] 19%|█▊ | 11595/61904 [5:31:18<19:27:35, 1.39s/it] 19%|█▊ | 11596/61904 [5:31:20<18:59:37, 1.36s/it] 19%|█▊ | 11597/61904 [5:31:21<19:40:46, 1.41s/it] 19%|█▊ | 11598/61904 [5:31:22<19:02:26, 1.36s/it] 19%|█▊ | 11599/61904 [5:31:24<18:35:39, 1.33s/it] 19%|█▊ | 11600/61904 [5:31:25<18:42:04, 1.34s/it] {'loss': 2.8109, 'learning_rate': 1.8152469856087125e-07, 'epoch': 3.0} 19%|█▊ | 11600/61904 [5:31:25<18:42:04, 1.34s/it] 19%|█▊ | 11601/61904 [5:31:26<18:38:19, 1.33s/it] 19%|█▊ | 11602/61904 [5:31:28<18:34:03, 1.33s/it] 19%|█▊ | 11603/61904 [5:31:29<18:44:09, 1.34s/it] 19%|█▊ | 11604/61904 [5:31:30<18:28:21, 1.32s/it] 19%|█▊ | 11605/61904 [5:31:32<18:38:11, 1.33s/it] 19%|█▊ | 11606/61904 [5:31:33<18:25:34, 1.32s/it] 19%|█▉ | 11607/61904 [5:31:34<18:43:32, 1.34s/it] 19%|█▉ | 11608/61904 [5:31:36<18:28:34, 1.32s/it]Generation Kwargs: {'max_length': 384, 'max_gen_length': 380, 'num_beams': 5} 0%| | 0/861 [00:00> Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41. 
Non-default generation parameters: {'max_length': 200, 'early_stopping': True, 'num_beams': 5, 'forced_eos_token_id': 2} /opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock. self.pid = os.fork() 19%|█▉ | 11609/61904 [6:04:18<8237:10:56, 589.60s/it] 19%|█▉ | 11610/61904 [6:04:20<5773:02:43, 413.23s/it] 19%|█▉ | 11611/61904 [6:04:21<4046:58:49, 289.69s/it] 19%|█▉ | 11612/61904 [6:04:22<2838:21:34, 203.18s/it] 19%|█▉ | 11613/61904 [6:04:24<1992:53:35, 142.66s/it] 19%|█▉ | 11614/61904 [6:04:25<1401:03:08, 100.29s/it] 19%|█▉ | 11615/61904 [6:04:27<986:15:33, 70.60s/it] 19%|█▉ | 11616/61904 [6:04:28<696:06:03, 49.83s/it] 19%|█▉ | 11617/61904 [6:04:29<492:53:12, 35.29s/it] 19%|█▉ | 11618/61904 [6:04:31<350:58:48, 25.13s/it] 19%|█▉ | 11619/61904 [6:04:32<251:37:02, 18.01s/it] 19%|█▉ | 11620/61904 [6:04:33<181:18:10, 12.98s/it] {'loss': 2.8225, 'learning_rate': 1.8149228575132891e-07, 'epoch': 3.0} 19%|█▉ | 11620/61904 [6:04:33<181:18:10, 12.98s/it] 19%|█▉ | 11621/61904 [6:04:35<133:00:31, 9.52s/it] 19%|█▉ | 11622/61904 [6:04:37<100:07:02, 7.17s/it] 19%|█▉ | 11623/61904 [6:04:38<75:58:30, 5.44s/it] 19%|█▉ | 11624/61904 [6:04:39<58:47:41, 4.21s/it] 19%|█▉ | 11625/61904 [6:04:41<46:56:52, 3.36s/it] 19%|█▉ | 11626/61904 [6:04:42<38:49:45, 2.78s/it] 19%|█▉ | 11627/61904 [6:04:44<33:38:10, 2.41s/it] 19%|█▉ | 11628/61904 [6:04:45<28:46:00, 2.06s/it] 19%|█▉ | 11629/61904 [6:04:46<25:49:53, 1.85s/it] 19%|█▉ | 11630/61904 [6:04:48<23:55:02, 1.71s/it] 19%|█▉ | 11631/61904 [6:04:49<22:23:59, 1.60s/it] 19%|█▉ | 11632/61904 [6:04:50<21:46:45, 1.56s/it] 19%|█▉ | 11633/61904 [6:04:52<21:23:28, 1.53s/it] 19%|█▉ | 11634/61904 [6:04:53<20:32:33, 1.47s/it] 19%|█▉ | 11635/61904 [6:04:55<19:51:28, 1.42s/it] 19%|█▉ | 11636/61904 [6:04:56<19:26:14, 1.39s/it] 19%|█▉ | 11637/61904 [6:04:57<19:18:33, 1.38s/it] 19%|█▉ | 11638/61904 
[6:04:59<19:11:37, 1.37s/it] 19%|█▉ | 11639/61904 [6:05:00<19:44:00, 1.41s/it] 19%|█▉ | 11640/61904 [6:05:01<19:19:00, 1.38s/it] {'loss': 2.7593, 'learning_rate': 1.814598729417866e-07, 'epoch': 3.01} 19%|█▉ | 11640/61904 [6:05:01<19:19:00, 1.38s/it] 19%|█▉ | 11641/61904 [6:05:03<19:14:00, 1.38s/it] 19%|█▉ | 11642/61904 [6:05:04<19:33:04, 1.40s/it] 19%|█▉ | 11643/61904 [6:05:06<19:11:00, 1.37s/it] 19%|█▉ | 11644/61904 [6:05:07<19:01:33, 1.36s/it] 19%|█▉ | 11645/61904 [6:05:08<19:07:31, 1.37s/it] 19%|█▉ | 11646/61904 [6:05:10<19:33:26, 1.40s/it] 19%|█▉ | 11647/61904 [6:05:11<20:44:42, 1.49s/it] 19%|█▉ | 11648/61904 [6:05:13<20:11:52, 1.45s/it] 19%|█▉ | 11649/61904 [6:05:14<19:50:24, 1.42s/it] 19%|█▉ | 11650/61904 [6:05:15<19:39:59, 1.41s/it] 19%|█▉ | 11651/61904 [6:05:17<20:31:24, 1.47s/it] 19%|█▉ | 11652/61904 [6:05:18<19:58:07, 1.43s/it] 19%|█▉ | 11653/61904 [6:05:20<19:24:34, 1.39s/it] 19%|█▉ | 11654/61904 [6:05:21<19:04:24, 1.37s/it] 19%|█▉ | 11655/61904 [6:05:22<19:23:49, 1.39s/it] 19%|█▉ | 11656/61904 [6:05:24<19:19:55, 1.39s/it] 19%|█▉ | 11657/61904 [6:05:25<19:08:37, 1.37s/it] 19%|█▉ | 11658/61904 [6:05:26<18:44:36, 1.34s/it] 19%|█▉ | 11659/61904 [6:05:28<19:09:11, 1.37s/it] 19%|█▉ | 11660/61904 [6:05:29<19:15:03, 1.38s/it] {'loss': 2.7975, 'learning_rate': 1.8142746013224426e-07, 'epoch': 3.01} 19%|█▉ | 11660/61904 [6:05:29<19:15:03, 1.38s/it] 19%|█▉ | 11661/61904 [6:05:31<19:04:11, 1.37s/it] 19%|█▉ | 11662/61904 [6:05:32<18:40:39, 1.34s/it] 19%|█▉ | 11663/61904 [6:05:33<18:20:12, 1.31s/it] 19%|█▉ | 11664/61904 [6:05:35<18:59:26, 1.36s/it] 19%|█▉ | 11665/61904 [6:05:36<19:24:12, 1.39s/it] 19%|█▉ | 11666/61904 [6:05:38<19:30:23, 1.40s/it] 19%|█▉ | 11667/61904 [6:05:39<19:12:15, 1.38s/it] 19%|█▉ | 11668/61904 [6:05:40<19:24:50, 1.39s/it] 19%|█▉ | 11669/61904 [6:05:42<19:32:27, 1.40s/it] 19%|█▉ | 11670/61904 [6:05:43<19:39:09, 1.41s/it] 19%|█▉ | 11671/61904 [6:05:44<19:23:27, 1.39s/it] 19%|█▉ | 11672/61904 [6:05:46<19:56:37, 1.43s/it] 19%|█▉ | 11673/61904 
[6:05:47<19:31:20, 1.40s/it] 19%|█▉ | 11674/61904 [6:05:49<19:24:36, 1.39s/it] 19%|█▉ | 11675/61904 [6:05:50<19:10:49, 1.37s/it] 19%|█▉ | 11676/61904 [6:05:51<19:31:36, 1.40s/it] 19%|█▉ | 11677/61904 [6:05:53<20:40:49, 1.48s/it] 19%|█▉ | 11678/61904 [6:05:55<20:11:52, 1.45s/it] 19%|█▉ | 11679/61904 [6:05:56<20:09:32, 1.44s/it] 19%|█▉ | 11680/61904 [6:05:57<19:59:43, 1.43s/it] {'loss': 2.8483, 'learning_rate': 1.8139504732270193e-07, 'epoch': 3.02} 19%|█▉ | 11680/61904 [6:05:57<19:59:43, 1.43s/it] 19%|█▉ | 11681/61904 [6:05:59<19:47:25, 1.42s/it] 19%|█▉ | 11682/61904 [6:06:00<19:30:21, 1.40s/it] 19%|█▉ | 11683/61904 [6:06:01<19:04:27, 1.37s/it] 19%|█▉ | 11684/61904 [6:06:03<19:11:13, 1.38s/it] 19%|█▉ | 11685/61904 [6:06:04<19:10:27, 1.37s/it] 19%|█▉ | 11686/61904 [6:06:05<18:48:27, 1.35s/it] 19%|█▉ | 11687/61904 [6:06:07<20:10:59, 1.45s/it] 19%|█▉ | 11688/61904 [6:06:08<19:46:06, 1.42s/it] 19%|█▉ | 11689/61904 [6:06:10<20:03:15, 1.44s/it] 19%|█▉ | 11690/61904 [6:06:11<20:11:44, 1.45s/it] 19%|█▉ | 11691/61904 [6:06:13<19:34:06, 1.40s/it] 19%|█▉ | 11692/61904 [6:06:14<19:23:14, 1.39s/it] 19%|█▉ | 11693/61904 [6:06:15<19:22:12, 1.39s/it] 19%|█▉ | 11694/61904 [6:06:17<19:57:44, 1.43s/it] 19%|█▉ | 11695/61904 [6:06:18<19:42:06, 1.41s/it] 19%|█▉ | 11696/61904 [6:06:20<19:21:36, 1.39s/it] 19%|█▉ | 11697/61904 [6:06:21<18:56:29, 1.36s/it] 19%|█▉ | 11698/61904 [6:06:22<18:46:29, 1.35s/it] 19%|█▉ | 11699/61904 [6:06:24<19:02:41, 1.37s/it] 19%|█▉ | 11700/61904 [6:06:25<19:17:33, 1.38s/it] {'loss': 2.8307, 'learning_rate': 1.8136263451315961e-07, 'epoch': 3.02} 19%|█▉ | 11700/61904 [6:06:25<19:17:33, 1.38s/it] 19%|█▉ | 11701/61904 [6:06:27<19:07:27, 1.37s/it] 19%|█▉ | 11702/61904 [6:06:28<19:40:36, 1.41s/it] 19%|█▉ | 11703/61904 [6:06:29<19:03:17, 1.37s/it] 19%|█▉ | 11704/61904 [6:06:31<18:38:41, 1.34s/it] 19%|█▉ | 11705/61904 [6:06:32<18:28:35, 1.33s/it] 19%|█▉ | 11706/61904 [6:06:33<19:09:01, 1.37s/it] 19%|█▉ | 11707/61904 [6:06:35<20:21:54, 1.46s/it] 19%|█▉ | 11708/61904 
[6:06:36<20:27:44, 1.47s/it]
19%|█▉ | 11720/61904 [6:06:53<18:55:48, 1.36s/it] {'loss': 2.7256, 'learning_rate': 1.8133022170361725e-07, 'epoch': 3.03}
19%|█▉ | 11740/61904 [6:07:20<18:17:19, 1.31s/it] {'loss': 2.8067, 'learning_rate': 1.8129780889407494e-07, 'epoch': 3.03}
19%|█▉ | 11760/61904 [6:07:47<18:59:54, 1.36s/it] {'loss': 2.7359, 'learning_rate': 1.812653960845326e-07, 'epoch': 3.04}
19%|█▉ | 11780/61904 [6:08:15<19:14:04, 1.38s/it] {'loss': 2.8118, 'learning_rate': 1.8123298327499026e-07, 'epoch': 3.04}
19%|█▉ | 11800/61904 [6:08:43<19:40:28, 1.41s/it] {'loss': 2.8233, 'learning_rate': 1.8120057046544795e-07, 'epoch': 3.05}
19%|█▉ | 11820/61904 [6:09:10<19:55:34, 1.43s/it] {'loss': 2.8124, 'learning_rate': 1.811681576559056e-07, 'epoch': 3.05}
19%|█▉ | 11840/61904 [6:09:38<19:12:39, 1.38s/it] {'loss': 2.7403, 'learning_rate': 1.8113574484636327e-07, 'epoch': 3.06}
19%|█▉ | 11860/61904 [6:10:05<19:06:19, 1.37s/it] {'loss': 2.8573, 'learning_rate': 1.8110333203682096e-07, 'epoch': 3.06}
19%|█▉ | 11880/61904 [6:10:33<19:13:29, 1.38s/it] {'loss': 2.8055, 'learning_rate': 1.810709192272786e-07, 'epoch': 3.07}
19%|█▉ | 11900/61904 [6:11:01<18:59:28, 1.37s/it] {'loss': 2.7713, 'learning_rate': 1.8103850641773629e-07, 'epoch': 3.08}
19%|█▉ | 11920/61904 [6:11:28<18:47:47, 1.35s/it] {'loss': 2.7659, 'learning_rate': 1.8100609360819395e-07, 'epoch': 3.08}
19%|█▉ | 11940/61904 [6:11:56<19:33:41, 1.41s/it] {'loss': 2.8172, 'learning_rate': 1.809736807986516e-07, 'epoch': 3.09}
19%|█▉ | 11960/61904 [6:12:23<19:01:06, 1.37s/it] {'loss': 2.8002, 'learning_rate': 1.809412679891093e-07, 'epoch': 3.09}
19%|█▉ | 11980/61904 [6:12:51<19:09:18, 1.38s/it] {'loss': 2.8898, 'learning_rate': 1.8090885517956696e-07, 'epoch': 3.1}
19%|█▉ | 12000/61904 [6:13:19<18:37:31, 1.34s/it] {'loss': 2.8703, 'learning_rate': 1.8087644237002462e-07, 'epoch': 3.1}
19%|█▉ | 12020/61904 [6:13:47<19:10:30, 1.38s/it] {'loss': 2.7963, 'learning_rate': 1.808440295604823e-07, 'epoch': 3.11}
19%|█▉ | 12040/61904 [6:14:13<17:56:08, 1.29s/it] {'loss': 2.7938, 'learning_rate': 1.8081161675093997e-07, 'epoch': 3.11}
19%|█▉ | 12060/61904 [6:14:40<18:34:46, 1.34s/it] {'loss': 2.7748, 'learning_rate': 1.8077920394139763e-07, 'epoch': 3.12}
20%|█▉ | 12080/61904 [6:15:07<18:03:13, 1.30s/it] {'loss': 2.7813, 'learning_rate': 1.8074679113185532e-07, 'epoch': 3.12}
20%|█▉ | 12100/61904 [6:15:35<19:15:40, 1.39s/it] {'loss': 2.8148, 'learning_rate': 1.8071437832231296e-07, 'epoch': 3.13}
20%|█▉ | 12120/61904 [6:16:02<18:22:48, 1.33s/it] {'loss': 2.7859, 'learning_rate': 1.8068196551277065e-07, 'epoch': 3.13}
20%|█▉ | 12140/61904 [6:16:30<18:50:49, 1.36s/it] {'loss': 2.7769, 'learning_rate': 1.806495527032283e-07, 'epoch': 3.14}
20%|█▉ | 12160/61904 [6:16:57<19:14:27, 1.39s/it] {'loss': 2.8138, 'learning_rate': 1.8061713989368597e-07, 'epoch': 3.14}
20%|█▉ | 12180/61904 [6:17:25<19:18:25, 1.40s/it] {'loss': 2.8262, 'learning_rate': 1.8058472708414366e-07, 'epoch': 3.15}
20%|█▉ | 12200/61904 [6:17:52<19:03:06, 1.38s/it] {'loss': 2.7442, 'learning_rate': 1.8055231427460132e-07, 'epoch': 3.15}
20%|█▉ | 12220/61904 [6:18:19<19:33:30, 1.42s/it] {'loss': 2.7795, 'learning_rate': 1.8051990146505898e-07, 'epoch': 3.16}
20%|█▉ | 12240/61904 [6:18:47<19:29:36, 1.41s/it] {'loss': 2.8031, 'learning_rate': 1.8048748865551667e-07, 'epoch': 3.16}
20%|█▉ | 12260/61904 [6:19:15<19:30:19, 1.41s/it] {'loss': 2.7879, 'learning_rate': 1.804550758459743e-07, 'epoch': 3.17}
20%|█▉ | 12280/61904 [6:19:42<18:58:58, 1.38s/it] {'loss': 2.8254, 'learning_rate': 1.80422663036432e-07, 'epoch': 3.17}
20%|█▉ | 12300/61904 [6:20:10<20:05:13, 1.46s/it] {'loss': 2.7154, 'learning_rate': 1.8039025022688966e-07, 'epoch': 3.18}
20%|█▉ | 12320/61904 [6:20:38<19:03:05, 1.38s/it] {'loss': 2.8229, 'learning_rate': 1.8035783741734732e-07, 'epoch': 3.18}
20%|█▉ | 12340/61904 [6:21:05<18:26:44, 1.34s/it] {'loss': 2.7942, 'learning_rate': 1.80325424607805e-07, 'epoch': 3.19}
20%|█▉ | 12360/61904 [6:21:32<18:18:00, 1.33s/it] {'loss': 2.7733, 'learning_rate': 1.8029301179826267e-07, 'epoch': 3.19}
20%|█▉ | 12380/61904 [6:22:00<18:20:13, 1.33s/it] {'loss': 2.775, 'learning_rate': 1.8026059898872033e-07, 'epoch': 3.2}
20%|██ | 12400/61904 [6:22:28<19:12:05, 1.40s/it] {'loss': 2.7004, 'learning_rate': 1.8022818617917802e-07, 'epoch': 3.2}
20%|██ | 12420/61904 [6:22:55<19:41:41, 1.43s/it] {'loss': 2.8086, 'learning_rate': 1.8019577336963565e-07, 'epoch': 3.21}
20%|██ | 12440/61904 [6:23:23<18:47:18, 1.37s/it] {'loss': 2.7877, 'learning_rate': 1.8016336056009334e-07, 'epoch': 3.21}
20%|██ | 
12455/61904 [6:23:43<19:39:37, 1.43s/it] 20%|██ | 12456/61904 [6:23:45<19:19:31, 1.41s/it] 20%|██ | 12457/61904 [6:23:46<19:55:57, 1.45s/it] 20%|██ | 12458/61904 [6:23:48<19:07:11, 1.39s/it] 20%|██ | 12459/61904 [6:23:49<18:38:47, 1.36s/it] 20%|██ | 12460/61904 [6:23:50<18:36:52, 1.36s/it] {'loss': 2.8041, 'learning_rate': 1.80130947750551e-07, 'epoch': 3.22} 20%|██ | 12460/61904 [6:23:50<18:36:52, 1.36s/it] 20%|██ | 12461/61904 [6:23:51<18:02:02, 1.31s/it] 20%|██ | 12462/61904 [6:23:53<17:45:59, 1.29s/it] 20%|██ | 12463/61904 [6:23:54<17:53:42, 1.30s/it] 20%|██ | 12464/61904 [6:23:55<17:52:42, 1.30s/it] 20%|██ | 12465/61904 [6:23:57<17:49:39, 1.30s/it] 20%|██ | 12466/61904 [6:23:58<18:00:59, 1.31s/it] 20%|██ | 12467/61904 [6:23:59<18:09:40, 1.32s/it] 20%|██ | 12468/61904 [6:24:01<18:22:27, 1.34s/it] 20%|██ | 12469/61904 [6:24:02<17:47:36, 1.30s/it] 20%|██ | 12470/61904 [6:24:03<17:31:02, 1.28s/it] 20%|██ | 12471/61904 [6:24:04<18:14:32, 1.33s/it] 20%|██ | 12472/61904 [6:24:06<18:32:42, 1.35s/it] 20%|██ | 12473/61904 [6:24:07<18:32:39, 1.35s/it] 20%|██ | 12474/61904 [6:24:09<18:51:08, 1.37s/it] 20%|██ | 12475/61904 [6:24:10<19:01:21, 1.39s/it] 20%|██ | 12476/61904 [6:24:11<18:41:46, 1.36s/it] 20%|██ | 12477/61904 [6:24:13<19:05:28, 1.39s/it] 20%|██ | 12478/61904 [6:24:14<18:56:06, 1.38s/it] 20%|██ | 12479/61904 [6:24:16<18:47:46, 1.37s/it] 20%|██ | 12480/61904 [6:24:17<18:34:08, 1.35s/it] {'loss': 2.8454, 'learning_rate': 1.8009853494100867e-07, 'epoch': 3.23} 20%|██ | 12480/61904 [6:24:17<18:34:08, 1.35s/it] 20%|██ | 12481/61904 [6:24:18<18:59:39, 1.38s/it] 20%|██ | 12482/61904 [6:24:20<18:43:42, 1.36s/it] 20%|██ | 12483/61904 [6:24:21<18:32:56, 1.35s/it] 20%|██ | 12484/61904 [6:24:22<19:18:49, 1.41s/it] 20%|██ | 12485/61904 [6:24:24<19:36:11, 1.43s/it] 20%|██ | 12486/61904 [6:24:25<19:01:02, 1.39s/it] 20%|██ | 12487/61904 [6:24:27<18:59:09, 1.38s/it] 20%|██ | 12488/61904 [6:24:28<19:00:02, 1.38s/it] 20%|██ | 12489/61904 [6:24:29<18:59:31, 1.38s/it] 20%|██ | 
12490/61904 [6:24:31<18:43:13, 1.36s/it] 20%|██ | 12491/61904 [6:24:32<18:07:57, 1.32s/it] 20%|██ | 12492/61904 [6:24:33<18:24:42, 1.34s/it] 20%|██ | 12493/61904 [6:24:35<18:30:12, 1.35s/it] 20%|██ | 12494/61904 [6:24:36<18:50:15, 1.37s/it] 20%|██ | 12495/61904 [6:24:38<19:15:09, 1.40s/it] 20%|██ | 12496/61904 [6:24:39<19:14:33, 1.40s/it] 20%|██ | 12497/61904 [6:24:40<18:48:06, 1.37s/it] 20%|██ | 12498/61904 [6:24:42<18:35:36, 1.35s/it] 20%|██ | 12499/61904 [6:24:43<18:38:21, 1.36s/it] 20%|██ | 12500/61904 [6:24:44<18:20:50, 1.34s/it] {'loss': 2.8097, 'learning_rate': 1.8006612213146635e-07, 'epoch': 3.23} 20%|██ | 12500/61904 [6:24:44<18:20:50, 1.34s/it] 20%|██ | 12501/61904 [6:24:46<18:58:58, 1.38s/it] 20%|██ | 12502/61904 [6:24:47<19:04:54, 1.39s/it] 20%|██ | 12503/61904 [6:24:49<18:55:28, 1.38s/it] 20%|██ | 12504/61904 [6:24:50<18:55:17, 1.38s/it] 20%|██ | 12505/61904 [6:24:51<18:26:34, 1.34s/it] 20%|██ | 12506/61904 [6:24:53<18:33:59, 1.35s/it] 20%|██ | 12507/61904 [6:24:54<19:13:46, 1.40s/it] 20%|██ | 12508/61904 [6:24:55<18:56:30, 1.38s/it] 20%|██ | 12509/61904 [6:24:57<18:45:55, 1.37s/it] 20%|██ | 12510/61904 [6:24:58<18:31:12, 1.35s/it] 20%|██ | 12511/61904 [6:24:59<18:21:43, 1.34s/it] 20%|██ | 12512/61904 [6:25:01<18:40:28, 1.36s/it] 20%|██ | 12513/61904 [6:25:02<18:50:41, 1.37s/it] 20%|██ | 12514/61904 [6:25:03<18:18:57, 1.34s/it] 20%|██ | 12515/61904 [6:25:05<18:14:34, 1.33s/it] 20%|██ | 12516/61904 [6:25:06<18:08:36, 1.32s/it] 20%|██ | 12517/61904 [6:25:07<18:30:02, 1.35s/it] 20%|██ | 12518/61904 [6:25:09<18:27:53, 1.35s/it] 20%|██ | 12519/61904 [6:25:10<18:28:00, 1.35s/it] 20%|██ | 12520/61904 [6:25:11<18:22:32, 1.34s/it] {'loss': 2.8309, 'learning_rate': 1.8003370932192402e-07, 'epoch': 3.24} 20%|██ | 12520/61904 [6:25:11<18:22:32, 1.34s/it] 20%|██ | 12521/61904 [6:25:13<18:00:08, 1.31s/it] 20%|██ | 12522/61904 [6:25:14<18:30:22, 1.35s/it] 20%|██ | 12523/61904 [6:25:15<18:07:29, 1.32s/it] 20%|██ | 12524/61904 [6:25:17<18:15:41, 1.33s/it] 20%|██ | 
12525/61904 [6:25:18<18:43:51, 1.37s/it] 20%|██ | 12526/61904 [6:25:20<18:40:16, 1.36s/it] 20%|██ | 12527/61904 [6:25:21<19:38:41, 1.43s/it] 20%|██ | 12528/61904 [6:25:23<19:24:26, 1.41s/it] 20%|██ | 12529/61904 [6:25:24<19:33:03, 1.43s/it] 20%|██ | 12530/61904 [6:25:25<18:41:44, 1.36s/it] 20%|██ | 12531/61904 [6:25:27<18:49:54, 1.37s/it] 20%|██ | 12532/61904 [6:25:28<18:39:35, 1.36s/it] 20%|██ | 12533/61904 [6:25:29<18:51:22, 1.37s/it] 20%|██ | 12534/61904 [6:25:31<19:06:41, 1.39s/it] 20%|██ | 12535/61904 [6:25:32<19:04:33, 1.39s/it] 20%|██ | 12536/61904 [6:25:34<19:27:12, 1.42s/it] 20%|██ | 12537/61904 [6:25:35<18:57:50, 1.38s/it] 20%|██ | 12538/61904 [6:25:36<18:55:18, 1.38s/it] 20%|██ | 12539/61904 [6:25:38<18:46:04, 1.37s/it] 20%|██ | 12540/61904 [6:25:39<18:22:57, 1.34s/it] {'loss': 2.7646, 'learning_rate': 1.8000129651238168e-07, 'epoch': 3.24} 20%|██ | 12540/61904 [6:25:39<18:22:57, 1.34s/it] 20%|██ | 12541/61904 [6:25:40<18:22:36, 1.34s/it] 20%|██ | 12542/61904 [6:25:42<18:16:10, 1.33s/it] 20%|██ | 12543/61904 [6:25:43<18:33:04, 1.35s/it] 20%|██ | 12544/61904 [6:25:44<18:38:47, 1.36s/it] 20%|██ | 12545/61904 [6:25:46<19:02:46, 1.39s/it] 20%|██ | 12546/61904 [6:25:47<19:49:34, 1.45s/it] 20%|██ | 12547/61904 [6:25:49<19:03:12, 1.39s/it] 20%|██ | 12548/61904 [6:25:50<19:53:31, 1.45s/it] 20%|██ | 12549/61904 [6:25:51<19:02:14, 1.39s/it] 20%|██ | 12550/61904 [6:25:53<18:55:29, 1.38s/it] 20%|██ | 12551/61904 [6:25:54<18:56:54, 1.38s/it] 20%|██ | 12552/61904 [6:25:56<18:54:15, 1.38s/it] 20%|██ | 12553/61904 [6:25:57<18:48:14, 1.37s/it] 20%|██ | 12554/61904 [6:25:58<19:07:48, 1.40s/it] 20%|██ | 12555/61904 [6:26:00<19:23:03, 1.41s/it] 20%|██ | 12556/61904 [6:26:01<18:38:34, 1.36s/it] 20%|██ | 12557/61904 [6:26:03<19:00:34, 1.39s/it] 20%|██ | 12558/61904 [6:26:04<18:56:13, 1.38s/it] 20%|██ | 12559/61904 [6:26:05<19:08:10, 1.40s/it] 20%|██ | 12560/61904 [6:26:07<19:44:18, 1.44s/it] {'loss': 2.7901, 'learning_rate': 1.7996888370283937e-07, 'epoch': 3.25} 20%|██ | 
12560/61904 [6:26:07<19:44:18, 1.44s/it] 20%|██ | 12561/61904 [6:26:08<19:10:21, 1.40s/it] 20%|██ | 12562/61904 [6:26:10<18:57:06, 1.38s/it] 20%|██ | 12563/61904 [6:26:11<18:49:10, 1.37s/it] 20%|██ | 12564/61904 [6:26:12<18:36:32, 1.36s/it] 20%|██ | 12565/61904 [6:26:14<19:28:03, 1.42s/it] 20%|██ | 12566/61904 [6:26:15<19:30:41, 1.42s/it] 20%|██ | 12567/61904 [6:26:16<18:51:18, 1.38s/it] 20%|██ | 12568/61904 [6:26:18<19:41:31, 1.44s/it] 20%|██ | 12569/61904 [6:26:19<18:47:14, 1.37s/it] 20%|██ | 12570/61904 [6:26:21<18:30:07, 1.35s/it] 20%|██ | 12571/61904 [6:26:22<18:36:52, 1.36s/it] 20%|██ | 12572/61904 [6:26:23<18:58:02, 1.38s/it] 20%|██ | 12573/61904 [6:26:25<18:40:45, 1.36s/it] 20%|██ | 12574/61904 [6:26:26<18:42:20, 1.37s/it] 20%|██ | 12575/61904 [6:26:27<18:37:41, 1.36s/it] 20%|██ | 12576/61904 [6:26:29<18:41:21, 1.36s/it] 20%|██ | 12577/61904 [6:26:30<20:03:34, 1.46s/it] 20%|██ | 12578/61904 [6:26:32<19:51:34, 1.45s/it] 20%|██ | 12579/61904 [6:26:33<19:09:25, 1.40s/it] 20%|██ | 12580/61904 [6:26:35<19:13:10, 1.40s/it] {'loss': 2.7912, 'learning_rate': 1.79936470893297e-07, 'epoch': 3.25} 20%|██ | 12580/61904 [6:26:35<19:13:10, 1.40s/it] 20%|██ | 12581/61904 [6:26:36<19:04:02, 1.39s/it] 20%|██ | 12582/61904 [6:26:37<18:28:43, 1.35s/it] 20%|██ | 12583/61904 [6:26:39<18:56:17, 1.38s/it] 20%|██ | 12584/61904 [6:26:40<18:34:59, 1.36s/it] 20%|██ | 12585/61904 [6:26:41<18:16:38, 1.33s/it] 20%|██ | 12586/61904 [6:26:43<18:45:41, 1.37s/it] 20%|██ | 12587/61904 [6:26:44<18:32:16, 1.35s/it] 20%|██ | 12588/61904 [6:26:45<18:45:45, 1.37s/it] 20%|██ | 12589/61904 [6:26:47<18:32:19, 1.35s/it] 20%|██ | 12590/61904 [6:26:48<18:36:31, 1.36s/it] 20%|██ | 12591/61904 [6:26:49<18:43:57, 1.37s/it] 20%|██ | 12592/61904 [6:26:51<18:57:29, 1.38s/it] 20%|██ | 12593/61904 [6:26:52<18:47:28, 1.37s/it] 20%|██ | 12594/61904 [6:26:54<18:38:10, 1.36s/it] 20%|██ | 12595/61904 [6:26:55<18:58:04, 1.38s/it] 20%|██ | 12596/61904 [6:26:56<19:06:48, 1.40s/it] 20%|██ | 12597/61904 
[6:26:58<19:37:39, 1.43s/it] 20%|██ | 12598/61904 [6:26:59<19:10:20, 1.40s/it] 20%|██ | 12599/61904 [6:27:01<18:39:08, 1.36s/it] 20%|██ | 12600/61904 [6:27:02<18:28:33, 1.35s/it] {'loss': 2.8004, 'learning_rate': 1.799040580837547e-07, 'epoch': 3.26} 20%|██ | 12600/61904 [6:27:02<18:28:33, 1.35s/it] 20%|██ | 12601/61904 [6:27:03<18:39:34, 1.36s/it] 20%|██ | 12602/61904 [6:27:05<18:33:50, 1.36s/it] 20%|██ | 12603/61904 [6:27:06<18:50:39, 1.38s/it] 20%|██ | 12604/61904 [6:27:07<18:29:53, 1.35s/it] 20%|██ | 12605/61904 [6:27:09<17:58:00, 1.31s/it] 20%|██ | 12606/61904 [6:27:10<18:25:53, 1.35s/it] 20%|██ | 12607/61904 [6:27:11<18:19:50, 1.34s/it] 20%|██ | 12608/61904 [6:27:13<18:23:57, 1.34s/it] 20%|██ | 12609/61904 [6:27:14<17:55:25, 1.31s/it] 20%|██ | 12610/61904 [6:27:15<17:55:41, 1.31s/it] 20%|██ | 12611/61904 [6:27:17<18:02:01, 1.32s/it] 20%|██ | 12612/61904 [6:27:18<18:10:18, 1.33s/it] 20%|██ | 12613/61904 [6:27:19<18:56:27, 1.38s/it] 20%|██ | 12614/61904 [6:27:21<19:00:47, 1.39s/it] 20%|██ | 12615/61904 [6:27:22<18:47:45, 1.37s/it] 20%|██ | 12616/61904 [6:27:23<18:33:09, 1.36s/it] 20%|██ | 12617/61904 [6:27:25<18:46:39, 1.37s/it] 20%|██ | 12618/61904 [6:27:26<19:10:06, 1.40s/it] 20%|██ | 12619/61904 [6:27:28<18:53:30, 1.38s/it] 20%|██ | 12620/61904 [6:27:29<18:44:06, 1.37s/it] {'loss': 2.7432, 'learning_rate': 1.7987164527421235e-07, 'epoch': 3.26} 20%|██ | 12620/61904 [6:27:29<18:44:06, 1.37s/it] 20%|██ | 12621/61904 [6:27:30<18:11:53, 1.33s/it] 20%|██ | 12622/61904 [6:27:32<18:04:03, 1.32s/it] 20%|██ | 12623/61904 [6:27:33<18:03:19, 1.32s/it] 20%|██ | 12624/61904 [6:27:34<18:18:41, 1.34s/it] 20%|██ | 12625/61904 [6:27:36<18:01:21, 1.32s/it] 20%|██ | 12626/61904 [6:27:37<19:04:56, 1.39s/it] 20%|██ | 12627/61904 [6:27:39<19:36:10, 1.43s/it] 20%|██ | 12628/61904 [6:27:40<19:22:20, 1.42s/it] 20%|██ | 12629/61904 [6:27:41<19:08:40, 1.40s/it] 20%|██ | 12630/61904 [6:27:43<18:49:15, 1.38s/it] 20%|██ | 12631/61904 [6:27:44<19:16:00, 1.41s/it] 20%|██ | 12632/61904 
[6:27:45<18:43:17, 1.37s/it] 20%|██ | 12633/61904 [6:27:47<18:52:27, 1.38s/it] 20%|██ | 12634/61904 [6:27:48<18:47:19, 1.37s/it] 20%|██ | 12635/61904 [6:27:50<18:38:54, 1.36s/it] 20%|██ | 12636/61904 [6:27:51<18:57:53, 1.39s/it] 20%|██ | 12637/61904 [6:27:52<18:50:19, 1.38s/it] 20%|██ | 12638/61904 [6:27:54<18:37:38, 1.36s/it] 20%|██ | 12639/61904 [6:27:55<18:06:03, 1.32s/it] 20%|██ | 12640/61904 [6:27:56<18:39:34, 1.36s/it] {'loss': 2.7917, 'learning_rate': 1.7983923246467001e-07, 'epoch': 3.27} 20%|██ | 12640/61904 [6:27:56<18:39:34, 1.36s/it] 20%|██ | 12641/61904 [6:27:58<18:54:41, 1.38s/it] 20%|██ | 12642/61904 [6:27:59<18:56:26, 1.38s/it] 20%|██ | 12643/61904 [6:28:00<18:39:30, 1.36s/it] 20%|██ | 12644/61904 [6:28:02<18:48:24, 1.37s/it] 20%|██ | 12645/61904 [6:28:03<18:39:19, 1.36s/it] 20%|██ | 12646/61904 [6:28:05<19:11:10, 1.40s/it] 20%|██ | 12647/61904 [6:28:06<19:02:12, 1.39s/it] 20%|██ | 12648/61904 [6:28:07<18:49:33, 1.38s/it] 20%|██ | 12649/61904 [6:28:09<18:24:55, 1.35s/it] 20%|██ | 12650/61904 [6:28:10<18:59:45, 1.39s/it] 20%|██ | 12651/61904 [6:28:11<18:11:06, 1.33s/it] 20%|██ | 12652/61904 [6:28:13<18:15:39, 1.33s/it] 20%|██ | 12653/61904 [6:28:14<17:57:47, 1.31s/it] 20%|██ | 12654/61904 [6:28:15<18:17:06, 1.34s/it] 20%|██ | 12655/61904 [6:28:17<19:04:47, 1.39s/it] 20%|██ | 12656/61904 [6:28:18<19:10:58, 1.40s/it] 20%|██ | 12657/61904 [6:28:20<19:05:10, 1.40s/it] 20%|██ | 12658/61904 [6:28:21<18:39:13, 1.36s/it] 20%|██ | 12659/61904 [6:28:22<18:26:31, 1.35s/it] 20%|██ | 12660/61904 [6:28:24<18:40:14, 1.36s/it] {'loss': 2.8395, 'learning_rate': 1.798068196551277e-07, 'epoch': 3.27} 20%|██ | 12660/61904 [6:28:24<18:40:14, 1.36s/it] 20%|██ | 12661/61904 [6:28:25<18:14:23, 1.33s/it] 20%|██ | 12662/61904 [6:28:26<18:10:32, 1.33s/it] 20%|██ | 12663/61904 [6:28:28<18:54:59, 1.38s/it] 20%|██ | 12664/61904 [6:28:29<18:49:41, 1.38s/it] 20%|██ | 12665/61904 [6:28:31<19:03:24, 1.39s/it] 20%|██ | 12666/61904 [6:28:32<19:27:38, 1.42s/it] 20%|██ | 12667/61904 
[6:28:33<19:11:48, 1.40s/it] 20%|██ | 12668/61904 [6:28:35<18:41:39, 1.37s/it] 20%|██ | 12669/61904 [6:28:36<19:42:38, 1.44s/it] 20%|██ | 12670/61904 [6:28:38<20:38:55, 1.51s/it] 20%|██ | 12671/61904 [6:28:39<20:07:35, 1.47s/it] 20%|██ | 12672/61904 [6:28:41<19:42:13, 1.44s/it] 20%|██ | 12673/61904 [6:28:42<19:32:54, 1.43s/it] 20%|██ | 12674/61904 [6:28:43<19:02:42, 1.39s/it] 20%|██ | 12675/61904 [6:28:45<18:58:27, 1.39s/it] 20%|██ | 12676/61904 [6:28:46<19:23:02, 1.42s/it] 20%|██ | 12677/61904 [6:28:48<19:08:47, 1.40s/it] 20%|██ | 12678/61904 [6:28:49<19:23:47, 1.42s/it] 20%|██ | 12679/61904 [6:28:50<19:07:55, 1.40s/it] 20%|██ | 12680/61904 [6:28:52<18:42:22, 1.37s/it] {'loss': 2.7566, 'learning_rate': 1.7977440684558536e-07, 'epoch': 3.28} 20%|██ | 12680/61904 [6:28:52<18:42:22, 1.37s/it] 20%|██ | 12681/61904 [6:28:53<19:21:20, 1.42s/it] 20%|██ | 12682/61904 [6:28:55<20:04:12, 1.47s/it] 20%|██ | 12683/61904 [6:28:56<19:57:07, 1.46s/it] 20%|██ | 12684/61904 [6:28:58<19:31:37, 1.43s/it] 20%|██ | 12685/61904 [6:28:59<19:36:39, 1.43s/it] 20%|██ | 12686/61904 [6:29:01<19:42:41, 1.44s/it] 20%|██ | 12687/61904 [6:29:02<19:22:18, 1.42s/it] 20%|██ | 12688/61904 [6:29:03<19:00:08, 1.39s/it] 20%|██ | 12689/61904 [6:29:05<19:31:24, 1.43s/it] 20%|██ | 12690/61904 [6:29:06<18:53:37, 1.38s/it] 21%|██ | 12691/61904 [6:29:08<19:58:38, 1.46s/it] 21%|██ | 12692/61904 [6:29:09<19:56:38, 1.46s/it] 21%|██ | 12693/61904 [6:29:11<20:09:31, 1.47s/it] 21%|██ | 12694/61904 [6:29:12<19:40:43, 1.44s/it] 21%|██ | 12695/61904 [6:29:14<20:59:57, 1.54s/it] 21%|██ | 12696/61904 [6:29:15<20:35:58, 1.51s/it] 21%|██ | 12697/61904 [6:29:17<19:45:59, 1.45s/it] 21%|██ | 12698/61904 [6:29:18<19:12:56, 1.41s/it] 21%|██ | 12699/61904 [6:29:19<18:58:02, 1.39s/it] 21%|██ | 12700/61904 [6:29:21<19:04:34, 1.40s/it] {'loss': 2.7603, 'learning_rate': 1.7974199403604303e-07, 'epoch': 3.28} 21%|██ | 12700/61904 [6:29:21<19:04:34, 1.40s/it] 21%|██ | 12701/61904 [6:29:22<18:38:04, 1.36s/it] 21%|██ | 12702/61904 
[6:29:23<18:20:08, 1.34s/it] 21%|██ | 12703/61904 [6:29:24<18:07:15, 1.33s/it] 21%|██ | 12704/61904 [6:29:26<18:51:14, 1.38s/it] 21%|██ | 12705/61904 [6:29:28<19:33:53, 1.43s/it] 21%|██ | 12706/61904 [6:29:29<19:44:29, 1.44s/it] 21%|██ | 12707/61904 [6:29:31<20:00:53, 1.46s/it] 21%|██ | 12708/61904 [6:29:32<19:35:14, 1.43s/it] 21%|██ | 12709/61904 [6:29:33<19:52:02, 1.45s/it] 21%|██ | 12710/61904 [6:29:35<20:12:00, 1.48s/it] 21%|██ | 12711/61904 [6:29:36<19:20:04, 1.41s/it] 21%|██ | 12712/61904 [6:29:37<18:32:36, 1.36s/it] 21%|██ | 12713/61904 [6:29:39<18:30:08, 1.35s/it] 21%|██ | 12714/61904 [6:29:40<18:34:23, 1.36s/it] 21%|██ | 12715/61904 [6:29:42<18:49:09, 1.38s/it] 21%|██ | 12716/61904 [6:29:43<19:01:43, 1.39s/it] 21%|██ | 12717/61904 [6:29:44<18:17:29, 1.34s/it] 21%|██ | 12718/61904 [6:29:46<18:35:32, 1.36s/it] 21%|██ | 12719/61904 [6:29:47<18:43:19, 1.37s/it] 21%|██ | 12720/61904 [6:29:48<18:34:09, 1.36s/it] {'loss': 2.7443, 'learning_rate': 1.7970958122650071e-07, 'epoch': 3.29} 21%|██ | 12720/61904 [6:29:48<18:34:09, 1.36s/it] 21%|██ | 12721/61904 [6:29:50<19:10:27, 1.40s/it] 21%|██ | 12722/61904 [6:29:51<19:17:13, 1.41s/it] 21%|██ | 12723/61904 [6:29:53<18:49:46, 1.38s/it] 21%|██ | 12724/61904 [6:29:54<18:24:33, 1.35s/it] 21%|██ | 12725/61904 [6:29:55<18:13:39, 1.33s/it] 21%|██ | 12726/61904 [6:29:56<18:08:47, 1.33s/it] 21%|██ | 12727/61904 [6:29:58<18:05:33, 1.32s/it] 21%|██ | 12728/61904 [6:29:59<18:44:55, 1.37s/it] 21%|██ | 12729/61904 [6:30:01<18:27:06, 1.35s/it] 21%|██ | 12730/61904 [6:30:02<18:31:46, 1.36s/it] 21%|██ | 12731/61904 [6:30:03<18:53:57, 1.38s/it] 21%|██ | 12732/61904 [6:30:05<18:52:06, 1.38s/it] 21%|██ | 12733/61904 [6:30:06<18:35:41, 1.36s/it] 21%|██ | 12734/61904 [6:30:07<18:28:15, 1.35s/it] 21%|██ | 12735/61904 [6:30:09<18:47:15, 1.38s/it] 21%|██ | 12736/61904 [6:30:10<18:21:11, 1.34s/it] 21%|██ | 12737/61904 [6:30:11<18:16:03, 1.34s/it] 21%|██ | 12738/61904 [6:30:13<18:14:50, 1.34s/it] 21%|██ | 12739/61904 [6:30:14<18:10:40, 
1.33s/it] 21%|██ | 12740/61904 [6:30:15<18:05:14, 1.32s/it] {'loss': 2.7244, 'learning_rate': 1.7967716841695835e-07, 'epoch': 3.29} 21%|██ | 12740/61904 [6:30:15<18:05:14, 1.32s/it] 21%|██ | 12741/61904 [6:30:17<17:47:44, 1.30s/it] 21%|██ | 12742/61904 [6:30:18<18:29:58, 1.35s/it] 21%|██ | 12743/61904 [6:30:20<18:47:13, 1.38s/it] 21%|██ | 12744/61904 [6:30:21<18:41:56, 1.37s/it] 21%|██ | 12745/61904 [6:30:22<18:48:40, 1.38s/it] 21%|██ | 12746/61904 [6:30:23<18:00:59, 1.32s/it] 21%|██ | 12747/61904 [6:30:25<18:33:58, 1.36s/it] 21%|██ | 12748/61904 [6:30:26<18:31:55, 1.36s/it] 21%|██ | 12749/61904 [6:30:28<18:03:26, 1.32s/it] 21%|██ | 12750/61904 [6:30:29<18:07:27, 1.33s/it] 21%|██ | 12751/61904 [6:30:30<18:15:27, 1.34s/it] 21%|██ | 12752/61904 [6:30:32<18:58:38, 1.39s/it] 21%|██ | 12753/61904 [6:30:33<18:31:03, 1.36s/it] 21%|██ | 12754/61904 [6:30:34<18:27:45, 1.35s/it] 21%|██ | 12755/61904 [6:30:36<18:48:13, 1.38s/it] 21%|██ | 12756/61904 [6:30:37<19:06:45, 1.40s/it] 21%|██ | 12757/61904 [6:30:39<18:46:07, 1.37s/it] 21%|██ | 12758/61904 [6:30:40<18:35:45, 1.36s/it] 21%|██ | 12759/61904 [6:30:41<18:44:19, 1.37s/it] 21%|██ | 12760/61904 [6:30:43<18:20:13, 1.34s/it] {'loss': 2.7736, 'learning_rate': 1.7964475560741604e-07, 'epoch': 3.3} 21%|██ | 12760/61904 [6:30:43<18:20:13, 1.34s/it] 21%|██ | 12761/61904 [6:30:44<18:17:27, 1.34s/it] 21%|██ | 12762/61904 [6:30:45<18:02:00, 1.32s/it] 21%|██ | 12763/61904 [6:30:46<17:59:22, 1.32s/it] 21%|██ | 12764/61904 [6:30:48<17:56:29, 1.31s/it] 21%|██ | 12765/61904 [6:30:49<17:56:50, 1.31s/it] 21%|██ | 12766/61904 [6:30:50<18:13:43, 1.34s/it] 21%|██ | 12767/61904 [6:30:52<18:25:31, 1.35s/it] 21%|██ | 12768/61904 [6:30:53<18:30:32, 1.36s/it] 21%|██ | 12769/61904 [6:30:55<19:02:08, 1.39s/it] 21%|██ | 12770/61904 [6:30:56<18:29:35, 1.35s/it] 21%|██ | 12771/61904 [6:30:57<18:51:24, 1.38s/it] 21%|██ | 12772/61904 [6:30:59<18:41:58, 1.37s/it] 21%|██ | 12773/61904 [6:31:00<18:41:45, 1.37s/it] 21%|██ | 12774/61904 [6:31:01<18:24:04, 
1.35s/it] 21%|██ | 12775/61904 [6:31:03<18:08:38, 1.33s/it] 21%|██ | 12776/61904 [6:31:04<18:42:23, 1.37s/it] 21%|██ | 12777/61904 [6:31:06<18:53:21, 1.38s/it] 21%|██ | 12778/61904 [6:31:07<19:00:34, 1.39s/it] 21%|██ | 12779/61904 [6:31:08<18:38:51, 1.37s/it] 21%|██ | 12780/61904 [6:31:10<18:50:52, 1.38s/it] {'loss': 2.7509, 'learning_rate': 1.7961234279787373e-07, 'epoch': 3.3} 21%|██ | 12780/61904 [6:31:10<18:50:52, 1.38s/it] 21%|██ | 12781/61904 [6:31:11<18:54:33, 1.39s/it] 21%|██ | 12782/61904 [6:31:12<18:39:41, 1.37s/it] 21%|██ | 12783/61904 [6:31:14<18:10:52, 1.33s/it] 21%|██ | 12784/61904 [6:31:15<18:14:07, 1.34s/it] 21%|██ | 12785/61904 [6:31:17<18:53:04, 1.38s/it] 21%|██ | 12786/61904 [6:31:18<18:56:39, 1.39s/it] 21%|██ | 12787/61904 [6:31:19<18:47:39, 1.38s/it] 21%|██ | 12788/61904 [6:31:21<18:42:47, 1.37s/it] 21%|██ | 12789/61904 [6:31:22<18:09:59, 1.33s/it] 21%|██ | 12790/61904 [6:31:23<17:54:14, 1.31s/it] 21%|██ | 12791/61904 [6:31:25<18:35:13, 1.36s/it] 21%|██ | 12792/61904 [6:31:26<18:13:33, 1.34s/it] 21%|██ | 12793/61904 [6:31:27<18:01:18, 1.32s/it] 21%|██ | 12794/61904 [6:31:29<18:28:40, 1.35s/it] 21%|██ | 12795/61904 [6:31:30<18:33:02, 1.36s/it] 21%|██ | 12796/61904 [6:31:31<18:58:28, 1.39s/it] 21%|██ | 12797/61904 [6:31:33<18:42:35, 1.37s/it] 21%|██ | 12798/61904 [6:31:34<18:12:31, 1.33s/it] 21%|██ | 12799/61904 [6:31:35<18:07:07, 1.33s/it] 21%|██ | 12800/61904 [6:31:37<18:08:38, 1.33s/it] {'loss': 2.8058, 'learning_rate': 1.7957992998833136e-07, 'epoch': 3.31} 21%|██ | 12800/61904 [6:31:37<18:08:38, 1.33s/it] 21%|██ | 12801/61904 [6:31:38<18:33:30, 1.36s/it] 21%|██ | 12802/61904 [6:31:40<18:35:33, 1.36s/it] 21%|██ | 12803/61904 [6:31:41<18:12:05, 1.33s/it] 21%|██ | 12804/61904 [6:31:42<18:55:28, 1.39s/it] 21%|██ | 12805/61904 [6:31:44<19:20:21, 1.42s/it] 21%|██ | 12806/61904 [6:31:45<19:36:56, 1.44s/it] 21%|██ | 12807/61904 [6:31:47<19:43:55, 1.45s/it] 21%|██ | 12808/61904 [6:31:48<19:01:56, 1.40s/it] 21%|██ | 12809/61904 [6:31:49<18:23:29, 
1.35s/it] 21%|██ | 12810/61904 [6:31:51<18:09:12, 1.33s/it] 21%|██ | 12811/61904 [6:31:52<17:56:42, 1.32s/it] 21%|██ | 12812/61904 [6:31:53<18:18:03, 1.34s/it] 21%|██ | 12813/61904 [6:31:55<18:57:43, 1.39s/it] 21%|██ | 12814/61904 [6:31:56<19:00:00, 1.39s/it] 21%|██ | 12815/61904 [6:31:58<19:07:14, 1.40s/it] 21%|██ | 12816/61904 [6:31:59<19:58:40, 1.47s/it] 21%|██ | 12817/61904 [6:32:00<19:01:38, 1.40s/it] 21%|██ | 12818/61904 [6:32:02<18:52:11, 1.38s/it] 21%|██ | 12819/61904 [6:32:03<18:51:19, 1.38s/it] 21%|██ | 12820/61904 [6:32:04<17:59:06, 1.32s/it] {'loss': 2.7647, 'learning_rate': 1.7954751717878905e-07, 'epoch': 3.31} 21%|██ | 12820/61904 [6:32:04<17:59:06, 1.32s/it] 21%|██ | 12821/61904 [6:32:06<18:31:30, 1.36s/it] 21%|██ | 12822/61904 [6:32:07<18:29:07, 1.36s/it] 21%|██ | 12823/61904 [6:32:08<18:22:42, 1.35s/it] 21%|██ | 12824/61904 [6:32:10<17:59:48, 1.32s/it] 21%|██ | 12825/61904 [6:32:11<18:28:20, 1.35s/it] 21%|██ | 12826/61904 [6:32:13<19:44:05, 1.45s/it] 21%|██ | 12827/61904 [6:32:14<19:10:48, 1.41s/it] 21%|██ | 12828/61904 [6:32:15<18:39:39, 1.37s/it] 21%|██ | 12829/61904 [6:32:17<18:35:51, 1.36s/it] 21%|██ | 12830/61904 [6:32:18<18:37:25, 1.37s/it] 21%|██ | 12831/61904 [6:32:19<18:25:00, 1.35s/it] 21%|██ | 12832/61904 [6:32:21<18:14:40, 1.34s/it] 21%|██ | 12833/61904 [6:32:22<18:45:47, 1.38s/it] 21%|██ | 12834/61904 [6:32:24<18:54:59, 1.39s/it] 21%|██ | 12835/61904 [6:32:25<18:51:20, 1.38s/it] 21%|██ | 12836/61904 [6:32:26<18:58:07, 1.39s/it] 21%|██ | 12837/61904 [6:32:28<18:43:30, 1.37s/it] 21%|██ | 12838/61904 [6:32:29<19:15:32, 1.41s/it] 21%|██ | 12839/61904 [6:32:31<19:38:38, 1.44s/it] 21%|██ | 12840/61904 [6:32:32<19:08:39, 1.40s/it] {'loss': 2.7883, 'learning_rate': 1.795151043692467e-07, 'epoch': 3.32} 21%|██ | 12840/61904 [6:32:32<19:08:39, 1.40s/it] 21%|██ | 12841/61904 [6:32:33<18:33:00, 1.36s/it] 21%|██ | 12842/61904 [6:32:35<18:12:05, 1.34s/it] 21%|██ | 12843/61904 [6:32:36<18:25:07, 1.35s/it] 21%|██ | 12844/61904 [6:32:37<17:50:40, 
1.31s/it] 21%|██ | 12845/61904 [6:32:39<18:12:59, 1.34s/it] 21%|██ | 12846/61904 [6:32:40<18:27:40, 1.35s/it] 21%|██ | 12847/61904 [6:32:41<18:42:14, 1.37s/it] 21%|██ | 12848/61904 [6:32:43<18:32:51, 1.36s/it] 21%|██ | 12849/61904 [6:32:44<18:25:44, 1.35s/it] 21%|██ | 12850/61904 [6:32:45<17:56:56, 1.32s/it] 21%|██ | 12851/61904 [6:32:47<17:48:34, 1.31s/it] 21%|██ | 12852/61904 [6:32:48<17:42:00, 1.30s/it] 21%|██ | 12853/61904 [6:32:49<17:59:43, 1.32s/it] 21%|██ | 12854/61904 [6:32:51<18:17:24, 1.34s/it] 21%|██ | 12855/61904 [6:32:52<18:57:12, 1.39s/it] 21%|██ | 12856/61904 [6:32:54<18:58:00, 1.39s/it] 21%|██ | 12857/61904 [6:32:55<19:00:07, 1.39s/it] 21%|██ | 12858/61904 [6:32:56<18:49:45, 1.38s/it] 21%|██ | 12859/61904 [6:32:58<18:44:02, 1.38s/it] 21%|██ | 12860/61904 [6:32:59<18:19:10, 1.34s/it] {'loss': 2.8089, 'learning_rate': 1.7948269155970437e-07, 'epoch': 3.32} 21%|██ | 12860/61904 [6:32:59<18:19:10, 1.34s/it] 21%|██ | 12861/61904 [6:33:00<18:23:07, 1.35s/it] 21%|██ | 12862/61904 [6:33:02<18:54:18, 1.39s/it] 21%|██ | 12863/61904 [6:33:03<19:17:36, 1.42s/it] 21%|██ | 12864/61904 [6:33:04<18:33:41, 1.36s/it] 21%|██ | 12865/61904 [6:33:06<19:30:36, 1.43s/it] 21%|██ | 12866/61904 [6:33:07<18:55:54, 1.39s/it] 21%|██ | 12867/61904 [6:33:09<18:50:58, 1.38s/it] 21%|██ | 12868/61904 [6:33:10<19:34:43, 1.44s/it] 21%|██ | 12869/61904 [6:33:12<18:58:20, 1.39s/it] 21%|██ | 12870/61904 [6:33:13<18:39:37, 1.37s/it] 21%|██ | 12871/61904 [6:33:14<19:10:10, 1.41s/it] 21%|██ | 12872/61904 [6:33:16<20:01:18, 1.47s/it] 21%|██ | 12873/61904 [6:33:17<19:32:27, 1.43s/it] 21%|██ | 12874/61904 [6:33:19<19:02:07, 1.40s/it] 21%|██ | 12875/61904 [6:33:20<18:50:11, 1.38s/it] 21%|██ | 12876/61904 [6:33:22<19:22:10, 1.42s/it] 21%|██ | 12877/61904 [6:33:23<19:03:31, 1.40s/it] 21%|██ | 12878/61904 [6:33:24<18:45:16, 1.38s/it] 21%|██ | 12879/61904 [6:33:26<18:58:13, 1.39s/it] 21%|██ | 12880/61904 [6:33:27<19:01:29, 1.40s/it] {'loss': 2.7858, 'learning_rate': 1.7945027875016206e-07, 'epoch': 
3.33} 21%|██ | 12880/61904 [6:33:27<19:01:29, 1.40s/it] 21%|██ | 12881/61904 [6:33:28<18:44:11, 1.38s/it] 21%|██ | 12882/61904 [6:33:30<19:12:09, 1.41s/it] 21%|██ | 12883/61904 [6:33:31<19:14:17, 1.41s/it] 21%|██ | 12884/61904 [6:33:33<18:48:39, 1.38s/it] 21%|██ | 12885/61904 [6:33:34<18:59:34, 1.39s/it] 21%|██ | 12886/61904 [6:33:35<19:07:44, 1.40s/it] 21%|██ | 12887/61904 [6:33:37<19:23:03, 1.42s/it] 21%|██ | 12888/61904 [6:33:38<18:45:40, 1.38s/it] 21%|██ | 12889/61904 [6:33:39<18:11:22, 1.34s/it] 21%|██ | 12890/61904 [6:33:41<18:19:46, 1.35s/it] 21%|██ | 12891/61904 [6:33:42<18:34:58, 1.36s/it] 21%|██ | 12892/61904 [6:33:44<18:47:14, 1.38s/it] 21%|██ | 12893/61904 [6:33:45<18:30:39, 1.36s/it] 21%|██ | 12894/61904 [6:33:46<18:23:04, 1.35s/it] 21%|██ | 12895/61904 [6:33:48<18:10:08, 1.33s/it] 21%|██ | 12896/61904 [6:33:49<18:02:00, 1.32s/it] 21%|██ | 12897/61904 [6:33:50<19:04:20, 1.40s/it] 21%|██ | 12898/61904 [6:33:52<19:03:49, 1.40s/it] 21%|██ | 12899/61904 [6:33:53<19:13:44, 1.41s/it] 21%|██ | 12900/61904 [6:33:55<19:31:31, 1.43s/it] {'loss': 2.7449, 'learning_rate': 1.7941786594061972e-07, 'epoch': 3.33} 21%|██ | 12900/61904 [6:33:55<19:31:31, 1.43s/it] 21%|██ | 12901/61904 [6:33:56<19:15:51, 1.42s/it] 21%|██ | 12902/61904 [6:33:57<18:42:44, 1.37s/it] 21%|██ | 12903/61904 [6:33:59<18:29:32, 1.36s/it] 21%|██ | 12904/61904 [6:34:00<18:52:50, 1.39s/it] 21%|██ | 12905/61904 [6:34:02<18:37:47, 1.37s/it] 21%|██ | 12906/61904 [6:34:03<18:06:24, 1.33s/it] 21%|██ | 12907/61904 [6:34:04<18:00:05, 1.32s/it] 21%|██ | 12908/61904 [6:34:05<17:58:41, 1.32s/it] 21%|██ | 12909/61904 [6:34:07<18:02:56, 1.33s/it] 21%|██ | 12910/61904 [6:34:08<18:13:09, 1.34s/it] 21%|██ | 12911/61904 [6:34:09<17:52:10, 1.31s/it] 21%|██ | 12912/61904 [6:34:11<17:44:31, 1.30s/it] 21%|██ | 12913/61904 [6:34:12<18:50:51, 1.38s/it] 21%|██ | 12914/61904 [6:34:14<19:26:12, 1.43s/it] 21%|██ | 12915/61904 [6:34:15<19:24:19, 1.43s/it] 21%|██ | 12916/61904 [6:34:17<19:21:47, 1.42s/it] 21%|██ | 12917/61904 
[6:34:18<19:05:36, 1.40s/it] 21%|██ | 12918/61904 [6:34:19<18:59:39, 1.40s/it] 21%|██ | 12919/61904 [6:34:21<19:25:07, 1.43s/it] 21%|██ | 12920/61904 [6:34:22<19:05:43, 1.40s/it] {'loss': 2.7406, 'learning_rate': 1.7938545313107739e-07, 'epoch': 3.34} 21%|██ | 12920/61904 [6:34:22<19:05:43, 1.40s/it] 21%|██ | 12921/61904 [6:34:24<19:15:30, 1.42s/it] 21%|██ | 12922/61904 [6:34:25<19:01:47, 1.40s/it] 21%|██ | 12923/61904 [6:34:26<18:52:46, 1.39s/it] 21%|██ | 12924/61904 [6:34:28<18:46:35, 1.38s/it] 21%|██ | 12925/61904 [6:34:29<18:24:52, 1.35s/it] 21%|██ | 12926/61904 [6:34:30<18:07:06, 1.33s/it] 21%|██ | 12927/61904 [6:34:31<17:28:25, 1.28s/it] 21%|██ | 12928/61904 [6:34:33<17:29:42, 1.29s/it] 21%|██ | 12929/61904 [6:34:34<18:16:32, 1.34s/it] 21%|██ | 12930/61904 [6:34:36<18:42:29, 1.38s/it] 21%|██ | 12931/61904 [6:34:37<18:21:49, 1.35s/it] 21%|██ | 12932/61904 [6:34:38<18:18:27, 1.35s/it] 21%|██ | 12933/61904 [6:34:40<19:00:30, 1.40s/it] 21%|██ | 12934/61904 [6:34:41<18:30:04, 1.36s/it] 21%|██ | 12935/61904 [6:34:42<18:11:27, 1.34s/it] 21%|██ | 12936/61904 [6:34:44<18:23:17, 1.35s/it] 21%|██ | 12937/61904 [6:34:45<18:47:08, 1.38s/it] 21%|██ | 12938/61904 [6:34:47<18:42:42, 1.38s/it] 21%|██ | 12939/61904 [6:34:48<18:06:42, 1.33s/it] 21%|██ | 12940/61904 [6:34:49<18:23:46, 1.35s/it] {'loss': 2.8112, 'learning_rate': 1.7935304032153507e-07, 'epoch': 3.34} 21%|██ | 12940/61904 [6:34:49<18:23:46, 1.35s/it] 21%|██ | 12941/61904 [6:34:51<18:44:18, 1.38s/it] 21%|██ | 12942/61904 [6:34:52<18:17:20, 1.34s/it] 21%|██ | 12943/61904 [6:34:53<18:48:22, 1.38s/it] 21%|██ | 12944/61904 [6:34:55<18:21:43, 1.35s/it] 21%|██ | 12945/61904 [6:34:56<18:28:29, 1.36s/it] 21%|██ | 12946/61904 [6:34:57<18:40:30, 1.37s/it] 21%|██ | 12947/61904 [6:34:59<19:08:57, 1.41s/it] 21%|██ | 12948/61904 [6:35:00<18:43:24, 1.38s/it] 21%|██ | 12949/61904 [6:35:02<18:40:25, 1.37s/it] 21%|██ | 12950/61904 [6:35:03<18:38:22, 1.37s/it] 21%|██ | 12951/61904 [6:35:04<18:28:59, 1.36s/it] 21%|██ | 12952/61904 
[6:35:06<18:08:17, 1.33s/it] 21%|██ | 12953/61904 [6:35:07<18:00:05, 1.32s/it] 21%|██ | 12954/61904 [6:35:08<17:55:02, 1.32s/it] 21%|██ | 12955/61904 [6:35:10<18:11:25, 1.34s/it] 21%|██ | 12956/61904 [6:35:11<18:43:25, 1.38s/it] 21%|██ | 12957/61904 [6:35:13<19:28:33, 1.43s/it] 21%|██ | 12958/61904 [6:35:14<19:41:31, 1.45s/it] 21%|██ | 12959/61904 [6:35:15<19:25:48, 1.43s/it] 21%|██ | 12960/61904 [6:35:17<19:37:05, 1.44s/it] {'loss': 2.7738, 'learning_rate': 1.793206275119927e-07, 'epoch': 3.35} 21%|██ | 12960/61904 [6:35:17<19:37:05, 1.44s/it] 21%|██ | 12961/61904 [6:35:18<18:56:48, 1.39s/it] 21%|██ | 12962/61904 [6:35:19<18:37:30, 1.37s/it] 21%|██ | 12963/61904 [6:35:21<18:28:18, 1.36s/it] 21%|██ | 12964/61904 [6:35:22<18:05:59, 1.33s/it] 21%|██ | 12965/61904 [6:35:23<17:41:55, 1.30s/it] 21%|██ | 12966/61904 [6:35:25<18:15:40, 1.34s/it] 21%|██ | 12967/61904 [6:35:26<18:16:00, 1.34s/it] 21%|██ | 12968/61904 [6:35:27<18:18:26, 1.35s/it] 21%|██ | 12969/61904 [6:35:29<18:06:44, 1.33s/it] 21%|██ | 12970/61904 [6:35:30<18:12:33, 1.34s/it] 21%|██ | 12971/61904 [6:35:31<18:19:00, 1.35s/it] 21%|██ | 12972/61904 [6:35:33<18:28:52, 1.36s/it] 21%|██ | 12973/61904 [6:35:34<17:55:27, 1.32s/it] 21%|██ | 12974/61904 [6:35:36<18:21:16, 1.35s/it] 21%|██ | 12975/61904 [6:35:37<18:00:50, 1.33s/it] 21%|██ | 12976/61904 [6:35:38<17:55:16, 1.32s/it] 21%|██ | 12977/61904 [6:35:39<18:02:07, 1.33s/it] 21%|██ | 12978/61904 [6:35:41<17:36:35, 1.30s/it] 21%|██ | 12979/61904 [6:35:42<17:48:27, 1.31s/it] 21%|██ | 12980/61904 [6:35:43<18:02:29, 1.33s/it] {'loss': 2.7914, 'learning_rate': 1.792882147024504e-07, 'epoch': 3.35} 21%|██ | 12980/61904 [6:35:43<18:02:29, 1.33s/it] 21%|██ | 12981/61904 [6:35:45<18:16:40, 1.34s/it] 21%|██ | 12982/61904 [6:35:46<18:37:12, 1.37s/it] 21%|██ | 12983/61904 [6:35:47<18:15:40, 1.34s/it] 21%|██ | 12984/61904 [6:35:49<18:34:45, 1.37s/it] 21%|██ | 12985/61904 [6:35:50<18:18:44, 1.35s/it] 21%|██ | 12986/61904 [6:35:51<17:54:23, 1.32s/it] 21%|██ | 12987/61904 
[6:35:53<17:29:58, 1.29s/it] 21%|██ | 12988/61904 [6:35:54<18:52:54, 1.39s/it] 21%|██ | 12989/61904 [6:35:56<18:55:03, 1.39s/it] 21%|██ | 12990/61904 [6:35:57<19:00:38, 1.40s/it] 21%|██ | 12991/61904 [6:35:58<18:58:04, 1.40s/it] 21%|██ | 12992/61904 [6:36:00<18:43:56, 1.38s/it] 21%|██ | 12993/61904 [6:36:01<18:46:15, 1.38s/it] 21%|██ | 12994/61904 [6:36:03<18:32:52, 1.37s/it] 21%|██ | 12995/61904 [6:36:04<18:17:11, 1.35s/it] 21%|██ | 12996/61904 [6:36:05<18:15:29, 1.34s/it] 21%|██ | 12997/61904 [6:36:06<18:11:44, 1.34s/it] 21%|██ | 12998/61904 [6:36:08<18:05:15, 1.33s/it] 21%|██ | 12999/61904 [6:36:09<17:33:08, 1.29s/it] 21%|██ | 13000/61904 [6:36:10<17:36:11, 1.30s/it] {'loss': 2.8128, 'learning_rate': 1.7925580189290806e-07, 'epoch': 3.36} 21%|██ | 13000/61904 [6:36:10<17:36:11, 1.30s/it] 21%|██ | 13001/61904 [6:36:12<17:38:24, 1.30s/it] 21%|██ | 13002/61904 [6:36:13<17:43:05, 1.30s/it] 21%|██ | 13003/61904 [6:36:14<17:46:25, 1.31s/it] 21%|██ | 13004/61904 [6:36:15<17:30:26, 1.29s/it] 21%|██ | 13005/61904 [6:36:17<18:41:57, 1.38s/it] 21%|██ | 13006/61904 [6:36:19<19:00:11, 1.40s/it] 21%|██ | 13007/61904 [6:36:20<19:18:28, 1.42s/it] 21%|██ | 13008/61904 [6:36:21<19:21:20, 1.43s/it] 21%|██ | 13009/61904 [6:36:23<19:00:01, 1.40s/it] 21%|██ | 13010/61904 [6:36:24<19:20:33, 1.42s/it] 21%|██ | 13011/61904 [6:36:26<19:16:41, 1.42s/it] 21%|██ | 13012/61904 [6:36:27<19:17:22, 1.42s/it] 21%|██ | 13013/61904 [6:36:28<18:59:31, 1.40s/it] 21%|██ | 13014/61904 [6:36:30<18:37:08, 1.37s/it] 21%|██ | 13015/61904 [6:36:31<18:35:07, 1.37s/it] 21%|██ | 13016/61904 [6:36:32<18:26:48, 1.36s/it] 21%|██ | 13017/61904 [6:36:34<18:52:42, 1.39s/it] 21%|██ | 13018/61904 [6:36:35<18:55:19, 1.39s/it] 21%|██ | 13019/61904 [6:36:37<19:14:22, 1.42s/it] 21%|██ | 13020/61904 [6:36:38<18:36:59, 1.37s/it] {'loss': 2.8135, 'learning_rate': 1.7922338908336572e-07, 'epoch': 3.36} 21%|██ | 13020/61904 [6:36:38<18:36:59, 1.37s/it] 21%|██ | 13021/61904 [6:36:39<18:57:00, 1.40s/it] 21%|██ | 13022/61904 
[6:36:41<18:29:04, 1.36s/it] 21%|██ | 13023/61904 [6:36:42<18:18:44, 1.35s/it] 21%|██ | 13024/61904 [6:36:43<18:26:07, 1.36s/it] 21%|██ | 13025/61904 [6:36:45<18:53:16, 1.39s/it] 21%|██ | 13026/61904 [6:36:46<19:10:11, 1.41s/it] 21%|██ | 13027/61904 [6:36:48<19:01:56, 1.40s/it] 21%|██ | 13028/61904 [6:36:49<18:50:53, 1.39s/it] 21%|██ | 13029/61904 [6:36:51<18:46:50, 1.38s/it] 21%|██ | 13030/61904 [6:36:52<18:57:57, 1.40s/it] 21%|██ | 13031/61904 [6:36:53<18:52:00, 1.39s/it] 21%|██ | 13032/61904 [6:36:55<19:05:54, 1.41s/it] 21%|██ | 13033/61904 [6:36:56<18:57:10, 1.40s/it] 21%|██ | 13034/61904 [6:36:57<18:43:58, 1.38s/it] 21%|██ | 13035/61904 [6:36:59<19:09:32, 1.41s/it] 21%|██ | 13036/61904 [6:37:00<19:21:11, 1.43s/it] 21%|██ | 13037/61904 [6:37:02<19:19:12, 1.42s/it] 21%|██ | 13038/61904 [6:37:03<19:51:55, 1.46s/it] 21%|██ | 13039/61904 [6:37:05<19:16:27, 1.42s/it] 21%|██ | 13040/61904 [6:37:06<19:23:16, 1.43s/it] {'loss': 2.7928, 'learning_rate': 1.791909762738234e-07, 'epoch': 3.37} 21%|██ | 13040/61904 [6:37:06<19:23:16, 1.43s/it] 21%|██ | 13041/61904 [6:37:08<20:12:58, 1.49s/it] 21%|██ | 13042/61904 [6:37:09<19:41:26, 1.45s/it] 21%|██ | 13043/61904 [6:37:10<18:57:08, 1.40s/it] 21%|██ | 13044/61904 [6:37:12<18:42:02, 1.38s/it] 21%|██ | 13045/61904 [6:37:13<18:34:16, 1.37s/it] 21%|██ | 13046/61904 [6:37:15<19:11:05, 1.41s/it] 21%|██ | 13047/61904 [6:37:16<18:45:33, 1.38s/it] 21%|██ | 13048/61904 [6:37:17<19:04:59, 1.41s/it] 21%|██ | 13049/61904 [6:37:19<18:30:52, 1.36s/it] 21%|██ | 13050/61904 [6:37:20<18:07:53, 1.34s/it] 21%|██ | 13051/61904 [6:37:21<18:30:53, 1.36s/it] 21%|██ | 13052/61904 [6:37:23<18:04:26, 1.33s/it] 21%|██ | 13053/61904 [6:37:24<18:27:47, 1.36s/it] 21%|██ | 13054/61904 [6:37:25<18:47:19, 1.38s/it] 21%|██ | 13055/61904 [6:37:27<18:44:15, 1.38s/it] 21%|██ | 13056/61904 [6:37:28<18:26:14, 1.36s/it] 21%|██ | 13057/61904 [6:37:29<18:17:00, 1.35s/it] 21%|██ | 13058/61904 [6:37:31<18:34:27, 1.37s/it] 21%|██ | 13059/61904 [6:37:32<18:40:13, 
1.38s/it] 21%|██ | 13060/61904 [6:37:34<19:04:08, 1.41s/it] {'loss': 2.7973, 'learning_rate': 1.7915856346428107e-07, 'epoch': 3.38} 21%|██ | 13060/61904 [6:37:34<19:04:08, 1.41s/it] 21%|██ | 13061/61904 [6:37:35<18:47:26, 1.38s/it] 21%|██ | 13062/61904 [6:37:37<19:11:05, 1.41s/it] 21%|██ | 13063/61904 [6:37:38<18:57:09, 1.40s/it] 21%|██ | 13064/61904 [6:37:40<19:40:22, 1.45s/it] 21%|██ | 13065/61904 [6:37:41<19:12:53, 1.42s/it] 21%|██ | 13066/61904 [6:37:42<18:37:38, 1.37s/it] 21%|██ | 13067/61904 [6:37:44<18:41:02, 1.38s/it] 21%|██ | 13068/61904 [6:37:45<19:05:08, 1.41s/it] 21%|██ | 13069/61904 [6:37:46<19:25:52, 1.43s/it] 21%|██ | 13070/61904 [6:37:48<18:23:09, 1.36s/it] 21%|██ | 13071/61904 [6:37:49<18:44:14, 1.38s/it] 21%|██ | 13072/61904 [6:37:50<18:14:08, 1.34s/it] 21%|██ | 13073/61904 [6:37:52<18:34:24, 1.37s/it] 21%|██ | 13074/61904 [6:37:53<18:14:18, 1.34s/it] 21%|██ | 13075/61904 [6:37:54<17:53:01, 1.32s/it] 21%|██ | 13076/61904 [6:37:56<18:05:20, 1.33s/it] 21%|██ | 13077/61904 [6:37:57<18:21:08, 1.35s/it] 21%|██ | 13078/61904 [6:37:58<18:07:39, 1.34s/it] 21%|██ | 13079/61904 [6:38:00<18:32:41, 1.37s/it] 21%|██ | 13080/61904 [6:38:02<20:00:04, 1.47s/it] {'loss': 2.8088, 'learning_rate': 1.7912615065473873e-07, 'epoch': 3.38} 21%|██ | 13080/61904 [6:38:02<20:00:04, 1.47s/it] 21%|██ | 13081/61904 [6:38:03<20:01:35, 1.48s/it] 21%|██ | 13082/61904 [6:38:04<19:52:17, 1.47s/it] 21%|██ | 13083/61904 [6:38:06<19:29:59, 1.44s/it] 21%|██ | 13084/61904 [6:38:07<19:43:08, 1.45s/it] 21%|██ | 13085/61904 [6:38:09<19:29:57, 1.44s/it] 21%|██ | 13086/61904 [6:38:10<19:30:37, 1.44s/it] 21%|██ | 13087/61904 [6:38:11<18:52:27, 1.39s/it] 21%|██ | 13088/61904 [6:38:13<19:00:27, 1.40s/it] 21%|██ | 13089/61904 [6:38:14<18:07:23, 1.34s/it] 21%|██ | 13090/61904 [6:38:15<18:13:49, 1.34s/it] 21%|██ | 13091/61904 [6:38:17<18:13:03, 1.34s/it] 21%|██ | 13092/61904 [6:38:18<18:50:28, 1.39s/it] 21%|██ | 13093/61904 [6:38:20<18:51:12, 1.39s/it] 21%|██ | 13094/61904 [6:38:21<19:28:46, 
1.44s/it] 21%|██ | 13095/61904 [6:38:23<19:04:58, 1.41s/it] 21%|██ | 13096/61904 [6:38:24<18:37:55, 1.37s/it] 21%|██ | 13097/61904 [6:38:25<19:07:57, 1.41s/it] 21%|██ | 13098/61904 [6:38:27<18:47:29, 1.39s/it] 21%|██ | 13099/61904 [6:38:28<19:27:16, 1.44s/it] 21%|██ | 13100/61904 [6:38:30<19:10:08, 1.41s/it] {'loss': 2.7603, 'learning_rate': 1.7909373784519642e-07, 'epoch': 3.39} 21%|██ | 13100/61904 [6:38:30<19:10:08, 1.41s/it] 21%|██ | 13101/61904 [6:38:31<19:17:46, 1.42s/it] 21%|██ | 13102/61904 [6:38:33<19:36:53, 1.45s/it] 21%|██ | 13103/61904 [6:38:34<19:12:11, 1.42s/it] 21%|██ | 13104/61904 [6:38:35<19:22:06, 1.43s/it] 21%|██ | 13105/61904 [6:38:37<19:32:12, 1.44s/it] 21%|██ | 13106/61904 [6:38:38<19:45:01, 1.46s/it] 21%|██ | 13107/61904 [6:38:40<19:24:08, 1.43s/it] 21%|██ | 13108/61904 [6:38:41<19:03:13, 1.41s/it] 21%|██ | 13109/61904 [6:38:42<18:17:51, 1.35s/it] 21%|██ | 13110/61904 [6:38:44<18:11:07, 1.34s/it] 21%|██ | 13111/61904 [6:38:45<19:10:51, 1.42s/it] 21%|██ | 13112/61904 [6:38:47<18:58:17, 1.40s/it] 21%|██ | 13113/61904 [6:38:48<18:21:17, 1.35s/it] 21%|██ | 13114/61904 [6:38:49<18:35:30, 1.37s/it] 21%|██ | 13115/61904 [6:38:51<19:36:24, 1.45s/it] 21%|██ | 13116/61904 [6:38:52<19:11:30, 1.42s/it] 21%|██ | 13117/61904 [6:38:53<18:44:58, 1.38s/it] 21%|██ | 13118/61904 [6:38:55<18:37:04, 1.37s/it] 21%|██ | 13119/61904 [6:38:56<19:20:45, 1.43s/it] 21%|██ | 13120/61904 [6:38:58<19:09:08, 1.41s/it] {'loss': 2.7204, 'learning_rate': 1.7906132503565406e-07, 'epoch': 3.39} 21%|██ | 13120/61904 [6:38:58<19:09:08, 1.41s/it] 21%|██ | 13121/61904 [6:38:59<18:33:59, 1.37s/it] 21%|██ | 13122/61904 [6:39:00<18:20:09, 1.35s/it] 21%|██ | 13123/61904 [6:39:02<19:45:04, 1.46s/it] 21%|██ | 13124/61904 [6:39:03<19:05:27, 1.41s/it] 21%|██ | 13125/61904 [6:39:05<18:56:33, 1.40s/it] 21%|██ | 13126/61904 [6:39:06<19:35:51, 1.45s/it] 21%|██ | 13127/61904 [6:39:08<19:18:00, 1.42s/it] 21%|██ | 13128/61904 [6:39:09<19:12:01, 1.42s/it] 21%|██ | 13129/61904 [6:39:10<19:13:18, 
1.42s/it] 21%|██ | 13130/61904 [6:39:12<18:55:35, 1.40s/it] 21%|██ | 13131/61904 [6:39:13<18:39:23, 1.38s/it] 21%|██ | 13132/61904 [6:39:15<18:45:53, 1.39s/it] 21%|██ | 13133/61904 [6:39:16<18:41:50, 1.38s/it] 21%|██ | 13134/61904 [6:39:17<19:36:34, 1.45s/it] 21%|██ | 13135/61904 [6:39:19<19:08:31, 1.41s/it] 21%|██ | 13136/61904 [6:39:20<19:03:57, 1.41s/it] 21%|██ | 13137/61904 [6:39:22<18:44:09, 1.38s/it] 21%|██ | 13138/61904 [6:39:23<19:07:49, 1.41s/it] 21%|██ | 13139/61904 [6:39:24<19:12:23, 1.42s/it] 21%|██ | 13140/61904 [6:39:26<19:26:49, 1.44s/it] {'loss': 2.7494, 'learning_rate': 1.7902891222611175e-07, 'epoch': 3.4} 21%|██ | 13140/61904 [6:39:26<19:26:49, 1.44s/it] 21%|██ | 13141/61904 [6:39:27<18:44:25, 1.38s/it] 21%|██ | 13142/61904 [6:39:29<18:54:32, 1.40s/it] 21%|██ | 13143/61904 [6:39:30<18:48:05, 1.39s/it] 21%|██ | 13144/61904 [6:39:31<19:03:36, 1.41s/it] 21%|██ | 13145/61904 [6:39:33<19:19:09, 1.43s/it] 21%|██ | 13146/61904 [6:39:34<18:58:27, 1.40s/it] 21%|██ | 13147/61904 [6:39:36<18:37:48, 1.38s/it] 21%|██ | 13148/61904 [6:39:37<18:18:45, 1.35s/it] 21%|██ | 13149/61904 [6:39:38<18:41:57, 1.38s/it] 21%|██ | 13150/61904 [6:39:40<18:26:54, 1.36s/it] 21%|██ | 13151/61904 [6:39:41<19:34:52, 1.45s/it] 21%|██ | 13152/61904 [6:39:43<20:17:09, 1.50s/it] 21%|██ | 13153/61904 [6:39:44<20:04:28, 1.48s/it] 21%|██ | 13154/61904 [6:39:46<19:42:44, 1.46s/it] 21%|██▏ | 13155/61904 [6:39:47<19:35:08, 1.45s/it] 21%|██▏ | 13156/61904 [6:39:49<19:20:01, 1.43s/it] 21%|██▏ | 13157/61904 [6:39:50<18:55:03, 1.40s/it] 21%|██▏ | 13158/61904 [6:39:51<18:59:25, 1.40s/it] 21%|██▏ | 13159/61904 [6:39:53<18:38:25, 1.38s/it] 21%|██▏ | 13160/61904 [6:39:54<19:02:50, 1.41s/it] {'loss': 2.7304, 'learning_rate': 1.7899649941656943e-07, 'epoch': 3.4} 21%|██▏ | 13160/61904 [6:39:54<19:02:50, 1.41s/it] 21%|██▏ | 13161/61904 [6:39:55<18:41:17, 1.38s/it] 21%|██▏ | 13162/61904 [6:39:57<18:25:57, 1.36s/it] 21%|██▏ | 13163/61904 [6:39:58<18:54:05, 1.40s/it] 21%|██▏ | 13164/61904 
[6:40:00<18:32:08, 1.37s/it] 21%|██▏ | 13165/61904 [6:40:01<18:15:46, 1.35s/it] 21%|██▏ | 13166/61904 [6:40:02<17:56:08, 1.32s/it] 21%|██▏ | 13167/61904 [6:40:04<18:40:19, 1.38s/it] 21%|██▏ | 13168/61904 [6:40:05<18:13:00, 1.35s/it] 21%|██▏ | 13169/61904 [6:40:06<18:33:16, 1.37s/it] 21%|██▏ | 13170/61904 [6:40:08<18:27:25, 1.36s/it] 21%|██▏ | 13171/61904 [6:40:09<18:10:53, 1.34s/it] 21%|██▏ | 13172/61904 [6:40:10<18:10:57, 1.34s/it] 21%|██▏ | 13173/61904 [6:40:12<17:57:48, 1.33s/it] 21%|██▏ | 13174/61904 [6:40:13<18:16:09, 1.35s/it] 21%|██▏ | 13175/61904 [6:40:14<17:37:12, 1.30s/it] 21%|██▏ | 13176/61904 [6:40:16<18:13:21, 1.35s/it] 21%|██▏ | 13177/61904 [6:40:17<18:03:17, 1.33s/it] 21%|██▏ | 13178/61904 [6:40:18<18:07:09, 1.34s/it] 21%|██▏ | 13179/61904 [6:40:20<17:54:49, 1.32s/it] 21%|██▏ | 13180/61904 [6:40:21<17:58:18, 1.33s/it] {'loss': 2.7921, 'learning_rate': 1.7896408660702707e-07, 'epoch': 3.41} 21%|██▏ | 13180/61904 [6:40:21<17:58:18, 1.33s/it] 21%|██▏ | 13181/61904 [6:40:22<17:54:58, 1.32s/it] 21%|██▏ | 13182/61904 [6:40:24<18:05:19, 1.34s/it] 21%|██▏ | 13183/61904 [6:40:25<18:13:44, 1.35s/it] 21%|██▏ | 13184/61904 [6:40:26<18:35:54, 1.37s/it] 21%|██▏ | 13185/61904 [6:40:28<19:01:54, 1.41s/it] 21%|██▏ | 13186/61904 [6:40:29<18:57:54, 1.40s/it] 21%|██▏ | 13187/61904 [6:40:31<18:36:07, 1.37s/it] 21%|██▏ | 13188/61904 [6:40:32<18:10:12, 1.34s/it] 21%|██▏ | 13189/61904 [6:40:33<18:01:38, 1.33s/it] 21%|██▏ | 13190/61904 [6:40:34<17:40:01, 1.31s/it] 21%|██▏ | 13191/61904 [6:40:36<18:15:47, 1.35s/it] 21%|██▏ | 13192/61904 [6:40:37<18:49:26, 1.39s/it] 21%|██▏ | 13193/61904 [6:40:39<18:30:25, 1.37s/it] 21%|██▏ | 13194/61904 [6:40:40<17:57:32, 1.33s/it] 21%|██▏ | 13195/61904 [6:40:41<18:33:10, 1.37s/it] 21%|██▏ | 13196/61904 [6:40:43<18:33:11, 1.37s/it] 21%|██▏ | 13197/61904 [6:40:44<18:30:57, 1.37s/it] 21%|██▏ | 13198/61904 [6:40:46<19:08:27, 1.41s/it] 21%|██▏ | 13199/61904 [6:40:47<19:02:52, 1.41s/it] 21%|██▏ | 13200/61904 [6:40:48<18:56:22, 1.40s/it] {'loss': 
2.7574, 'learning_rate': 1.7893167379748476e-07, 'epoch': 3.41} 21%|██▏ | 13200/61904 [6:40:48<18:56:22, 1.40s/it] 21%|██▏ | 13201/61904 [6:40:50<18:33:17, 1.37s/it] 21%|██▏ | 13202/61904 [6:40:51<18:33:59, 1.37s/it] 21%|██▏ | 13203/61904 [6:40:52<18:52:15, 1.39s/it] 21%|██▏ | 13204/61904 [6:40:54<18:31:13, 1.37s/it] 21%|██▏ | 13205/61904 [6:40:55<18:12:09, 1.35s/it] 21%|██▏ | 13206/61904 [6:40:56<18:02:50, 1.33s/it] 21%|██▏ | 13207/61904 [6:40:58<19:09:03, 1.42s/it] 21%|██▏ | 13208/61904 [6:40:59<18:54:59, 1.40s/it] 21%|██▏ | 13209/61904 [6:41:01<18:29:54, 1.37s/it] 21%|██▏ | 13210/61904 [6:41:02<18:34:25, 1.37s/it] 21%|██▏ | 13211/61904 [6:41:03<18:04:29, 1.34s/it] 21%|██▏ | 13212/61904 [6:41:05<17:42:39, 1.31s/it] 21%|██▏ | 13213/61904 [6:41:06<17:35:58, 1.30s/it] 21%|██▏ | 13214/61904 [6:41:07<18:00:31, 1.33s/it] 21%|██▏ | 13215/61904 [6:41:09<18:14:40, 1.35s/it] 21%|██▏ | 13216/61904 [6:41:10<18:34:24, 1.37s/it] 21%|██▏ | 13217/61904 [6:41:11<18:51:05, 1.39s/it] 21%|██▏ | 13218/61904 [6:41:13<18:36:16, 1.38s/it] 21%|██▏ | 13219/61904 [6:41:14<18:38:45, 1.38s/it] 21%|██▏ | 13220/61904 [6:41:15<18:10:12, 1.34s/it] {'loss': 2.7649, 'learning_rate': 1.7889926098794242e-07, 'epoch': 3.42} 21%|██▏ | 13220/61904 [6:41:15<18:10:12, 1.34s/it] 21%|██▏ | 13221/61904 [6:41:17<18:21:24, 1.36s/it] 21%|██▏ | 13222/61904 [6:41:18<18:35:32, 1.37s/it] 21%|██▏ | 13223/61904 [6:41:20<18:26:04, 1.36s/it] 21%|██▏ | 13224/61904 [6:41:21<18:21:25, 1.36s/it] 21%|██▏ | 13225/61904 [6:41:23<20:13:12, 1.50s/it] 21%|██▏ | 13226/61904 [6:41:24<19:27:41, 1.44s/it] 21%|██▏ | 13227/61904 [6:41:25<19:02:10, 1.41s/it] 21%|██▏ | 13228/61904 [6:41:27<18:34:10, 1.37s/it] 21%|██▏ | 13229/61904 [6:41:28<18:24:44, 1.36s/it] 21%|██▏ | 13230/61904 [6:41:30<18:50:50, 1.39s/it] 21%|██▏ | 13231/61904 [6:41:31<19:13:34, 1.42s/it] 21%|██▏ | 13232/61904 [6:41:32<19:10:25, 1.42s/it] 21%|██▏ | 13233/61904 [6:41:34<19:26:24, 1.44s/it] 21%|██▏ | 13234/61904 [6:41:35<18:48:21, 1.39s/it] 21%|██▏ | 13235/61904 
[6:41:37<18:40:15, 1.38s/it] 21%|██▏ | 13236/61904 [6:41:38<18:36:44, 1.38s/it] 21%|██▏ | 13237/61904 [6:41:39<18:52:27, 1.40s/it] 21%|██▏ | 13238/61904 [6:41:41<18:36:36, 1.38s/it] 21%|██▏ | 13239/61904 [6:41:42<17:54:49, 1.33s/it] 21%|██▏ | 13240/61904 [6:41:43<17:52:56, 1.32s/it] {'loss': 2.7618, 'learning_rate': 1.7886684817840008e-07, 'epoch': 3.42} 21%|██▏ | 13240/61904 [6:41:43<17:52:56, 1.32s/it] 21%|██▏ | 13241/61904 [6:41:45<18:29:40, 1.37s/it] 21%|██▏ | 13242/61904 [6:41:46<18:28:19, 1.37s/it] 21%|██▏ | 13243/61904 [6:41:47<18:39:34, 1.38s/it] 21%|██▏ | 13244/61904 [6:41:49<18:34:58, 1.37s/it] 21%|██▏ | 13245/61904 [6:41:50<19:12:33, 1.42s/it] 21%|██▏ | 13246/61904 [6:41:52<18:26:02, 1.36s/it] 21%|██▏ | 13247/61904 [6:41:53<18:39:25, 1.38s/it] 21%|██▏ | 13248/61904 [6:41:54<18:16:15, 1.35s/it] 21%|██▏ | 13249/61904 [6:41:56<19:06:28, 1.41s/it] 21%|██▏ | 13250/61904 [6:41:57<19:09:05, 1.42s/it] 21%|██▏ | 13251/61904 [6:41:59<19:22:22, 1.43s/it] 21%|██▏ | 13252/61904 [6:42:00<18:49:36, 1.39s/it] 21%|██▏ | 13253/61904 [6:42:02<19:17:02, 1.43s/it] 21%|██▏ | 13254/61904 [6:42:03<19:49:02, 1.47s/it] 21%|██▏ | 13255/61904 [6:42:04<19:16:49, 1.43s/it] 21%|██▏ | 13256/61904 [6:42:06<18:50:53, 1.39s/it] 21%|██▏ | 13257/61904 [6:42:07<18:25:27, 1.36s/it] 21%|██▏ | 13258/61904 [6:42:08<17:58:47, 1.33s/it] 21%|██▏ | 13259/61904 [6:42:10<17:54:02, 1.32s/it] 21%|██▏ | 13260/61904 [6:42:11<18:10:11, 1.34s/it] {'loss': 2.7628, 'learning_rate': 1.7883443536885777e-07, 'epoch': 3.43} 21%|██▏ | 13260/61904 [6:42:11<18:10:11, 1.34s/it] 21%|██▏ | 13261/61904 [6:42:13<19:05:15, 1.41s/it] 21%|██▏ | 13262/61904 [6:42:14<19:04:13, 1.41s/it] 21%|██▏ | 13263/61904 [6:42:16<19:49:56, 1.47s/it] 21%|██▏ | 13264/61904 [6:42:17<18:53:31, 1.40s/it] 21%|██▏ | 13265/61904 [6:42:18<18:24:39, 1.36s/it] 21%|██▏ | 13266/61904 [6:42:20<18:54:06, 1.40s/it] 21%|██▏ | 13267/61904 [6:42:21<18:35:35, 1.38s/it] 21%|██▏ | 13268/61904 [6:42:22<18:31:11, 1.37s/it] 21%|██▏ | 13269/61904 
[6:42:24<18:25:36, 1.36s/it] 21%|██▏ | 13270/61904 [6:42:25<18:32:47, 1.37s/it] 21%|██▏ | 13271/61904 [6:42:26<18:59:43, 1.41s/it] 21%|██▏ | 13272/61904 [6:42:28<18:47:07, 1.39s/it] 21%|██▏ | 13273/61904 [6:42:29<19:01:56, 1.41s/it] 21%|██▏ | 13274/61904 [6:42:31<18:23:08, 1.36s/it] 21%|██▏ | 13275/61904 [6:42:32<17:56:20, 1.33s/it] 21%|██▏ | 13276/61904 [6:42:33<18:06:16, 1.34s/it] 21%|██▏ | 13277/61904 [6:42:34<17:52:50, 1.32s/it] 21%|██▏ | 13278/61904 [6:42:36<18:46:57, 1.39s/it] 21%|██▏ | 13279/61904 [6:42:37<18:39:05, 1.38s/it] 21%|██▏ | 13280/61904 [6:42:39<18:43:40, 1.39s/it] {'loss': 2.7988, 'learning_rate': 1.7880202255931543e-07, 'epoch': 3.43} 21%|██▏ | 13280/61904 [6:42:39<18:43:40, 1.39s/it] 21%|██▏ | 13281/61904 [6:42:40<19:07:39, 1.42s/it] 21%|██▏ | 13282/61904 [6:42:42<19:07:14, 1.42s/it] 21%|██▏ | 13283/61904 [6:42:43<19:13:10, 1.42s/it] 21%|██▏ | 13284/61904 [6:42:45<19:17:46, 1.43s/it] 21%|██▏ | 13285/61904 [6:42:46<19:00:10, 1.41s/it] 21%|██▏ | 13286/61904 [6:42:47<18:39:17, 1.38s/it] 21%|██▏ | 13287/61904 [6:42:49<18:23:46, 1.36s/it] 21%|██▏ | 13288/61904 [6:42:50<18:00:29, 1.33s/it] 21%|██▏ | 13289/61904 [6:42:51<17:12:48, 1.27s/it] 21%|██▏ | 13290/61904 [6:42:52<17:07:26, 1.27s/it] 21%|██▏ | 13291/61904 [6:42:53<17:08:27, 1.27s/it] 21%|██▏ | 13292/61904 [6:42:55<18:09:50, 1.35s/it] 21%|██▏ | 13293/61904 [6:42:56<18:06:29, 1.34s/it] 21%|██▏ | 13294/61904 [6:42:58<18:45:27, 1.39s/it] 21%|██▏ | 13295/61904 [6:42:59<18:17:26, 1.35s/it] 21%|██▏ | 13296/61904 [6:43:01<18:47:16, 1.39s/it] 21%|██▏ | 13297/61904 [6:43:02<19:23:35, 1.44s/it] 21%|██▏ | 13298/61904 [6:43:03<19:12:12, 1.42s/it] 21%|██▏ | 13299/61904 [6:43:05<18:50:35, 1.40s/it] 21%|██▏ | 13300/61904 [6:43:06<18:54:27, 1.40s/it] {'loss': 2.7147, 'learning_rate': 1.787696097497731e-07, 'epoch': 3.44} 21%|██▏ | 13300/61904 [6:43:06<18:54:27, 1.40s/it] 21%|██▏ | 13301/61904 [6:43:08<18:48:13, 1.39s/it] 21%|██▏ | 13302/61904 [6:43:09<18:30:26, 1.37s/it] 21%|██▏ | 13303/61904 [6:43:10<18:14:45, 
1.35s/it] 21%|██▏ | 13304/61904 [6:43:11<17:53:00, 1.32s/it] 21%|██▏ | 13305/61904 [6:43:13<18:34:46, 1.38s/it] 21%|██▏ | 13306/61904 [6:43:14<18:34:48, 1.38s/it] 21%|██▏ | 13307/61904 [6:43:16<18:25:56, 1.37s/it] 21%|██▏ | 13308/61904 [6:43:17<18:50:01, 1.40s/it] 21%|██▏ | 13309/61904 [6:43:19<19:07:52, 1.42s/it] 22%|██▏ | 13310/61904 [6:43:20<19:04:34, 1.41s/it] 22%|██▏ | 13311/61904 [6:43:21<18:20:25, 1.36s/it] 22%|██▏ | 13312/61904 [6:43:22<17:49:11, 1.32s/it] 22%|██▏ | 13313/61904 [6:43:24<18:01:54, 1.34s/it] 22%|██▏ | 13314/61904 [6:43:25<18:10:35, 1.35s/it] 22%|██▏ | 13315/61904 [6:43:27<18:08:09, 1.34s/it] 22%|██▏ | 13316/61904 [6:43:28<18:22:43, 1.36s/it] 22%|██▏ | 13317/61904 [6:43:29<18:05:40, 1.34s/it] 22%|██▏ | 13318/61904 [6:43:31<18:17:08, 1.35s/it] 22%|██▏ | 13319/61904 [6:43:32<19:05:27, 1.41s/it] 22%|██▏ | 13320/61904 [6:43:34<18:38:54, 1.38s/it] {'loss': 2.8114, 'learning_rate': 1.7873719694023078e-07, 'epoch': 3.44} 22%|██▏ | 13320/61904 [6:43:34<18:38:54, 1.38s/it] 22%|██▏ | 13321/61904 [6:43:35<19:07:03, 1.42s/it] 22%|██▏ | 13322/61904 [6:43:36<18:33:52, 1.38s/it] 22%|██▏ | 13323/61904 [6:43:38<18:41:39, 1.39s/it] 22%|██▏ | 13324/61904 [6:43:39<18:14:30, 1.35s/it] 22%|██▏ | 13325/61904 [6:43:40<18:24:01, 1.36s/it] 22%|██▏ | 13326/61904 [6:43:42<18:00:40, 1.33s/it] 22%|██▏ | 13327/61904 [6:43:43<18:02:13, 1.34s/it] 22%|██▏ | 13328/61904 [6:43:44<17:45:26, 1.32s/it] 22%|██▏ | 13329/61904 [6:43:46<17:46:15, 1.32s/it] 22%|██▏ | 13330/61904 [6:43:47<17:44:24, 1.31s/it] 22%|██▏ | 13331/61904 [6:43:48<17:33:00, 1.30s/it] 22%|██▏ | 13332/61904 [6:43:49<17:33:11, 1.30s/it] 22%|██▏ | 13333/61904 [6:43:51<18:01:26, 1.34s/it] 22%|██▏ | 13334/61904 [6:43:52<17:49:15, 1.32s/it] 22%|██▏ | 13335/61904 [6:43:53<17:35:51, 1.30s/it] 22%|██▏ | 13336/61904 [6:43:55<17:14:14, 1.28s/it] 22%|██▏ | 13337/61904 [6:43:56<17:34:56, 1.30s/it] 22%|██▏ | 13338/61904 [6:43:57<17:53:49, 1.33s/it] 22%|██▏ | 13339/61904 [6:43:59<18:33:53, 1.38s/it] 22%|██▏ | 13340/61904 
[6:44:00<18:08:56, 1.35s/it] {'loss': 2.7474, 'learning_rate': 1.7870478413068842e-07, 'epoch': 3.45} 22%|██▏ | 13340/61904 [6:44:00<18:08:56, 1.35s/it] 22%|██▏ | 13341/61904 [6:44:01<17:42:40, 1.31s/it] 22%|██▏ | 13342/61904 [6:44:03<18:01:48, 1.34s/it] 22%|██▏ | 13343/61904 [6:44:04<18:45:14, 1.39s/it] 22%|██▏ | 13344/61904 [6:44:06<18:58:58, 1.41s/it] 22%|██▏ | 13345/61904 [6:44:07<19:09:49, 1.42s/it] 22%|██▏ | 13346/61904 [6:44:09<18:55:57, 1.40s/it] 22%|██▏ | 13347/61904 [6:44:10<18:52:53, 1.40s/it] 22%|██▏ | 13348/61904 [6:44:11<19:00:23, 1.41s/it] 22%|██▏ | 13349/61904 [6:44:13<18:24:38, 1.37s/it] 22%|██▏ | 13350/61904 [6:44:14<18:05:54, 1.34s/it] 22%|██▏ | 13351/61904 [6:44:15<18:03:39, 1.34s/it] 22%|██▏ | 13352/61904 [6:44:17<18:34:08, 1.38s/it] 22%|██▏ | 13353/61904 [6:44:18<19:01:58, 1.41s/it] 22%|██▏ | 13354/61904 [6:44:20<18:42:17, 1.39s/it] 22%|██▏ | 13355/61904 [6:44:21<18:32:35, 1.38s/it] 22%|██▏ | 13356/61904 [6:44:22<18:19:48, 1.36s/it] 22%|██▏ | 13357/61904 [6:44:24<18:25:13, 1.37s/it] 22%|██▏ | 13358/61904 [6:44:25<18:47:38, 1.39s/it] 22%|██▏ | 13359/61904 [6:44:26<18:08:18, 1.35s/it] 22%|██▏ | 13360/61904 [6:44:28<18:34:03, 1.38s/it] {'loss': 2.7718, 'learning_rate': 1.786723713211461e-07, 'epoch': 3.45} 22%|██▏ | 13360/61904 [6:44:28<18:34:03, 1.38s/it] 22%|██▏ | 13361/61904 [6:44:29<18:46:11, 1.39s/it] 22%|██▏ | 13362/61904 [6:44:31<18:39:33, 1.38s/it] 22%|██▏ | 13363/61904 [6:44:32<18:05:04, 1.34s/it] 22%|██▏ | 13364/61904 [6:44:33<18:16:11, 1.36s/it] 22%|██▏ | 13365/61904 [6:44:35<18:19:15, 1.36s/it] 22%|██▏ | 13366/61904 [6:44:36<18:31:01, 1.37s/it] 22%|██▏ | 13367/61904 [6:44:37<18:54:12, 1.40s/it] 22%|██▏ | 13368/61904 [6:44:39<18:52:17, 1.40s/it] 22%|██▏ | 13369/61904 [6:44:40<18:23:05, 1.36s/it] 22%|██▏ | 13370/61904 [6:44:41<18:27:23, 1.37s/it] 22%|██▏ | 13371/61904 [6:44:43<18:11:14, 1.35s/it] 22%|██▏ | 13372/61904 [6:44:44<18:15:11, 1.35s/it] 22%|██▏ | 13373/61904 [6:44:45<18:15:26, 1.35s/it] 22%|██▏ | 13374/61904 [6:44:47<17:57:00, 
1.33s/it] 22%|██▏ | 13375/61904 [6:44:48<18:02:53, 1.34s/it] 22%|██▏ | 13376/61904 [6:44:49<17:48:58, 1.32s/it] 22%|██▏ | 13377/61904 [6:44:51<17:39:51, 1.31s/it] 22%|██▏ | 13378/61904 [6:44:52<17:35:35, 1.31s/it] 22%|██▏ | 13379/61904 [6:44:53<17:48:15, 1.32s/it] 22%|██▏ | 13380/61904 [6:44:55<17:49:12, 1.32s/it] {'loss': 2.821, 'learning_rate': 1.786399585116038e-07, 'epoch': 3.46} 22%|██▏ | 13380/61904 [6:44:55<17:49:12, 1.32s/it] 22%|██▏ | 13381/61904 [6:44:56<18:02:37, 1.34s/it] 22%|██▏ | 13382/61904 [6:44:57<17:55:00, 1.33s/it] 22%|██▏ | 13383/61904 [6:44:59<17:38:16, 1.31s/it] 22%|██▏ | 13384/61904 [6:45:00<17:45:15, 1.32s/it] 22%|██▏ | 13385/61904 [6:45:01<17:39:48, 1.31s/it] 22%|██▏ | 13386/61904 [6:45:03<17:54:51, 1.33s/it] 22%|██▏ | 13387/61904 [6:45:04<17:23:28, 1.29s/it] 22%|██▏ | 13388/61904 [6:45:05<17:27:39, 1.30s/it] 22%|██▏ | 13389/61904 [6:45:06<17:26:36, 1.29s/it] 22%|██▏ | 13390/61904 [6:45:08<18:06:26, 1.34s/it] 22%|██▏ | 13391/61904 [6:45:09<18:17:26, 1.36s/it] 22%|██▏ | 13392/61904 [6:45:10<17:48:55, 1.32s/it] 22%|██▏ | 13393/61904 [6:45:12<17:26:51, 1.29s/it] 22%|██▏ | 13394/61904 [6:45:13<17:48:40, 1.32s/it] 22%|██▏ | 13395/61904 [6:45:14<17:43:01, 1.31s/it] 22%|██▏ | 13396/61904 [6:45:16<17:36:47, 1.31s/it] 22%|██▏ | 13397/61904 [6:45:17<17:45:56, 1.32s/it] 22%|██▏ | 13398/61904 [6:45:18<18:18:11, 1.36s/it] 22%|██▏ | 13399/61904 [6:45:20<19:07:21, 1.42s/it] 22%|██▏ | 13400/61904 [6:45:21<19:00:25, 1.41s/it] {'loss': 2.706, 'learning_rate': 1.7860754570206143e-07, 'epoch': 3.46} 22%|██▏ | 13400/61904 [6:45:21<19:00:25, 1.41s/it] 22%|██▏ | 13401/61904 [6:45:23<19:30:10, 1.45s/it] 22%|██▏ | 13402/61904 [6:45:24<19:28:36, 1.45s/it] 22%|██▏ | 13403/61904 [6:45:26<18:54:06, 1.40s/it] 22%|██▏ | 13404/61904 [6:45:27<18:58:25, 1.41s/it] 22%|██▏ | 13405/61904 [6:45:28<18:42:04, 1.39s/it] 22%|██▏ | 13406/61904 [6:45:30<19:11:48, 1.42s/it] 22%|██▏ | 13407/61904 [6:45:31<18:40:30, 1.39s/it] 22%|██▏ | 13408/61904 [6:45:32<17:59:21, 1.34s/it] 22%|██▏ | 
13409/61904 [6:45:34<18:21:41, 1.36s/it] 22%|██▏ | 13410/61904 [6:45:35<18:15:59, 1.36s/it] 22%|██▏ | 13411/61904 [6:45:37<18:26:53, 1.37s/it] 22%|██▏ | 13412/61904 [6:45:38<17:57:19, 1.33s/it] 22%|██▏ | 13413/61904 [6:45:39<18:00:23, 1.34s/it] 22%|██▏ | 13414/61904 [6:45:41<18:01:50, 1.34s/it] 22%|██▏ | 13415/61904 [6:45:42<18:07:04, 1.35s/it] 22%|██▏ | 13416/61904 [6:45:43<18:16:44, 1.36s/it] 22%|██▏ | 13417/61904 [6:45:45<19:27:38, 1.44s/it] 22%|██▏ | 13418/61904 [6:45:46<19:14:53, 1.43s/it] 22%|██▏ | 13419/61904 [6:45:48<18:57:09, 1.41s/it] 22%|██▏ | 13420/61904 [6:45:49<18:16:20, 1.36s/it] {'loss': 2.7982, 'learning_rate': 1.7857513289251912e-07, 'epoch': 3.47} 22%|██▏ | 13420/61904 [6:45:49<18:16:20, 1.36s/it] 22%|██▏ | 13421/61904 [6:45:50<18:34:40, 1.38s/it] 22%|██▏ | 13422/61904 [6:45:52<18:44:06, 1.39s/it] 22%|██▏ | 13423/61904 [6:45:53<17:59:47, 1.34s/it] 22%|██▏ | 13424/61904 [6:45:54<17:35:51, 1.31s/it] 22%|██▏ | 13425/61904 [6:45:56<17:58:53, 1.34s/it] 22%|██▏ | 13426/61904 [6:45:57<18:38:30, 1.38s/it] 22%|██▏ | 13427/61904 [6:45:58<18:10:41, 1.35s/it] 22%|██▏ | 13428/61904 [6:46:00<18:45:17, 1.39s/it] 22%|██▏ | 13429/61904 [6:46:01<18:46:43, 1.39s/it] 22%|██▏ | 13430/61904 [6:46:03<18:35:43, 1.38s/it] 22%|██▏ | 13431/61904 [6:46:04<18:31:03, 1.38s/it] 22%|██▏ | 13432/61904 [6:46:05<18:17:14, 1.36s/it] 22%|██▏ | 13433/61904 [6:46:07<18:26:31, 1.37s/it] 22%|██▏ | 13434/61904 [6:46:08<18:52:08, 1.40s/it] 22%|██▏ | 13435/61904 [6:46:10<18:24:10, 1.37s/it] 22%|██▏ | 13436/61904 [6:46:11<18:17:27, 1.36s/it] 22%|██▏ | 13437/61904 [6:46:12<18:17:16, 1.36s/it] 22%|██▏ | 13438/61904 [6:46:14<18:19:45, 1.36s/it] 22%|██▏ | 13439/61904 [6:46:15<18:02:04, 1.34s/it] 22%|██▏ | 13440/61904 [6:46:16<18:09:38, 1.35s/it] {'loss': 2.7446, 'learning_rate': 1.7854272008297678e-07, 'epoch': 3.47} 22%|██▏ | 13440/61904 [6:46:16<18:09:38, 1.35s/it] 22%|██▏ | 13441/61904 [6:46:18<18:37:39, 1.38s/it] 22%|██▏ | 13442/61904 [6:46:19<19:08:38, 1.42s/it] 22%|██▏ | 13443/61904 
[6:46:21<19:18:01, 1.43s/it] 22%|██▏ | 13444/61904 [6:46:22<19:14:15, 1.43s/it] 22%|██▏ | 13445/61904 [6:46:23<18:54:15, 1.40s/it] 22%|██▏ | 13446/61904 [6:46:25<18:48:52, 1.40s/it] 22%|██▏ | 13447/61904 [6:46:26<18:49:39, 1.40s/it] 22%|██▏ | 13448/61904 [6:46:28<18:50:02, 1.40s/it] 22%|██▏ | 13449/61904 [6:46:29<18:25:33, 1.37s/it] 22%|██▏ | 13450/61904 [6:46:30<18:16:25, 1.36s/it] 22%|██▏ | 13451/61904 [6:46:32<17:45:41, 1.32s/it] 22%|██▏ | 13452/61904 [6:46:33<18:36:20, 1.38s/it] 22%|██▏ | 13453/61904 [6:46:34<18:34:27, 1.38s/it] 22%|██▏ | 13454/61904 [6:46:36<18:08:29, 1.35s/it] 22%|██▏ | 13455/61904 [6:46:37<19:06:34, 1.42s/it] 22%|██▏ | 13456/61904 [6:46:39<19:06:58, 1.42s/it] 22%|██▏ | 13457/61904 [6:46:40<19:04:34, 1.42s/it] 22%|██▏ | 13458/61904 [6:46:42<19:04:31, 1.42s/it] 22%|██▏ | 13459/61904 [6:46:43<18:54:53, 1.41s/it] 22%|██▏ | 13460/61904 [6:46:44<18:19:20, 1.36s/it] {'loss': 2.749, 'learning_rate': 1.7851030727343444e-07, 'epoch': 3.48} 22%|██▏ | 13460/61904 [6:46:44<18:19:20, 1.36s/it] 22%|██▏ | 13461/61904 [6:46:46<18:33:52, 1.38s/it] 22%|██▏ | 13462/61904 [6:46:47<18:42:05, 1.39s/it] 22%|██▏ | 13463/61904 [6:46:48<19:04:53, 1.42s/it] 22%|██▏ | 13464/61904 [6:46:50<18:47:23, 1.40s/it] 22%|██▏ | 13465/61904 [6:46:51<18:34:49, 1.38s/it] 22%|██▏ | 13466/61904 [6:46:53<18:31:48, 1.38s/it] 22%|██▏ | 13467/61904 [6:46:54<18:39:40, 1.39s/it] 22%|██▏ | 13468/61904 [6:46:55<18:36:35, 1.38s/it] 22%|██▏ | 13469/61904 [6:46:57<18:25:55, 1.37s/it] 22%|██▏ | 13470/61904 [6:46:58<18:48:46, 1.40s/it] 22%|██▏ | 13471/61904 [6:46:59<18:30:07, 1.38s/it] 22%|██▏ | 13472/61904 [6:47:01<18:33:51, 1.38s/it] 22%|██▏ | 13473/61904 [6:47:02<18:18:42, 1.36s/it] 22%|██▏ | 13474/61904 [6:47:04<18:25:22, 1.37s/it] 22%|██▏ | 13475/61904 [6:47:05<18:36:54, 1.38s/it] 22%|██▏ | 13476/61904 [6:47:06<19:11:42, 1.43s/it] 22%|██▏ | 13477/61904 [6:47:08<18:46:24, 1.40s/it] 22%|██▏ | 13478/61904 [6:47:09<18:27:28, 1.37s/it] 22%|██▏ | 13479/61904 [6:47:10<18:23:07, 1.37s/it] 22%|██▏ | 
13480/61904 [6:47:12<18:20:00, 1.36s/it] {'loss': 2.7231, 'learning_rate': 1.7847789446389213e-07, 'epoch': 3.48} 22%|██▏ | 13480/61904 [6:47:12<18:20:00, 1.36s/it] 22%|██▏ | 13481/61904 [6:47:13<18:58:53, 1.41s/it] 22%|██▏ | 13482/61904 [6:47:15<18:31:41, 1.38s/it] 22%|██▏ | 13483/61904 [6:47:16<19:11:34, 1.43s/it] 22%|██▏ | 13484/61904 [6:47:18<19:17:56, 1.43s/it] 22%|██▏ | 13485/61904 [6:47:19<20:24:00, 1.52s/it] 22%|██▏ | 13486/61904 [6:47:21<19:40:07, 1.46s/it] 22%|██▏ | 13487/61904 [6:47:22<19:17:17, 1.43s/it] 22%|██▏ | 13488/61904 [6:47:23<19:04:32, 1.42s/it] 22%|██▏ | 13489/61904 [6:47:25<19:01:27, 1.41s/it] 22%|██▏ | 13490/61904 [6:47:26<18:14:39, 1.36s/it] 22%|██▏ | 13491/61904 [6:47:27<17:54:34, 1.33s/it] 22%|██▏ | 13492/61904 [6:47:29<17:46:59, 1.32s/it] 22%|██▏ | 13493/61904 [6:47:30<18:21:59, 1.37s/it] 22%|██▏ | 13494/61904 [6:47:31<17:50:06, 1.33s/it] 22%|██▏ | 13495/61904 [6:47:33<17:37:30, 1.31s/it] 22%|██▏ | 13496/61904 [6:47:34<17:53:35, 1.33s/it] 22%|██▏ | 13497/61904 [6:47:35<18:01:00, 1.34s/it] 22%|██▏ | 13498/61904 [6:47:37<17:42:46, 1.32s/it] 22%|██▏ | 13499/61904 [6:47:38<18:01:46, 1.34s/it] 22%|██▏ | 13500/61904 [6:47:39<18:14:40, 1.36s/it] {'loss': 2.7026, 'learning_rate': 1.784454816543498e-07, 'epoch': 3.49} 22%|██▏ | 13500/61904 [6:47:39<18:14:40, 1.36s/it] 22%|██▏ | 13501/61904 [6:47:41<18:23:49, 1.37s/it] 22%|██▏ | 13502/61904 [6:47:42<18:01:30, 1.34s/it] 22%|██▏ | 13503/61904 [6:47:43<17:45:27, 1.32s/it] 22%|██▏ | 13504/61904 [6:47:45<18:23:35, 1.37s/it] 22%|██▏ | 13505/61904 [6:47:46<18:46:02, 1.40s/it] 22%|██▏ | 13506/61904 [6:47:48<18:34:02, 1.38s/it] 22%|██▏ | 13507/61904 [6:47:49<18:03:35, 1.34s/it] 22%|██▏ | 13508/61904 [6:47:50<18:12:19, 1.35s/it] 22%|██▏ | 13509/61904 [6:47:52<18:11:12, 1.35s/it] 22%|██▏ | 13510/61904 [6:47:53<18:06:12, 1.35s/it] 22%|██▏ | 13511/61904 [6:47:54<18:14:11, 1.36s/it] 22%|██▏ | 13512/61904 [6:47:56<18:36:08, 1.38s/it] 22%|██▏ | 13513/61904 [6:47:57<18:22:08, 1.37s/it] 22%|██▏ | 13514/61904 
[6:47:58<18:24:31, 1.37s/it] 22%|██▏ | 13515/61904 [6:48:00<18:35:18, 1.38s/it] 22%|██▏ | 13516/61904 [6:48:02<19:40:36, 1.46s/it] 22%|██▏ | 13517/61904 [6:48:03<19:04:02, 1.42s/it] 22%|██▏ | 13518/61904 [6:48:04<19:48:05, 1.47s/it] 22%|██▏ | 13519/61904 [6:48:06<19:17:23, 1.44s/it] 22%|██▏ | 13520/61904 [6:48:07<19:00:57, 1.41s/it] {'loss': 2.7534, 'learning_rate': 1.7841306884480745e-07, 'epoch': 3.49} 22%|██▏ | 13520/61904 [6:48:07<19:00:57, 1.41s/it] 22%|██▏ | 13521/61904 [6:48:09<18:39:00, 1.39s/it] 22%|██▏ | 13522/61904 [6:48:10<18:20:25, 1.36s/it] 22%|██▏ | 13523/61904 [6:48:11<18:13:24, 1.36s/it] 22%|██▏ | 13524/61904 [6:48:12<17:51:56, 1.33s/it] 22%|██▏ | 13525/61904 [6:48:14<17:40:11, 1.31s/it] 22%|██▏ | 13526/61904 [6:48:15<17:46:02, 1.32s/it] 22%|██▏ | 13527/61904 [6:48:16<17:36:32, 1.31s/it] 22%|██▏ | 13528/61904 [6:48:18<18:01:11, 1.34s/it] 22%|██▏ | 13529/61904 [6:48:19<18:00:21, 1.34s/it] 22%|██▏ | 13530/61904 [6:48:20<17:43:30, 1.32s/it] 22%|██▏ | 13531/61904 [6:48:22<17:46:40, 1.32s/it] 22%|██▏ | 13532/61904 [6:48:23<17:43:14, 1.32s/it] 22%|██▏ | 13533/61904 [6:48:24<17:36:56, 1.31s/it] 22%|██▏ | 13534/61904 [6:48:26<17:26:28, 1.30s/it] 22%|██▏ | 13535/61904 [6:48:27<17:10:30, 1.28s/it] 22%|██▏ | 13536/61904 [6:48:28<17:37:39, 1.31s/it] 22%|██▏ | 13537/61904 [6:48:29<17:05:22, 1.27s/it] 22%|██▏ | 13538/61904 [6:48:31<16:59:27, 1.26s/it] 22%|██▏ | 13539/61904 [6:48:32<17:18:33, 1.29s/it] 22%|██▏ | 13540/61904 [6:48:33<17:29:19, 1.30s/it] {'loss': 2.8366, 'learning_rate': 1.7838065603526514e-07, 'epoch': 3.5} 22%|██▏ | 13540/61904 [6:48:33<17:29:19, 1.30s/it] 22%|██▏ | 13541/61904 [6:48:35<17:50:41, 1.33s/it] 22%|██▏ | 13542/61904 [6:48:36<18:02:43, 1.34s/it] 22%|██▏ | 13543/61904 [6:48:37<18:19:55, 1.36s/it] 22%|██▏ | 13544/61904 [6:48:39<18:09:25, 1.35s/it] 22%|██▏ | 13545/61904 [6:48:40<18:13:27, 1.36s/it] 22%|██▏ | 13546/61904 [6:48:42<18:43:08, 1.39s/it] 22%|██▏ | 13547/61904 [6:48:43<18:57:28, 1.41s/it] 22%|██▏ | 13548/61904 [6:48:44<18:49:05, 
1.40s/it]
22%|██▏ | 13560/61904 [6:49:01<18:02:36, 1.34s/it] {'loss': 2.7885, 'learning_rate': 1.7834824322572278e-07, 'epoch': 3.5}
22%|██▏ | 13580/61904 [6:49:28<19:20:53, 1.44s/it] {'loss': 2.7273, 'learning_rate': 1.7831583041618047e-07, 'epoch': 3.51}
22%|██▏ | 13600/61904 [6:49:56<19:00:12, 1.42s/it] {'loss': 2.7864, 'learning_rate': 1.7828341760663813e-07, 'epoch': 3.51}
22%|██▏ | 13620/61904 [6:50:23<18:44:09, 1.40s/it] {'loss': 2.7448, 'learning_rate': 1.782510047970958e-07, 'epoch': 3.52}
22%|██▏ | 13640/61904 [6:50:50<17:18:42, 1.29s/it] {'loss': 2.7384, 'learning_rate': 1.7821859198755348e-07, 'epoch': 3.53}
22%|██▏ | 13660/61904 [6:51:17<18:42:04, 1.40s/it] {'loss': 2.7694, 'learning_rate': 1.7818617917801114e-07, 'epoch': 3.53}
22%|██▏ | 13680/61904 [6:51:44<18:09:40, 1.36s/it] {'loss': 2.7501, 'learning_rate': 1.781537663684688e-07, 'epoch': 3.54}
22%|██▏ | 13700/61904 [6:52:11<18:01:31, 1.35s/it] {'loss': 2.8046, 'learning_rate': 1.781213535589265e-07, 'epoch': 3.54}
22%|██▏ | 13720/61904 [6:52:39<17:51:26, 1.33s/it] {'loss': 2.752, 'learning_rate': 1.7808894074938413e-07, 'epoch': 3.55}
22%|██▏ | 13740/61904 [6:53:06<17:46:52, 1.33s/it] {'loss': 2.771, 'learning_rate': 1.7805652793984181e-07, 'epoch': 3.55}
22%|██▏ | 13760/61904 [6:53:33<18:05:28, 1.35s/it] {'loss': 2.7431, 'learning_rate': 1.780241151302995e-07, 'epoch': 3.56}
22%|██▏ | 13780/61904 [6:54:01<18:20:26, 1.37s/it] {'loss': 2.7601, 'learning_rate': 1.7799170232075714e-07, 'epoch': 3.56}
22%|██▏ | 13800/61904 [6:54:28<18:07:20, 1.36s/it] {'loss': 2.8301, 'learning_rate': 1.7795928951121483e-07, 'epoch': 3.57}
22%|██▏ | 13820/61904 [6:54:55<17:50:34, 1.34s/it] {'loss': 2.7157, 'learning_rate': 1.779268767016725e-07, 'epoch': 3.57}
22%|██▏ | 13840/61904 [6:55:23<19:36:37, 1.47s/it] {'loss': 2.7602, 'learning_rate': 1.7789446389213015e-07, 'epoch': 3.58}
22%|██▏ | 13860/61904 [6:55:52<19:45:05, 1.48s/it] {'loss': 2.7926, 'learning_rate': 1.7786205108258784e-07, 'epoch': 3.58}
22%|██▏ | 13880/61904 [6:56:19<18:24:56, 1.38s/it] {'loss': 2.76, 'learning_rate': 1.778296382730455e-07, 'epoch': 3.59}
22%|██▏ | 13900/61904 [6:56:47<18:58:31, 1.42s/it] {'loss': 2.735, 'learning_rate': 1.7779722546350316e-07, 'epoch': 3.59}
22%|██▏ | 13920/61904 [6:57:15<17:20:58, 1.30s/it] {'loss': 2.7026, 'learning_rate': 1.7776481265396085e-07, 'epoch': 3.6}
23%|██▎ | 13940/61904 [6:57:42<18:27:51, 1.39s/it] {'loss': 2.7219, 'learning_rate': 1.7773239984441849e-07, 'epoch': 3.6}
23%|██▎ | 13960/61904 [6:58:09<17:56:06, 1.35s/it] {'loss': 2.821, 'learning_rate': 1.7769998703487617e-07, 'epoch': 3.61}
23%|██▎ | 13980/61904 [6:58:37<19:09:23, 1.44s/it] {'loss': 2.6779, 'learning_rate': 1.7766757422533386e-07, 'epoch': 3.61}
23%|██▎ | 14000/61904 [6:59:04<18:24:47, 1.38s/it] {'loss': 2.7756, 'learning_rate': 1.776351614157915e-07, 'epoch': 3.62}
23%|██▎ | 14020/61904 [6:59:32<18:50:35, 1.42s/it] {'loss': 2.7093, 'learning_rate': 1.7760274860624919e-07, 'epoch': 3.62}
23%|██▎ | 14040/61904 [6:59:59<17:34:56, 1.32s/it] {'loss': 2.8079, 'learning_rate': 1.7757033579670685e-07, 'epoch': 3.63}
23%|██▎ | 14060/61904 [7:00:27<19:28:17, 1.47s/it] {'loss': 2.7358, 'learning_rate': 1.775379229871645e-07, 'epoch': 3.63}
23%|██▎ | 14080/61904 [7:00:55<18:23:28, 1.38s/it] {'loss': 2.7401, 'learning_rate': 1.775055101776222e-07, 'epoch': 3.64}
23%|██▎ | 14100/61904 [7:01:23<18:32:45, 1.40s/it] {'loss': 2.7088, 'learning_rate': 1.7747309736807986e-07, 'epoch': 3.64}
23%|██▎ | 14120/61904 [7:01:50<18:01:44, 1.36s/it] {'loss': 2.6936, 'learning_rate': 1.7744068455853752e-07, 'epoch': 3.65}
23%|██▎ | 14140/61904 [7:02:18<18:59:58, 1.43s/it] {'loss': 2.7538, 'learning_rate': 1.774082717489952e-07, 'epoch': 3.65}
23%|██▎ | 14160/61904 [7:02:45<18:41:23, 1.41s/it] {'loss': 2.7446, 'learning_rate': 1.7737585893945285e-07, 'epoch': 3.66}
23%|██▎ | 14180/61904 [7:03:12<17:42:38, 1.34s/it] {'loss': 2.7566, 'learning_rate': 1.7734344612991053e-07, 'epoch': 3.66}
23%|██▎ | 14200/61904 [7:03:40<18:47:42, 1.42s/it] {'loss': 2.7146, 'learning_rate': 1.773110333203682e-07, 'epoch': 3.67}
23%|██▎ | 14212/61904 [7:03:56<18:22:13, 
1.39s/it] 23%|██▎ | 14213/61904 [7:03:58<18:20:50, 1.38s/it] 23%|██▎ | 14214/61904 [7:03:59<18:22:28, 1.39s/it] 23%|██▎ | 14215/61904 [7:04:00<17:55:55, 1.35s/it] 23%|██▎ | 14216/61904 [7:04:02<18:30:26, 1.40s/it] 23%|██▎ | 14217/61904 [7:04:03<18:47:05, 1.42s/it] 23%|██▎ | 14218/61904 [7:04:05<18:03:43, 1.36s/it] 23%|██▎ | 14219/61904 [7:04:06<17:57:12, 1.36s/it] 23%|██▎ | 14220/61904 [7:04:07<17:16:07, 1.30s/it] {'loss': 2.7953, 'learning_rate': 1.7727862051082586e-07, 'epoch': 3.67} 23%|██▎ | 14220/61904 [7:04:07<17:16:07, 1.30s/it] 23%|██▎ | 14221/61904 [7:04:08<16:55:40, 1.28s/it] 23%|██▎ | 14222/61904 [7:04:10<17:48:19, 1.34s/it] 23%|██▎ | 14223/61904 [7:04:11<18:05:13, 1.37s/it] 23%|██▎ | 14224/61904 [7:04:13<17:46:37, 1.34s/it] 23%|██▎ | 14225/61904 [7:04:14<18:09:36, 1.37s/it] 23%|██▎ | 14226/61904 [7:04:15<17:52:46, 1.35s/it] 23%|██▎ | 14227/61904 [7:04:17<18:00:32, 1.36s/it] 23%|██▎ | 14228/61904 [7:04:18<17:51:26, 1.35s/it] 23%|██▎ | 14229/61904 [7:04:19<17:13:29, 1.30s/it] 23%|██▎ | 14230/61904 [7:04:20<17:14:35, 1.30s/it] 23%|██▎ | 14231/61904 [7:04:22<17:14:16, 1.30s/it] 23%|██▎ | 14232/61904 [7:04:23<17:26:29, 1.32s/it] 23%|██▎ | 14233/61904 [7:04:25<17:58:19, 1.36s/it] 23%|██▎ | 14234/61904 [7:04:26<17:56:28, 1.35s/it] 23%|██▎ | 14235/61904 [7:04:27<18:16:09, 1.38s/it] 23%|██▎ | 14236/61904 [7:04:29<18:12:25, 1.38s/it] 23%|██▎ | 14237/61904 [7:04:30<18:46:00, 1.42s/it] 23%|██▎ | 14238/61904 [7:04:32<18:59:12, 1.43s/it] 23%|██▎ | 14239/61904 [7:04:33<18:42:22, 1.41s/it] 23%|██▎ | 14240/61904 [7:04:34<18:19:04, 1.38s/it] {'loss': 2.7185, 'learning_rate': 1.7724620770128355e-07, 'epoch': 3.68} 23%|██▎ | 14240/61904 [7:04:34<18:19:04, 1.38s/it] 23%|██▎ | 14241/61904 [7:04:36<18:05:32, 1.37s/it] 23%|██▎ | 14242/61904 [7:04:37<17:49:45, 1.35s/it] 23%|██▎ | 14243/61904 [7:04:38<17:41:49, 1.34s/it] 23%|██▎ | 14244/61904 [7:04:40<17:26:04, 1.32s/it] 23%|██▎ | 14245/61904 [7:04:41<17:40:29, 1.34s/it] 23%|██▎ | 14246/61904 [7:04:42<18:03:21, 1.36s/it] 23%|██▎ 
| 14247/61904 [7:04:44<17:48:37, 1.35s/it] 23%|██▎ | 14248/61904 [7:04:45<17:46:40, 1.34s/it] 23%|██▎ | 14249/61904 [7:04:47<18:18:48, 1.38s/it] 23%|██▎ | 14250/61904 [7:04:48<18:08:08, 1.37s/it] 23%|██▎ | 14251/61904 [7:04:49<17:46:01, 1.34s/it] 23%|██▎ | 14252/61904 [7:04:50<17:47:20, 1.34s/it] 23%|██▎ | 14253/61904 [7:04:52<18:14:42, 1.38s/it] 23%|██▎ | 14254/61904 [7:04:53<18:42:20, 1.41s/it] 23%|██▎ | 14255/61904 [7:04:55<18:24:20, 1.39s/it] 23%|██▎ | 14256/61904 [7:04:56<18:15:07, 1.38s/it] 23%|██▎ | 14257/61904 [7:04:57<18:00:26, 1.36s/it] 23%|██▎ | 14258/61904 [7:04:59<18:26:49, 1.39s/it] 23%|██▎ | 14259/61904 [7:05:00<18:16:19, 1.38s/it] 23%|██▎ | 14260/61904 [7:05:02<18:29:27, 1.40s/it] {'loss': 2.7957, 'learning_rate': 1.772137948917412e-07, 'epoch': 3.69} 23%|██▎ | 14260/61904 [7:05:02<18:29:27, 1.40s/it] 23%|██▎ | 14261/61904 [7:05:03<19:08:37, 1.45s/it] 23%|██▎ | 14262/61904 [7:05:05<19:24:01, 1.47s/it] 23%|██▎ | 14263/61904 [7:05:06<18:54:09, 1.43s/it] 23%|██▎ | 14264/61904 [7:05:08<18:58:45, 1.43s/it] 23%|██▎ | 14265/61904 [7:05:09<18:37:29, 1.41s/it] 23%|██▎ | 14266/61904 [7:05:10<18:14:28, 1.38s/it] 23%|██▎ | 14267/61904 [7:05:12<18:27:13, 1.39s/it] 23%|██▎ | 14268/61904 [7:05:13<18:25:33, 1.39s/it] 23%|██▎ | 14269/61904 [7:05:14<18:03:13, 1.36s/it] 23%|██▎ | 14270/61904 [7:05:16<17:35:49, 1.33s/it] 23%|██▎ | 14271/61904 [7:05:17<18:08:43, 1.37s/it] 23%|██▎ | 14272/61904 [7:05:18<18:09:45, 1.37s/it] 23%|██▎ | 14273/61904 [7:05:20<17:35:22, 1.33s/it] 23%|██▎ | 14274/61904 [7:05:21<18:17:25, 1.38s/it] 23%|██▎ | 14275/61904 [7:05:22<17:48:40, 1.35s/it] 23%|██▎ | 14276/61904 [7:05:24<17:24:05, 1.32s/it] 23%|██▎ | 14277/61904 [7:05:25<17:32:39, 1.33s/it] 23%|██▎ | 14278/61904 [7:05:26<17:51:50, 1.35s/it] 23%|██▎ | 14279/61904 [7:05:28<18:02:44, 1.36s/it] 23%|██▎ | 14280/61904 [7:05:29<17:54:55, 1.35s/it] {'loss': 2.7017, 'learning_rate': 1.7718138208219887e-07, 'epoch': 3.69} 23%|██▎ | 14280/61904 [7:05:29<17:54:55, 1.35s/it] 23%|██▎ | 14281/61904 
[7:05:30<17:41:29, 1.34s/it] 23%|██▎ | 14282/61904 [7:05:32<18:02:21, 1.36s/it] 23%|██▎ | 14283/61904 [7:05:33<17:55:37, 1.36s/it] 23%|██▎ | 14284/61904 [7:05:35<18:06:21, 1.37s/it] 23%|██▎ | 14285/61904 [7:05:36<18:21:16, 1.39s/it] 23%|██▎ | 14286/61904 [7:05:37<17:50:18, 1.35s/it] 23%|██▎ | 14287/61904 [7:05:39<17:41:43, 1.34s/it] 23%|██▎ | 14288/61904 [7:05:40<17:31:22, 1.32s/it] 23%|██▎ | 14289/61904 [7:05:41<17:36:33, 1.33s/it] 23%|██▎ | 14290/61904 [7:05:43<17:59:48, 1.36s/it] 23%|██▎ | 14291/61904 [7:05:44<18:26:41, 1.39s/it] 23%|██▎ | 14292/61904 [7:05:46<18:42:44, 1.41s/it] 23%|██▎ | 14293/61904 [7:05:47<18:31:45, 1.40s/it] 23%|██▎ | 14294/61904 [7:05:48<18:14:45, 1.38s/it] 23%|██▎ | 14295/61904 [7:05:50<18:25:37, 1.39s/it] 23%|██▎ | 14296/61904 [7:05:51<17:51:42, 1.35s/it] 23%|██▎ | 14297/61904 [7:05:52<17:36:29, 1.33s/it] 23%|██▎ | 14298/61904 [7:05:54<17:38:00, 1.33s/it] 23%|██▎ | 14299/61904 [7:05:55<19:18:58, 1.46s/it] 23%|██▎ | 14300/61904 [7:05:57<19:01:11, 1.44s/it] {'loss': 2.7374, 'learning_rate': 1.7714896927265656e-07, 'epoch': 3.7} 23%|██▎ | 14300/61904 [7:05:57<19:01:11, 1.44s/it] 23%|██▎ | 14301/61904 [7:05:58<18:12:22, 1.38s/it] 23%|██▎ | 14302/61904 [7:05:59<17:57:16, 1.36s/it] 23%|██▎ | 14303/61904 [7:06:01<18:02:44, 1.36s/it] 23%|██▎ | 14304/61904 [7:06:02<17:37:59, 1.33s/it] 23%|██▎ | 14305/61904 [7:06:03<17:24:51, 1.32s/it] 23%|██▎ | 14306/61904 [7:06:05<18:09:43, 1.37s/it] 23%|██▎ | 14307/61904 [7:06:06<18:29:11, 1.40s/it] 23%|██▎ | 14308/61904 [7:06:08<18:43:07, 1.42s/it] 23%|██▎ | 14309/61904 [7:06:09<18:44:31, 1.42s/it] 23%|██▎ | 14310/61904 [7:06:10<18:27:44, 1.40s/it] 23%|██▎ | 14311/61904 [7:06:12<18:20:30, 1.39s/it] 23%|██▎ | 14312/61904 [7:06:13<18:24:26, 1.39s/it] 23%|██▎ | 14313/61904 [7:06:15<18:34:58, 1.41s/it] 23%|██▎ | 14314/61904 [7:06:16<18:47:37, 1.42s/it] 23%|██▎ | 14315/61904 [7:06:17<18:29:32, 1.40s/it] 23%|██▎ | 14316/61904 [7:06:19<17:57:53, 1.36s/it] 23%|██▎ | 14317/61904 [7:06:20<17:29:50, 1.32s/it] 23%|██▎ | 
14318/61904 [7:06:21<17:09:06, 1.30s/it] 23%|██▎ | 14319/61904 [7:06:22<16:53:04, 1.28s/it] 23%|██▎ | 14320/61904 [7:06:24<17:01:51, 1.29s/it] {'loss': 2.7442, 'learning_rate': 1.771165564631142e-07, 'epoch': 3.7} 23%|██▎ | 14320/61904 [7:06:24<17:01:51, 1.29s/it] 23%|██▎ | 14321/61904 [7:06:25<17:29:59, 1.32s/it] 23%|██▎ | 14322/61904 [7:06:26<17:28:21, 1.32s/it] 23%|██▎ | 14323/61904 [7:06:28<17:13:22, 1.30s/it] 23%|██▎ | 14324/61904 [7:06:29<17:35:34, 1.33s/it] 23%|██▎ | 14325/61904 [7:06:31<17:56:31, 1.36s/it] 23%|██▎ | 14326/61904 [7:06:32<17:53:53, 1.35s/it] 23%|██▎ | 14327/61904 [7:06:33<17:41:45, 1.34s/it] 23%|██▎ | 14328/61904 [7:06:35<17:51:48, 1.35s/it] 23%|██▎ | 14329/61904 [7:06:36<17:26:55, 1.32s/it] 23%|██▎ | 14330/61904 [7:06:37<17:41:53, 1.34s/it] 23%|██▎ | 14331/61904 [7:06:38<17:37:05, 1.33s/it] 23%|██▎ | 14332/61904 [7:06:40<17:26:21, 1.32s/it] 23%|██▎ | 14333/61904 [7:06:41<17:16:05, 1.31s/it] 23%|██▎ | 14334/61904 [7:06:42<16:54:45, 1.28s/it] 23%|██▎ | 14335/61904 [7:06:44<17:44:47, 1.34s/it] 23%|██▎ | 14336/61904 [7:06:45<17:58:17, 1.36s/it] 23%|██▎ | 14337/61904 [7:06:47<17:57:40, 1.36s/it] 23%|██▎ | 14338/61904 [7:06:48<17:28:12, 1.32s/it] 23%|██▎ | 14339/61904 [7:06:49<18:02:59, 1.37s/it] 23%|██▎ | 14340/61904 [7:06:51<18:27:52, 1.40s/it] {'loss': 2.7792, 'learning_rate': 1.7708414365357188e-07, 'epoch': 3.71} 23%|██▎ | 14340/61904 [7:06:51<18:27:52, 1.40s/it] 23%|██▎ | 14341/61904 [7:06:52<18:30:32, 1.40s/it] 23%|██▎ | 14342/61904 [7:06:53<17:55:40, 1.36s/it] 23%|██▎ | 14343/61904 [7:06:55<18:21:54, 1.39s/it] 23%|██▎ | 14344/61904 [7:06:56<18:37:56, 1.41s/it] 23%|██▎ | 14345/61904 [7:06:58<19:02:41, 1.44s/it] 23%|██▎ | 14346/61904 [7:06:59<18:28:14, 1.40s/it] 23%|██▎ | 14347/61904 [7:07:00<18:12:31, 1.38s/it] 23%|██▎ | 14348/61904 [7:07:02<18:03:21, 1.37s/it] 23%|██▎ | 14349/61904 [7:07:03<17:49:05, 1.35s/it] 23%|██▎ | 14350/61904 [7:07:04<17:34:56, 1.33s/it] 23%|██▎ | 14351/61904 [7:07:06<17:21:09, 1.31s/it] 23%|██▎ | 14352/61904 
[7:07:07<18:23:45, 1.39s/it] 23%|██▎ | 14353/61904 [7:07:09<18:27:21, 1.40s/it] 23%|██▎ | 14354/61904 [7:07:10<17:55:13, 1.36s/it] 23%|██▎ | 14355/61904 [7:07:11<17:52:59, 1.35s/it] 23%|██▎ | 14356/61904 [7:07:13<18:08:14, 1.37s/it] 23%|██▎ | 14357/61904 [7:07:14<18:18:31, 1.39s/it] 23%|██▎ | 14358/61904 [7:07:15<18:20:39, 1.39s/it] 23%|██▎ | 14359/61904 [7:07:17<17:59:18, 1.36s/it] 23%|██▎ | 14360/61904 [7:07:18<18:02:47, 1.37s/it] {'loss': 2.7355, 'learning_rate': 1.7705173084402957e-07, 'epoch': 3.71} 23%|██▎ | 14360/61904 [7:07:18<18:02:47, 1.37s/it] 23%|██▎ | 14361/61904 [7:07:19<17:54:40, 1.36s/it] 23%|██▎ | 14362/61904 [7:07:21<17:14:27, 1.31s/it] 23%|██▎ | 14363/61904 [7:07:22<17:14:41, 1.31s/it] 23%|██▎ | 14364/61904 [7:07:23<18:00:11, 1.36s/it] 23%|██▎ | 14365/61904 [7:07:25<17:55:27, 1.36s/it] 23%|██▎ | 14366/61904 [7:07:26<18:00:15, 1.36s/it] 23%|██▎ | 14367/61904 [7:07:28<18:04:57, 1.37s/it] 23%|██▎ | 14368/61904 [7:07:29<18:29:51, 1.40s/it] 23%|██▎ | 14369/61904 [7:07:30<18:38:07, 1.41s/it] 23%|██▎ | 14370/61904 [7:07:32<18:32:41, 1.40s/it] 23%|██▎ | 14371/61904 [7:07:33<18:40:11, 1.41s/it] 23%|██▎ | 14372/61904 [7:07:35<18:56:27, 1.43s/it] 23%|██▎ | 14373/61904 [7:07:36<18:47:27, 1.42s/it] 23%|██▎ | 14374/61904 [7:07:38<18:21:53, 1.39s/it] 23%|██▎ | 14375/61904 [7:07:39<17:53:16, 1.35s/it] 23%|██▎ | 14376/61904 [7:07:40<17:45:30, 1.35s/it] 23%|██▎ | 14377/61904 [7:07:42<18:04:54, 1.37s/it] 23%|██▎ | 14378/61904 [7:07:43<18:02:48, 1.37s/it] 23%|██▎ | 14379/61904 [7:07:44<18:26:31, 1.40s/it] 23%|██▎ | 14380/61904 [7:07:46<18:02:29, 1.37s/it] {'loss': 2.7571, 'learning_rate': 1.770193180344872e-07, 'epoch': 3.72} 23%|██▎ | 14380/61904 [7:07:46<18:02:29, 1.37s/it] 23%|██▎ | 14381/61904 [7:07:47<18:49:28, 1.43s/it] 23%|██▎ | 14382/61904 [7:07:49<18:40:32, 1.41s/it] 23%|██▎ | 14383/61904 [7:07:50<18:42:56, 1.42s/it] 23%|██▎ | 14384/61904 [7:07:51<18:19:36, 1.39s/it] 23%|██▎ | 14385/61904 [7:07:53<18:57:42, 1.44s/it] 23%|██▎ | 14386/61904 [7:07:54<18:52:28, 
1.43s/it] 23%|██▎ | 14387/61904 [7:07:56<19:05:28, 1.45s/it] 23%|██▎ | 14388/61904 [7:07:57<18:24:47, 1.40s/it] 23%|██▎ | 14389/61904 [7:07:58<17:48:39, 1.35s/it] 23%|██▎ | 14390/61904 [7:08:00<18:33:25, 1.41s/it] 23%|██▎ | 14391/61904 [7:08:02<19:32:19, 1.48s/it] 23%|██▎ | 14392/61904 [7:08:03<19:16:16, 1.46s/it] 23%|██▎ | 14393/61904 [7:08:04<18:31:15, 1.40s/it] 23%|██▎ | 14394/61904 [7:08:06<19:47:44, 1.50s/it] 23%|██▎ | 14395/61904 [7:08:07<19:28:18, 1.48s/it] 23%|██▎ | 14396/61904 [7:08:09<18:44:49, 1.42s/it] 23%|██▎ | 14397/61904 [7:08:10<18:09:01, 1.38s/it] 23%|██▎ | 14398/61904 [7:08:11<18:17:52, 1.39s/it] 23%|██▎ | 14399/61904 [7:08:13<17:56:14, 1.36s/it] 23%|██▎ | 14400/61904 [7:08:14<18:04:42, 1.37s/it] {'loss': 2.6742, 'learning_rate': 1.769869052249449e-07, 'epoch': 3.72} 23%|██▎ | 14400/61904 [7:08:14<18:04:42, 1.37s/it] 23%|██▎ | 14401/61904 [7:08:15<17:57:29, 1.36s/it] 23%|██▎ | 14402/61904 [7:08:17<18:10:51, 1.38s/it] 23%|██▎ | 14403/61904 [7:08:18<18:10:48, 1.38s/it] 23%|██▎ | 14404/61904 [7:08:19<18:01:24, 1.37s/it] 23%|██▎ | 14405/61904 [7:08:21<17:17:29, 1.31s/it] 23%|██▎ | 14406/61904 [7:08:22<17:15:59, 1.31s/it] 23%|██▎ | 14407/61904 [7:08:23<17:59:53, 1.36s/it] 23%|██▎ | 14408/61904 [7:08:25<17:48:02, 1.35s/it] 23%|██▎ | 14409/61904 [7:08:26<18:31:41, 1.40s/it] 23%|██▎ | 14410/61904 [7:08:28<18:30:05, 1.40s/it] 23%|██▎ | 14411/61904 [7:08:29<18:35:03, 1.41s/it] 23%|██▎ | 14412/61904 [7:08:30<18:02:52, 1.37s/it] 23%|██▎ | 14413/61904 [7:08:32<17:46:49, 1.35s/it] 23%|██▎ | 14414/61904 [7:08:33<17:36:39, 1.34s/it] 23%|██▎ | 14415/61904 [7:08:34<17:04:27, 1.29s/it] 23%|██▎ | 14416/61904 [7:08:36<17:41:19, 1.34s/it] 23%|██▎ | 14417/61904 [7:08:37<17:54:19, 1.36s/it] 23%|██▎ | 14418/61904 [7:08:39<18:24:14, 1.40s/it] 23%|██▎ | 14419/61904 [7:08:40<18:25:31, 1.40s/it] 23%|██▎ | 14420/61904 [7:08:41<18:16:24, 1.39s/it] {'loss': 2.7332, 'learning_rate': 1.7695449241540256e-07, 'epoch': 3.73} 23%|██▎ | 14420/61904 [7:08:41<18:16:24, 1.39s/it] 23%|██▎ 
| 14421/61904 [7:08:43<18:01:22, 1.37s/it] 23%|██▎ | 14422/61904 [7:08:44<17:59:10, 1.36s/it] 23%|██▎ | 14423/61904 [7:08:45<18:05:50, 1.37s/it] 23%|██▎ | 14424/61904 [7:08:47<18:19:56, 1.39s/it] 23%|██▎ | 14425/61904 [7:08:48<18:50:48, 1.43s/it] 23%|██▎ | 14426/61904 [7:08:50<18:31:38, 1.40s/it] 23%|██▎ | 14427/61904 [7:08:51<18:24:49, 1.40s/it] 23%|██▎ | 14428/61904 [7:08:52<18:03:24, 1.37s/it] 23%|██▎ | 14429/61904 [7:08:54<17:48:53, 1.35s/it] 23%|██▎ | 14430/61904 [7:08:55<18:10:37, 1.38s/it] 23%|██▎ | 14431/61904 [7:08:56<17:50:46, 1.35s/it] 23%|██▎ | 14432/61904 [7:08:58<17:16:21, 1.31s/it] 23%|██▎ | 14433/61904 [7:08:59<17:19:10, 1.31s/it] 23%|██▎ | 14434/61904 [7:09:00<17:10:42, 1.30s/it] 23%|██▎ | 14435/61904 [7:09:01<16:50:06, 1.28s/it] 23%|██▎ | 14436/61904 [7:09:03<17:00:39, 1.29s/it] 23%|██▎ | 14437/61904 [7:09:04<17:23:12, 1.32s/it] 23%|██▎ | 14438/61904 [7:09:05<17:28:56, 1.33s/it] 23%|██▎ | 14439/61904 [7:09:07<17:57:47, 1.36s/it] 23%|██▎ | 14440/61904 [7:09:08<17:50:17, 1.35s/it] {'loss': 2.7142, 'learning_rate': 1.7692207960586022e-07, 'epoch': 3.73} 23%|██▎ | 14440/61904 [7:09:08<17:50:17, 1.35s/it] 23%|██▎ | 14441/61904 [7:09:10<18:14:34, 1.38s/it] 23%|██▎ | 14442/61904 [7:09:11<18:26:37, 1.40s/it] 23%|██▎ | 14443/61904 [7:09:12<18:18:19, 1.39s/it] 23%|██▎ | 14444/61904 [7:09:14<17:44:48, 1.35s/it] 23%|██▎ | 14445/61904 [7:09:15<17:24:48, 1.32s/it] 23%|██▎ | 14446/61904 [7:09:16<17:57:55, 1.36s/it] 23%|██▎ | 14447/61904 [7:09:18<18:00:59, 1.37s/it] 23%|██▎ | 14448/61904 [7:09:19<17:39:16, 1.34s/it] 23%|██▎ | 14449/61904 [7:09:21<18:19:24, 1.39s/it] 23%|██▎ | 14450/61904 [7:09:22<18:21:54, 1.39s/it] 23%|██▎ | 14451/61904 [7:09:24<19:03:33, 1.45s/it] 23%|██▎ | 14452/61904 [7:09:25<18:33:32, 1.41s/it] 23%|██▎ | 14453/61904 [7:09:26<18:06:40, 1.37s/it] 23%|██▎ | 14454/61904 [7:09:27<17:41:59, 1.34s/it] 23%|██▎ | 14455/61904 [7:09:29<17:41:40, 1.34s/it] 23%|██▎ | 14456/61904 [7:09:30<18:08:16, 1.38s/it] 23%|██▎ | 14457/61904 [7:09:32<18:02:26, 
1.37s/it] 23%|██▎ | 14458/61904 [7:09:33<18:04:54, 1.37s/it] 23%|██▎ | 14459/61904 [7:09:34<17:58:33, 1.36s/it] 23%|██▎ | 14460/61904 [7:09:36<17:33:36, 1.33s/it] {'loss': 2.7682, 'learning_rate': 1.768896667963179e-07, 'epoch': 3.74} 23%|██▎ | 14460/61904 [7:09:36<17:33:36, 1.33s/it] 23%|██▎ | 14461/61904 [7:09:37<17:20:59, 1.32s/it] 23%|██▎ | 14462/61904 [7:09:38<17:21:54, 1.32s/it] 23%|██▎ | 14463/61904 [7:09:40<18:21:50, 1.39s/it] 23%|██▎ | 14464/61904 [7:09:41<18:10:45, 1.38s/it] 23%|██▎ | 14465/61904 [7:09:43<18:16:40, 1.39s/it] 23%|██▎ | 14466/61904 [7:09:44<18:00:00, 1.37s/it] 23%|██▎ | 14467/61904 [7:09:45<18:10:07, 1.38s/it] 23%|██▎ | 14468/61904 [7:09:46<17:37:09, 1.34s/it] 23%|██▎ | 14469/61904 [7:09:48<17:56:11, 1.36s/it] 23%|██▎ | 14470/61904 [7:09:49<18:26:45, 1.40s/it] 23%|██▎ | 14471/61904 [7:09:51<18:10:25, 1.38s/it] 23%|██▎ | 14472/61904 [7:09:52<18:09:38, 1.38s/it] 23%|██▎ | 14473/61904 [7:09:54<18:28:53, 1.40s/it] 23%|██▎ | 14474/61904 [7:09:55<18:22:48, 1.40s/it] 23%|██▎ | 14475/61904 [7:09:56<17:59:29, 1.37s/it] 23%|██▎ | 14476/61904 [7:09:58<18:18:48, 1.39s/it] 23%|██▎ | 14477/61904 [7:09:59<18:28:18, 1.40s/it] 23%|██▎ | 14478/61904 [7:10:01<19:57:22, 1.51s/it] 23%|██▎ | 14479/61904 [7:10:02<18:59:52, 1.44s/it] 23%|██▎ | 14480/61904 [7:10:03<18:19:07, 1.39s/it] {'loss': 2.8006, 'learning_rate': 1.7685725398677557e-07, 'epoch': 3.74} 23%|██▎ | 14480/61904 [7:10:03<18:19:07, 1.39s/it] 23%|██▎ | 14481/61904 [7:10:05<18:22:24, 1.39s/it] 23%|██▎ | 14482/61904 [7:10:06<18:51:14, 1.43s/it] 23%|██▎ | 14483/61904 [7:10:08<19:05:20, 1.45s/it] 23%|██▎ | 14484/61904 [7:10:09<18:42:43, 1.42s/it] 23%|██▎ | 14485/61904 [7:10:11<18:29:54, 1.40s/it] 23%|██▎ | 14486/61904 [7:10:12<18:12:26, 1.38s/it] 23%|██▎ | 14487/61904 [7:10:13<18:07:37, 1.38s/it] 23%|██▎ | 14488/61904 [7:10:15<18:12:53, 1.38s/it] 23%|██▎ | 14489/61904 [7:10:16<17:54:31, 1.36s/it] 23%|██▎ | 14490/61904 [7:10:17<17:47:51, 1.35s/it] 23%|██▎ | 14491/61904 [7:10:19<18:18:27, 1.39s/it] 23%|██▎ 
| 14492/61904 [7:10:20<17:55:13, 1.36s/it] 23%|██▎ | 14493/61904 [7:10:21<17:53:58, 1.36s/it] 23%|██▎ | 14494/61904 [7:10:23<18:06:14, 1.37s/it] 23%|██▎ | 14495/61904 [7:10:24<18:18:08, 1.39s/it] 23%|██▎ | 14496/61904 [7:10:26<19:01:21, 1.44s/it] 23%|██▎ | 14497/61904 [7:10:27<18:28:27, 1.40s/it] 23%|██▎ | 14498/61904 [7:10:29<18:18:59, 1.39s/it] 23%|██▎ | 14499/61904 [7:10:30<18:01:36, 1.37s/it] 23%|██▎ | 14500/61904 [7:10:31<18:37:02, 1.41s/it] {'loss': 2.7649, 'learning_rate': 1.7682484117723323e-07, 'epoch': 3.75} 23%|██▎ | 14500/61904 [7:10:31<18:37:02, 1.41s/it] 23%|██▎ | 14501/61904 [7:10:33<18:16:38, 1.39s/it] 23%|██▎ | 14502/61904 [7:10:34<18:17:10, 1.39s/it] 23%|██▎ | 14503/61904 [7:10:36<18:31:30, 1.41s/it] 23%|██▎ | 14504/61904 [7:10:37<18:42:53, 1.42s/it] 23%|██▎ | 14505/61904 [7:10:38<18:10:36, 1.38s/it] 23%|██▎ | 14506/61904 [7:10:40<18:10:41, 1.38s/it] 23%|██▎ | 14507/61904 [7:10:41<19:02:12, 1.45s/it] 23%|██▎ | 14508/61904 [7:10:43<19:31:49, 1.48s/it] 23%|██▎ | 14509/61904 [7:10:44<18:44:08, 1.42s/it] 23%|██▎ | 14510/61904 [7:10:46<18:57:26, 1.44s/it] 23%|██▎ | 14511/61904 [7:10:47<18:25:53, 1.40s/it] 23%|██▎ | 14512/61904 [7:10:48<18:24:31, 1.40s/it] 23%|██▎ | 14513/61904 [7:10:50<18:37:45, 1.42s/it] 23%|██▎ | 14514/61904 [7:10:51<18:34:53, 1.41s/it] 23%|██▎ | 14515/61904 [7:10:52<18:13:24, 1.38s/it] 23%|██▎ | 14516/61904 [7:10:54<17:56:30, 1.36s/it] 23%|██▎ | 14517/61904 [7:10:55<18:09:06, 1.38s/it] 23%|██▎ | 14518/61904 [7:10:57<17:59:57, 1.37s/it] 23%|██▎ | 14519/61904 [7:10:58<18:31:10, 1.41s/it] 23%|██▎ | 14520/61904 [7:10:59<18:11:48, 1.38s/it] {'loss': 2.7049, 'learning_rate': 1.7679242836769092e-07, 'epoch': 3.75} 23%|██▎ | 14520/61904 [7:10:59<18:11:48, 1.38s/it] 23%|██▎ | 14521/61904 [7:11:01<17:40:21, 1.34s/it] 23%|██▎ | 14522/61904 [7:11:02<17:52:51, 1.36s/it] 23%|██▎ | 14523/61904 [7:11:03<17:29:40, 1.33s/it] 23%|██▎ | 14524/61904 [7:11:05<18:35:10, 1.41s/it] 23%|██▎ | 14525/61904 [7:11:06<18:38:56, 1.42s/it] 23%|██▎ | 14526/61904 
[7:11:08<18:35:12, 1.41s/it] 23%|██▎ | 14527/61904 [7:11:09<18:00:59, 1.37s/it] 23%|██▎ | 14528/61904 [7:11:10<17:52:05, 1.36s/it] 23%|██▎ | 14529/61904 [7:11:12<18:01:55, 1.37s/it] 23%|██▎ | 14530/61904 [7:11:13<17:33:39, 1.33s/it] 23%|██▎ | 14531/61904 [7:11:14<17:52:35, 1.36s/it] 23%|██▎ | 14532/61904 [7:11:16<18:35:50, 1.41s/it] 23%|██▎ | 14533/61904 [7:11:17<18:33:51, 1.41s/it] 23%|██▎ | 14534/61904 [7:11:19<17:54:55, 1.36s/it] 23%|██▎ | 14535/61904 [7:11:20<17:52:41, 1.36s/it] 23%|██▎ | 14536/61904 [7:11:21<18:42:56, 1.42s/it] 23%|██▎ | 14537/61904 [7:11:23<18:20:17, 1.39s/it] 23%|██▎ | 14538/61904 [7:11:24<17:43:58, 1.35s/it] 23%|██▎ | 14539/61904 [7:11:25<18:05:26, 1.37s/it] 23%|██▎ | 14540/61904 [7:11:27<17:50:32, 1.36s/it] {'loss': 2.7923, 'learning_rate': 1.7676001555814855e-07, 'epoch': 3.76} 23%|██▎ | 14540/61904 [7:11:27<17:50:32, 1.36s/it] 23%|██▎ | 14541/61904 [7:11:28<18:03:19, 1.37s/it] 23%|██▎ | 14542/61904 [7:11:30<18:24:10, 1.40s/it] 23%|██▎ | 14543/61904 [7:11:31<17:44:01, 1.35s/it] 23%|██▎ | 14544/61904 [7:11:32<18:09:59, 1.38s/it] 23%|██▎ | 14545/61904 [7:11:34<18:13:51, 1.39s/it] 23%|██▎ | 14546/61904 [7:11:35<19:22:02, 1.47s/it] 23%|██▎ | 14547/61904 [7:11:37<19:31:36, 1.48s/it] 24%|██▎ | 14548/61904 [7:11:38<19:05:34, 1.45s/it] 24%|██▎ | 14549/61904 [7:11:40<18:52:27, 1.43s/it] 24%|██▎ | 14550/61904 [7:11:41<18:25:44, 1.40s/it] 24%|██▎ | 14551/61904 [7:11:42<18:22:37, 1.40s/it] 24%|██▎ | 14552/61904 [7:11:44<18:26:28, 1.40s/it] 24%|██▎ | 14553/61904 [7:11:45<18:10:46, 1.38s/it] 24%|██▎ | 14554/61904 [7:11:47<18:32:50, 1.41s/it] 24%|██▎ | 14555/61904 [7:11:48<17:44:21, 1.35s/it] 24%|██▎ | 14556/61904 [7:11:49<17:44:32, 1.35s/it] 24%|██▎ | 14557/61904 [7:11:51<17:52:43, 1.36s/it] 24%|██▎ | 14558/61904 [7:11:52<17:17:33, 1.31s/it] 24%|██▎ | 14559/61904 [7:11:53<16:59:28, 1.29s/it] 24%|██▎ | 14560/61904 [7:11:54<17:05:28, 1.30s/it] {'loss': 2.6909, 'learning_rate': 1.7672760274860624e-07, 'epoch': 3.76} 24%|██▎ | 14560/61904 
[7:11:54<17:05:28, 1.30s/it] 24%|██▎ | 14561/61904 [7:11:56<17:46:16, 1.35s/it] 24%|██▎ | 14562/61904 [7:11:57<17:26:46, 1.33s/it] 24%|██▎ | 14563/61904 [7:11:58<17:45:45, 1.35s/it] 24%|██▎ | 14564/61904 [7:12:00<17:38:11, 1.34s/it] 24%|██▎ | 14565/61904 [7:12:01<17:43:13, 1.35s/it] 24%|██▎ | 14566/61904 [7:12:03<17:47:07, 1.35s/it] 24%|██▎ | 14567/61904 [7:12:04<17:48:33, 1.35s/it] 24%|██▎ | 14568/61904 [7:12:05<18:18:47, 1.39s/it] 24%|██▎ | 14569/61904 [7:12:07<19:06:54, 1.45s/it] 24%|██▎ | 14570/61904 [7:12:08<19:00:04, 1.45s/it] 24%|██▎ | 14571/61904 [7:12:10<18:12:24, 1.38s/it] 24%|██▎ | 14572/61904 [7:12:11<18:20:01, 1.39s/it] 24%|██▎ | 14573/61904 [7:12:12<17:59:30, 1.37s/it] 24%|██▎ | 14574/61904 [7:12:14<17:45:04, 1.35s/it] 24%|██▎ | 14575/61904 [7:12:15<18:12:28, 1.38s/it] 24%|██▎ | 14576/61904 [7:12:17<18:27:30, 1.40s/it] 24%|██▎ | 14577/61904 [7:12:18<18:07:59, 1.38s/it] 24%|██▎ | 14578/61904 [7:12:19<17:28:40, 1.33s/it] 24%|██▎ | 14579/61904 [7:12:20<17:23:34, 1.32s/it] 24%|██▎ | 14580/61904 [7:12:22<18:27:08, 1.40s/it] {'loss': 2.7651, 'learning_rate': 1.7669518993906393e-07, 'epoch': 3.77} 24%|██▎ | 14580/61904 [7:12:22<18:27:08, 1.40s/it] 24%|██▎ | 14581/61904 [7:12:23<17:59:28, 1.37s/it] 24%|██▎ | 14582/61904 [7:12:25<17:44:41, 1.35s/it] 24%|██▎ | 14583/61904 [7:12:26<17:57:29, 1.37s/it] 24%|██▎ | 14584/61904 [7:12:27<18:14:12, 1.39s/it] 24%|██▎ | 14585/61904 [7:12:29<18:23:06, 1.40s/it] 24%|██▎ | 14586/61904 [7:12:30<18:17:05, 1.39s/it] 24%|██▎ | 14587/61904 [7:12:32<18:02:17, 1.37s/it] 24%|██▎ | 14588/61904 [7:12:33<17:51:05, 1.36s/it] 24%|██▎ | 14589/61904 [7:12:34<17:03:44, 1.30s/it] 24%|██▎ | 14590/61904 [7:12:36<17:44:09, 1.35s/it] 24%|██▎ | 14591/61904 [7:12:37<17:31:41, 1.33s/it] 24%|██▎ | 14592/61904 [7:12:38<17:33:51, 1.34s/it] 24%|██▎ | 14593/61904 [7:12:39<17:30:58, 1.33s/it] 24%|██▎ | 14594/61904 [7:12:41<17:40:33, 1.35s/it] 24%|██▎ | 14595/61904 [7:12:42<17:54:19, 1.36s/it] 24%|██▎ | 14596/61904 [7:12:44<18:18:32, 1.39s/it] 24%|██▎ | 
14597/61904 [7:12:45<17:59:40, 1.37s/it] 24%|██▎ | 14598/61904 [7:12:46<18:02:02, 1.37s/it] 24%|██▎ | 14599/61904 [7:12:48<17:42:21, 1.35s/it] 24%|██▎ | 14600/61904 [7:12:49<18:05:34, 1.38s/it] {'loss': 2.6994, 'learning_rate': 1.7666277712952157e-07, 'epoch': 3.77} 24%|██▎ | 14600/61904 [7:12:49<18:05:34, 1.38s/it] 24%|██▎ | 14601/61904 [7:12:51<18:15:47, 1.39s/it] 24%|██▎ | 14602/61904 [7:12:52<18:02:07, 1.37s/it] 24%|██▎ | 14603/61904 [7:12:53<17:59:02, 1.37s/it] 24%|██▎ | 14604/61904 [7:12:55<17:42:10, 1.35s/it] 24%|██▎ | 14605/61904 [7:12:56<18:00:24, 1.37s/it] 24%|██▎ | 14606/61904 [7:12:57<18:17:00, 1.39s/it] 24%|██▎ | 14607/61904 [7:12:59<18:25:30, 1.40s/it] 24%|██▎ | 14608/61904 [7:13:00<18:44:59, 1.43s/it] 24%|██▎ | 14609/61904 [7:13:02<18:58:01, 1.44s/it] 24%|██▎ | 14610/61904 [7:13:03<18:34:00, 1.41s/it] 24%|██▎ | 14611/61904 [7:13:04<18:04:28, 1.38s/it] 24%|██▎ | 14612/61904 [7:13:06<18:19:48, 1.40s/it] 24%|██▎ | 14613/61904 [7:13:07<17:48:09, 1.36s/it] 24%|██▎ | 14614/61904 [7:13:09<17:55:19, 1.36s/it] 24%|██▎ | 14615/61904 [7:13:10<17:45:57, 1.35s/it] 24%|██▎ | 14616/61904 [7:13:11<17:47:20, 1.35s/it] 24%|██▎ | 14617/61904 [7:13:13<18:03:40, 1.38s/it] 24%|██▎ | 14618/61904 [7:13:14<18:03:36, 1.37s/it] 24%|██▎ | 14619/61904 [7:13:15<18:09:31, 1.38s/it] 24%|██▎ | 14620/61904 [7:13:17<17:39:24, 1.34s/it] {'loss': 2.7794, 'learning_rate': 1.7663036431997925e-07, 'epoch': 3.78} 24%|██▎ | 14620/61904 [7:13:17<17:39:24, 1.34s/it] 24%|██▎ | 14621/61904 [7:13:18<17:59:44, 1.37s/it] 24%|██▎ | 14622/61904 [7:13:20<18:47:48, 1.43s/it] 24%|██▎ | 14623/61904 [7:13:21<18:49:50, 1.43s/it] 24%|██▎ | 14624/61904 [7:13:22<18:23:35, 1.40s/it] 24%|██▎ | 14625/61904 [7:13:24<18:08:19, 1.38s/it] 24%|██▎ | 14626/61904 [7:13:25<18:01:29, 1.37s/it] 24%|██▎ | 14627/61904 [7:13:26<17:42:48, 1.35s/it] 24%|██▎ | 14628/61904 [7:13:28<17:13:55, 1.31s/it] 24%|██▎ | 14629/61904 [7:13:29<17:18:17, 1.32s/it] 24%|██▎ | 14630/61904 [7:13:30<17:34:12, 1.34s/it] 24%|██▎ | 14631/61904 
[7:13:32<17:17:56, 1.32s/it] 24%|██▎ | 14632/61904 [7:13:33<17:10:24, 1.31s/it] 24%|██▎ | 14633/61904 [7:13:34<18:08:43, 1.38s/it] 24%|██▎ | 14634/61904 [7:13:36<18:20:35, 1.40s/it] 24%|██▎ | 14635/61904 [7:13:37<18:42:29, 1.42s/it] 24%|██▎ | 14636/61904 [7:13:39<18:25:51, 1.40s/it] 24%|██▎ | 14637/61904 [7:13:40<18:29:17, 1.41s/it] 24%|██▎ | 14638/61904 [7:13:42<18:28:29, 1.41s/it] 24%|██▎ | 14639/61904 [7:13:43<18:46:02, 1.43s/it] 24%|██▎ | 14640/61904 [7:13:44<18:26:22, 1.40s/it] {'loss': 2.712, 'learning_rate': 1.7659795151043692e-07, 'epoch': 3.78} 24%|██▎ | 14640/61904 [7:13:44<18:26:22, 1.40s/it] 24%|██▎ | 14641/61904 [7:13:46<18:37:50, 1.42s/it] 24%|██▎ | 14642/61904 [7:13:47<18:32:50, 1.41s/it] 24%|██▎ | 14643/61904 [7:13:49<18:20:30, 1.40s/it] 24%|██▎ | 14644/61904 [7:13:50<18:07:42, 1.38s/it] 24%|██▎ | 14645/61904 [7:13:51<18:23:21, 1.40s/it] 24%|██▎ | 14646/61904 [7:13:53<18:31:54, 1.41s/it] 24%|██▎ | 14647/61904 [7:13:54<18:25:43, 1.40s/it] 24%|██▎ | 14648/61904 [7:13:56<18:36:10, 1.42s/it] 24%|██▎ | 14649/61904 [7:13:57<18:13:02, 1.39s/it] 24%|██▎ | 14650/61904 [7:13:58<18:16:32, 1.39s/it] 24%|██▎ | 14651/61904 [7:14:00<18:25:40, 1.40s/it] 24%|██▎ | 14652/61904 [7:14:01<19:13:18, 1.46s/it] 24%|██▎ | 14653/61904 [7:14:03<19:00:30, 1.45s/it] 24%|██▎ | 14654/61904 [7:14:04<18:23:32, 1.40s/it] 24%|██▎ | 14655/61904 [7:14:06<19:00:16, 1.45s/it] 24%|██▎ | 14656/61904 [7:14:07<18:08:07, 1.38s/it] 24%|██▎ | 14657/61904 [7:14:08<18:19:49, 1.40s/it] 24%|██▎ | 14658/61904 [7:14:10<18:26:17, 1.40s/it] 24%|██▎ | 14659/61904 [7:14:11<18:21:07, 1.40s/it] 24%|██▎ | 14660/61904 [7:14:13<18:18:30, 1.40s/it] {'loss': 2.698, 'learning_rate': 1.7656553870089458e-07, 'epoch': 3.79} 24%|██▎ | 14660/61904 [7:14:13<18:18:30, 1.40s/it] 24%|██▎ | 14661/61904 [7:14:14<18:01:34, 1.37s/it] 24%|██▎ | 14662/61904 [7:14:15<17:51:34, 1.36s/it] 24%|██▎ | 14663/61904 [7:14:17<18:35:51, 1.42s/it] 24%|██▎ | 14664/61904 [7:14:18<18:34:04, 1.42s/it] 24%|██▎ | 14665/61904 [7:14:19<18:02:52, 
1.38s/it] 24%|██▎ | 14666/61904 [7:14:21<17:42:54, 1.35s/it] 24%|██▎ | 14667/61904 [7:14:22<18:20:02, 1.40s/it] 24%|██▎ | 14668/61904 [7:14:24<18:32:15, 1.41s/it] 24%|██▎ | 14669/61904 [7:14:25<18:34:48, 1.42s/it] 24%|██▎ | 14670/61904 [7:14:27<18:37:43, 1.42s/it] 24%|██▎ | 14671/61904 [7:14:28<18:47:12, 1.43s/it] 24%|██▎ | 14672/61904 [7:14:29<18:27:20, 1.41s/it] 24%|██▎ | 14673/61904 [7:14:31<18:19:37, 1.40s/it] 24%|██▎ | 14674/61904 [7:14:32<18:31:28, 1.41s/it] 24%|██▎ | 14675/61904 [7:14:34<18:40:26, 1.42s/it] 24%|██▎ | 14676/61904 [7:14:35<18:09:29, 1.38s/it] 24%|██▎ | 14677/61904 [7:14:36<17:44:54, 1.35s/it] 24%|██▎ | 14678/61904 [7:14:38<18:11:57, 1.39s/it] 24%|██▎ | 14679/61904 [7:14:39<17:57:57, 1.37s/it] 24%|██▎ | 14680/61904 [7:14:41<18:29:06, 1.41s/it] {'loss': 2.6473, 'learning_rate': 1.7653312589135227e-07, 'epoch': 3.79} 24%|██▎ | 14680/61904 [7:14:41<18:29:06, 1.41s/it] 24%|██▎ | 14681/61904 [7:14:42<17:58:33, 1.37s/it] 24%|██▎ | 14682/61904 [7:14:43<18:14:58, 1.39s/it] 24%|██▎ | 14683/61904 [7:14:45<18:13:35, 1.39s/it] 24%|██▎ | 14684/61904 [7:14:46<18:15:46, 1.39s/it] 24%|██▎ | 14685/61904 [7:14:47<17:59:33, 1.37s/it] 24%|██▎ | 14686/61904 [7:14:49<18:22:12, 1.40s/it] 24%|██▎ | 14687/61904 [7:14:50<18:11:22, 1.39s/it] 24%|██▎ | 14688/61904 [7:14:52<18:27:53, 1.41s/it] 24%|██▎ | 14689/61904 [7:14:53<18:14:04, 1.39s/it] 24%|██▎ | 14690/61904 [7:14:54<18:27:33, 1.41s/it] 24%|██▎ | 14691/61904 [7:14:56<18:31:53, 1.41s/it] 24%|██▎ | 14692/61904 [7:14:57<18:48:49, 1.43s/it] 24%|██▎ | 14693/61904 [7:14:59<19:04:39, 1.45s/it] 24%|██▎ | 14694/61904 [7:15:00<18:22:14, 1.40s/it] 24%|██▎ | 14695/61904 [7:15:01<18:09:54, 1.39s/it] 24%|██▎ | 14696/61904 [7:15:03<17:58:48, 1.37s/it] 24%|██▎ | 14697/61904 [7:15:04<17:48:33, 1.36s/it] 24%|██▎ | 14698/61904 [7:15:05<17:55:36, 1.37s/it] 24%|██▎ | 14699/61904 [7:15:07<17:43:20, 1.35s/it] 24%|██▎ | 14700/61904 [7:15:08<17:32:25, 1.34s/it] {'loss': 2.7231, 'learning_rate': 1.7650071308180993e-07, 'epoch': 3.8} 24%|██▎ 
| 14700/61904 [7:15:08<17:32:25, 1.34s/it] 24%|██▎ | 14701/61904 [7:15:10<17:43:18, 1.35s/it] 24%|██▎ | 14702/61904 [7:15:11<18:02:10, 1.38s/it] 24%|██▍ | 14703/61904 [7:15:12<18:46:11, 1.43s/it] 24%|██▍ | 14704/61904 [7:15:14<18:44:36, 1.43s/it] 24%|██▍ | 14705/61904 [7:15:16<19:24:19, 1.48s/it] 24%|██▍ | 14706/61904 [7:15:17<20:27:40, 1.56s/it] 24%|██▍ | 14707/61904 [7:15:19<19:50:57, 1.51s/it] 24%|██▍ | 14708/61904 [7:15:20<18:58:45, 1.45s/it] 24%|██▍ | 14709/61904 [7:15:21<18:27:43, 1.41s/it] 24%|██▍ | 14710/61904 [7:15:23<18:22:19, 1.40s/it] 24%|██▍ | 14711/61904 [7:15:24<18:25:44, 1.41s/it] 24%|██▍ | 14712/61904 [7:15:25<18:01:18, 1.37s/it] 24%|██▍ | 14713/61904 [7:15:27<18:13:25, 1.39s/it] 24%|██▍ | 14714/61904 [7:15:28<18:05:08, 1.38s/it] 24%|██▍ | 14715/61904 [7:15:30<17:54:53, 1.37s/it] 24%|██▍ | 14716/61904 [7:15:31<17:43:07, 1.35s/it] 24%|██▍ | 14717/61904 [7:15:32<18:06:17, 1.38s/it] 24%|██▍ | 14718/61904 [7:15:34<18:05:40, 1.38s/it] 24%|██▍ | 14719/61904 [7:15:35<18:20:59, 1.40s/it] 24%|██▍ | 14720/61904 [7:15:36<17:57:47, 1.37s/it] {'loss': 2.7367, 'learning_rate': 1.764683002722676e-07, 'epoch': 3.8} 24%|██▍ | 14720/61904 [7:15:36<17:57:47, 1.37s/it] 24%|██▍ | 14721/61904 [7:15:38<17:57:08, 1.37s/it] 24%|██▍ | 14722/61904 [7:15:39<17:59:46, 1.37s/it] 24%|██▍ | 14723/61904 [7:15:40<17:40:28, 1.35s/it] 24%|██▍ | 14724/61904 [7:15:42<17:48:04, 1.36s/it] 24%|██▍ | 14725/61904 [7:15:43<18:00:33, 1.37s/it] 24%|██▍ | 14726/61904 [7:15:45<18:13:13, 1.39s/it] 24%|██▍ | 14727/61904 [7:15:46<18:11:47, 1.39s/it] 24%|██▍ | 14728/61904 [7:15:48<19:06:20, 1.46s/it] 24%|██▍ | 14729/61904 [7:15:49<18:36:10, 1.42s/it] 24%|██▍ | 14730/61904 [7:15:50<18:31:25, 1.41s/it] 24%|██▍ | 14731/61904 [7:15:52<18:24:28, 1.40s/it] 24%|██▍ | 14732/61904 [7:15:53<18:12:07, 1.39s/it] 24%|██▍ | 14733/61904 [7:15:55<19:01:39, 1.45s/it] 24%|██▍ | 14734/61904 [7:15:56<19:05:54, 1.46s/it] 24%|██▍ | 14735/61904 [7:15:58<19:28:20, 1.49s/it] 24%|██▍ | 14736/61904 [7:15:59<18:55:18, 
1.44s/it]
 24%|██▍ | 14740/61904 [7:16:05<17:50:13, 1.36s/it] {'loss': 2.7298, 'learning_rate': 1.7643588746272528e-07, 'epoch': 3.81}
 24%|██▍ | 14760/61904 [7:16:33<18:25:37, 1.41s/it] {'loss': 2.7584, 'learning_rate': 1.764034746531829e-07, 'epoch': 3.81}
 24%|██▍ | 14780/61904 [7:17:01<17:46:23, 1.36s/it] {'loss': 2.7566, 'learning_rate': 1.763710618436406e-07, 'epoch': 3.82}
 24%|██▍ | 14800/61904 [7:17:28<19:11:39, 1.47s/it] {'loss': 2.753, 'learning_rate': 1.7633864903409826e-07, 'epoch': 3.82}
 24%|██▍ | 14820/61904 [7:17:55<17:37:03, 1.35s/it] {'loss': 2.7702, 'learning_rate': 1.7630623622455592e-07, 'epoch': 3.83}
 24%|██▍ | 14840/61904 [7:18:24<18:27:18, 1.41s/it] {'loss': 2.7236, 'learning_rate': 1.7627382341501361e-07, 'epoch': 3.84}
 24%|██▍ | 14860/61904 [7:18:52<18:57:32, 1.45s/it] {'loss': 2.8067, 'learning_rate': 1.7624141060547128e-07, 'epoch': 3.84}
 24%|██▍ | 14880/61904 [7:19:20<17:40:48, 1.35s/it] {'loss': 2.723, 'learning_rate': 1.7620899779592894e-07, 'epoch': 3.85}
 24%|██▍ | 14900/61904 [7:19:48<19:52:38, 1.52s/it] {'loss': 2.7566, 'learning_rate': 1.7617658498638663e-07, 'epoch': 3.85}
 24%|██▍ | 14920/61904 [7:20:15<18:01:39, 1.38s/it] {'loss': 2.6923, 'learning_rate': 1.7614417217684426e-07, 'epoch': 3.86}
 24%|██▍ | 14940/61904 [7:20:43<17:59:05, 1.38s/it] {'loss': 2.7525, 'learning_rate': 1.7611175936730195e-07, 'epoch': 3.86}
 24%|██▍ | 14960/61904 [7:21:10<17:03:40, 1.31s/it] {'loss': 2.7261, 'learning_rate': 1.7607934655775964e-07, 'epoch': 3.87}
 24%|██▍ | 14980/61904 [7:21:38<18:07:57, 1.39s/it] {'loss': 2.7879, 'learning_rate': 1.7604693374821727e-07, 'epoch': 3.87}
 24%|██▍ | 15000/61904 [7:22:06<18:03:07, 1.39s/it] {'loss': 2.7061, 'learning_rate': 1.7601452093867496e-07, 'epoch': 3.88}
 24%|██▍ | 15020/61904 [7:22:34<18:22:47, 1.41s/it] {'loss': 2.6878, 'learning_rate': 1.7598210812913262e-07, 'epoch': 3.88}
 24%|██▍ | 15040/61904 [7:23:01<17:22:53, 1.34s/it] {'loss': 2.7831, 'learning_rate': 1.7594969531959028e-07, 'epoch': 3.89}
 24%|██▍ | 15060/61904 [7:23:28<17:31:17, 1.35s/it] {'loss': 2.751, 'learning_rate': 1.7591728251004797e-07, 'epoch': 3.89}
 24%|██▍ | 15080/61904 [7:23:57<18:36:28, 1.43s/it] {'loss': 2.7191, 'learning_rate': 1.7588486970050564e-07, 'epoch': 3.9}
 24%|██▍ | 15100/61904 [7:24:25<18:17:17, 1.41s/it] {'loss': 2.7581, 'learning_rate': 1.758524568909633e-07, 'epoch': 3.9}
 24%|██▍ | 15120/61904 [7:24:53<17:10:59, 1.32s/it] {'loss': 2.7016, 'learning_rate': 1.7582004408142099e-07, 'epoch': 3.91}
 24%|██▍ | 15140/61904 [7:25:21<17:51:53, 1.38s/it] {'loss': 2.7306, 'learning_rate': 1.7578763127187862e-07, 'epoch': 3.91}
 24%|██▍ | 15160/61904 [7:25:48<17:53:59, 1.38s/it] {'loss': 2.7835, 'learning_rate': 1.757552184623363e-07, 'epoch': 3.92}
 25%|██▍ | 15180/61904 [7:26:16<17:46:04, 1.37s/it] {'loss': 2.7854, 'learning_rate': 1.75722805652794e-07, 'epoch': 3.92}
 25%|██▍ | 15200/61904 [7:26:43<17:46:11, 1.37s/it] {'loss': 2.7345, 'learning_rate': 1.7569039284325163e-07, 'epoch': 3.93}
 25%|██▍ | 15220/61904 [7:27:11<18:19:58, 1.41s/it] {'loss': 2.758, 'learning_rate': 1.7565798003370932e-07, 'epoch': 3.93}
 25%|██▍ | 15240/61904 [7:27:38<18:03:12, 1.39s/it] {'loss': 2.7777, 'learning_rate': 1.7562556722416698e-07, 'epoch': 3.94}
 25%|██▍ | 15260/61904 [7:28:06<18:09:28, 1.40s/it] {'loss': 2.8035, 'learning_rate': 1.7559315441462464e-07, 'epoch': 3.94}
 25%|██▍ | 15280/61904 [7:28:33<17:15:39, 1.33s/it] {'loss': 2.7233, 'learning_rate': 1.7556074160508233e-07, 'epoch': 3.95}
 25%|██▍ | 15300/61904 [7:29:01<18:27:16, 1.43s/it] {'loss': 2.7346, 'learning_rate': 1.7552832879554e-07, 'epoch': 3.95}
 25%|██▍ | 15320/61904 [7:29:29<17:18:36, 1.34s/it] {'loss': 2.7189, 'learning_rate': 1.7549591598599766e-07, 'epoch': 3.96}
 25%|██▍ | 15340/61904 [7:29:56<17:27:02, 1.35s/it] {'loss': 2.69, 'learning_rate': 1.7546350317645535e-07, 'epoch': 3.96}
 25%|██▍ | 15360/61904 [7:30:24<17:44:00, 1.37s/it] {'loss': 2.7707, 'learning_rate': 1.7543109036691298e-07, 'epoch': 3.97}
 25%|██▍ | 15380/61904 [7:30:52<18:39:12, 1.44s/it] {'loss': 2.7585, 'learning_rate': 1.7539867755737067e-07, 'epoch': 3.97}
 25%|██▍ | 15400/61904 [7:31:20<18:00:06, 1.39s/it] {'loss': 2.6792, 'learning_rate': 1.7536626474782833e-07, 'epoch': 3.98}
 25%|██▍ | 15420/61904 [7:31:48<18:41:03, 1.45s/it] {'loss': 2.7379, 'learning_rate': 1.75333851938286e-07, 'epoch': 3.99}
 25%|██▍ | 15434/61904 [7:32:08<18:01:43, 
1.40s/it] 25%|██▍ | 15435/61904 [7:32:09<18:04:02, 1.40s/it] 25%|██▍ | 15436/61904 [7:32:11<17:48:45, 1.38s/it] 25%|██▍ | 15437/61904 [7:32:12<17:50:32, 1.38s/it] 25%|██▍ | 15438/61904 [7:32:14<17:51:40, 1.38s/it] 25%|██▍ | 15439/61904 [7:32:15<18:02:30, 1.40s/it] 25%|██▍ | 15440/61904 [7:32:16<18:01:17, 1.40s/it] {'loss': 2.6914, 'learning_rate': 1.7530143912874368e-07, 'epoch': 3.99} 25%|██▍ | 15440/61904 [7:32:16<18:01:17, 1.40s/it] 25%|██▍ | 15441/61904 [7:32:18<18:08:52, 1.41s/it] 25%|██▍ | 15442/61904 [7:32:19<17:52:39, 1.39s/it] 25%|██▍ | 15443/61904 [7:32:21<17:43:54, 1.37s/it] 25%|██▍ | 15444/61904 [7:32:22<17:28:12, 1.35s/it] 25%|██▍ | 15445/61904 [7:32:23<17:39:00, 1.37s/it] 25%|██▍ | 15446/61904 [7:32:25<17:34:44, 1.36s/it] 25%|██▍ | 15447/61904 [7:32:26<17:25:56, 1.35s/it] 25%|██▍ | 15448/61904 [7:32:27<17:59:32, 1.39s/it] 25%|██▍ | 15449/61904 [7:32:29<18:04:44, 1.40s/it] 25%|██▍ | 15450/61904 [7:32:30<18:18:09, 1.42s/it] 25%|██▍ | 15451/61904 [7:32:32<18:15:53, 1.42s/it] 25%|██▍ | 15452/61904 [7:32:33<18:22:46, 1.42s/it] 25%|██▍ | 15453/61904 [7:32:35<19:05:34, 1.48s/it] 25%|██▍ | 15454/61904 [7:32:36<19:18:47, 1.50s/it] 25%|██▍ | 15455/61904 [7:32:38<18:45:20, 1.45s/it] 25%|██▍ | 15456/61904 [7:32:39<18:24:17, 1.43s/it] 25%|██▍ | 15457/61904 [7:32:40<18:23:03, 1.42s/it] 25%|██▍ | 15458/61904 [7:32:42<17:54:31, 1.39s/it] 25%|██▍ | 15459/61904 [7:32:43<18:09:57, 1.41s/it] 25%|██▍ | 15460/61904 [7:32:44<17:44:39, 1.38s/it] {'loss': 2.7623, 'learning_rate': 1.7526902631920134e-07, 'epoch': 4.0} 25%|██▍ | 15460/61904 [7:32:44<17:44:39, 1.38s/it] 25%|██▍ | 15461/61904 [7:32:46<17:53:52, 1.39s/it] 25%|██▍ | 15462/61904 [7:32:47<18:32:21, 1.44s/it] 25%|██▍ | 15463/61904 [7:32:49<18:10:34, 1.41s/it] 25%|██▍ | 15464/61904 [7:32:50<18:32:15, 1.44s/it] 25%|██▍ | 15465/61904 [7:32:52<18:07:28, 1.41s/it] 25%|██▍ | 15466/61904 [7:32:53<17:53:13, 1.39s/it] 25%|██▍ | 15467/61904 [7:32:54<17:32:48, 1.36s/it] 25%|██▍ | 15468/61904 [7:32:56<17:28:33, 1.35s/it] 25%|██▍ 
| 15469/61904 [7:32:57<17:28:03, 1.35s/it] 25%|██▍ | 15470/61904 [7:32:58<17:14:51, 1.34s/it] 25%|██▍ | 15471/61904 [7:33:00<17:41:36, 1.37s/it] 25%|██▍ | 15472/61904 [7:33:01<17:17:38, 1.34s/it] 25%|██▍ | 15473/61904 [7:33:03<18:03:29, 1.40s/it] 25%|██▍ | 15474/61904 [7:33:04<17:49:41, 1.38s/it] 25%|██▍ | 15475/61904 [7:33:05<18:01:39, 1.40s/it] 25%|██▌ | 15476/61904 [7:33:07<18:22:36, 1.42s/it] 25%|██▌ | 15477/61904 [7:33:08<17:59:16, 1.39s/it] 25%|██▌ | 15478/61904 [7:33:09<17:51:46, 1.39s/it]
Generation Kwargs: {'max_length': 384, 'max_gen_length': 380, 'num_beams': 5}
0%| | 0/861 [00:00> Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
Non-default generation parameters: {'max_length': 200, 'early_stopping': True, 'num_beams': 5, 'forced_eos_token_id': 2}
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
self.pid = os.fork() 25%|██▌ | 15479/61904 [8:04:32<7294:49:21, 565.67s/it] 25%|██▌ | 15480/61904 [8:04:33<5112:10:07, 396.43s/it] {'loss': 2.7449, 'learning_rate': 1.75236613509659e-07, 'epoch': 4.0} 25%|██▌ | 15480/61904 [8:04:33<5112:10:07, 396.43s/it] 25%|██▌ | 15481/61904 [8:04:35<3583:41:38, 277.91s/it] 25%|██▌ | 15482/61904 [8:04:36<2513:58:58, 194.96s/it] 25%|██▌ | 15483/61904 [8:04:38<1766:08:24, 136.97s/it] 25%|██▌ | 15484/61904 [8:04:39<1242:39:25, 96.37s/it] 25%|██▌ | 15485/61904 [8:04:41<874:35:36, 67.83s/it] 25%|██▌ | 15486/61904 [8:04:42<617:24:03, 47.88s/it] 25%|██▌ | 15487/61904 [8:04:43<437:38:39, 33.94s/it] 25%|██▌ | 15488/61904 [8:04:45<311:45:10, 24.18s/it] 25%|██▌ | 15489/61904 [8:04:46<223:45:15, 17.35s/it] 25%|██▌ | 15490/61904 [8:04:48<161:55:21, 12.56s/it] 25%|██▌ | 15491/61904 [8:04:49<118:26:20, 9.19s/it] 25%|██▌ | 15492/61904 [8:04:50<88:31:35, 6.87s/it] 25%|██▌ | 15493/61904 [8:04:52<67:05:47, 5.20s/it] 25%|██▌ | 15494/61904 [8:04:53<52:12:21, 4.05s/it] 25%|██▌ | 15495/61904 [8:04:55<42:32:50, 3.30s/it] 25%|██▌ | 15496/61904 [8:04:56<35:14:31, 2.73s/it] 25%|██▌ | 15497/61904 [8:04:58<30:37:37, 2.38s/it] 25%|██▌ | 15498/61904 [8:04:59<26:36:55, 2.06s/it] 25%|██▌ | 15499/61904 [8:05:00<23:28:07, 1.82s/it] 25%|██▌ | 15500/61904 [8:05:02<22:19:47, 1.73s/it] {'loss': 2.7855, 'learning_rate': 1.752042007001167e-07, 'epoch': 4.01} 25%|██▌ | 15500/61904 [8:05:02<22:19:47, 1.73s/it] 25%|██▌ | 15501/61904 [8:05:03<21:21:07, 1.66s/it] 25%|██▌ | 15502/61904 [8:05:04<20:07:57, 1.56s/it] 25%|██▌ | 15503/61904 [8:05:06<19:35:31, 1.52s/it] 25%|██▌ | 15504/61904 [8:05:07<19:10:55, 1.49s/it] 25%|██▌ | 15505/61904 [8:05:09<19:07:14, 1.48s/it] 25%|██▌ | 15506/61904 [8:05:10<18:49:05, 1.46s/it] 25%|██▌ | 15507/61904 [8:05:12<18:18:22, 1.42s/it] 25%|██▌ | 15508/61904 [8:05:13<17:48:31, 1.38s/it] 25%|██▌ | 15509/61904 [8:05:14<18:07:42, 1.41s/it] 25%|██▌ | 15510/61904 [8:05:16<17:52:06, 1.39s/it] 25%|██▌ | 15511/61904 [8:05:17<18:04:05, 1.40s/it] 25%|██▌ | 
15512/61904 [8:05:18<17:44:27, 1.38s/it] 25%|██▌ | 15513/61904 [8:05:20<17:35:33, 1.37s/it] 25%|██▌ | 15514/61904 [8:05:21<17:53:45, 1.39s/it] 25%|██▌ | 15515/61904 [8:05:22<17:16:34, 1.34s/it] 25%|██▌ | 15516/61904 [8:05:24<17:45:27, 1.38s/it] 25%|██▌ | 15517/61904 [8:05:25<17:25:21, 1.35s/it]