[2023-02-22 16:46:18,059][15372] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-22 16:46:18,063][15372] Rollout worker 0 uses device cpu
[2023-02-22 16:46:18,067][15372] Rollout worker 1 uses device cpu
[2023-02-22 16:46:18,069][15372] Rollout worker 2 uses device cpu
[2023-02-22 16:46:18,071][15372] Rollout worker 3 uses device cpu
[2023-02-22 16:46:18,072][15372] Rollout worker 4 uses device cpu
[2023-02-22 16:46:18,073][15372] Rollout worker 5 uses device cpu
[2023-02-22 16:46:18,074][15372] Rollout worker 6 uses device cpu
[2023-02-22 16:46:18,075][15372] Rollout worker 7 uses device cpu
[2023-02-22 16:46:18,077][15372] Rollout worker 8 uses device cpu
[2023-02-22 16:46:18,078][15372] Rollout worker 9 uses device cpu
[2023-02-22 16:46:18,079][15372] Rollout worker 10 uses device cpu
[2023-02-22 16:46:18,080][15372] Rollout worker 11 uses device cpu
[2023-02-22 16:46:18,081][15372] Rollout worker 12 uses device cpu
[2023-02-22 16:46:18,084][15372] Rollout worker 13 uses device cpu
[2023-02-22 16:46:18,085][15372] Rollout worker 14 uses device cpu
[2023-02-22 16:46:18,086][15372] Rollout worker 15 uses device cpu
[2023-02-22 16:46:18,631][15372] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 16:46:18,633][15372] InferenceWorker_p0-w0: min num requests: 5
[2023-02-22 16:46:18,690][15372] Starting all processes...
[2023-02-22 16:46:18,691][15372] Starting process learner_proc0
[2023-02-22 16:46:18,744][15372] Starting all processes...
[2023-02-22 16:46:18,755][15372] Starting process inference_proc0-0
[2023-02-22 16:46:18,756][15372] Starting process rollout_proc0
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc1
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc2
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc3
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc4
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc5
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc6
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc7
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc8
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc9
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc10
[2023-02-22 16:46:18,758][15372] Starting process rollout_proc11
[2023-02-22 16:46:18,776][15372] Starting process rollout_proc12
[2023-02-22 16:46:18,785][15372] Starting process rollout_proc13
[2023-02-22 16:46:18,785][15372] Starting process rollout_proc14
[2023-02-22 16:46:19,208][15372] Starting process rollout_proc15
[2023-02-22 16:46:37,976][17928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 16:46:37,984][17928] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-22 16:46:38,711][17954] Worker 4 uses CPU cores [0]
[2023-02-22 16:46:39,021][17950] Worker 0 uses CPU cores [0]
[2023-02-22 16:46:39,027][17949] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 16:46:39,032][17949] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-22 16:46:39,053][17958] Worker 8 uses CPU cores [0]
[2023-02-22 16:46:39,054][17951] Worker 2 uses CPU cores [0]
[2023-02-22 16:46:39,096][15372] Heartbeat connected on RolloutWorker_w4
[2023-02-22 16:46:39,181][15372] Heartbeat connected on RolloutWorker_w2
[2023-02-22 16:46:39,187][15372] Heartbeat connected on RolloutWorker_w8
[2023-02-22 16:46:39,188][15372] Heartbeat connected on RolloutWorker_w0
[2023-02-22 16:46:39,246][17952] Worker 3 uses CPU cores [1]
[2023-02-22 16:46:39,257][17953] Worker 1 uses CPU cores [1]
[2023-02-22 16:46:39,320][17961] Worker 11 uses CPU cores [1]
[2023-02-22 16:46:39,350][17955] Worker 5 uses CPU cores [1]
[2023-02-22 16:46:39,371][17959] Worker 9 uses CPU cores [1]
[2023-02-22 16:46:39,422][17967] Worker 13 uses CPU cores [1]
[2023-02-22 16:46:39,459][17968] Worker 14 uses CPU cores [0]
[2023-02-22 16:46:39,461][15372] Heartbeat connected on RolloutWorker_w1
[2023-02-22 16:46:39,478][17962] Worker 12 uses CPU cores [0]
[2023-02-22 16:46:39,479][17970] Worker 15 uses CPU cores [1]
[2023-02-22 16:46:39,481][17957] Worker 7 uses CPU cores [1]
[2023-02-22 16:46:39,488][15372] Heartbeat connected on RolloutWorker_w3
[2023-02-22 16:46:39,494][15372] Heartbeat connected on RolloutWorker_w11
[2023-02-22 16:46:39,536][15372] Heartbeat connected on RolloutWorker_w5
[2023-02-22 16:46:39,555][15372] Heartbeat connected on RolloutWorker_w9
[2023-02-22 16:46:39,569][17956] Worker 6 uses CPU cores [0]
[2023-02-22 16:46:39,570][15372] Heartbeat connected on RolloutWorker_w13
[2023-02-22 16:46:39,578][17960] Worker 10 uses CPU cores [0]
[2023-02-22 16:46:39,580][15372] Heartbeat connected on RolloutWorker_w12
[2023-02-22 16:46:39,586][15372] Heartbeat connected on RolloutWorker_w14
[2023-02-22 16:46:39,592][15372] Heartbeat connected on RolloutWorker_w6
[2023-02-22 16:46:39,601][15372] Heartbeat connected on RolloutWorker_w15
[2023-02-22 16:46:39,605][15372] Heartbeat connected on RolloutWorker_w10
[2023-02-22 16:46:39,744][15372] Heartbeat connected on RolloutWorker_w7
[2023-02-22 16:46:40,078][17949] Num visible devices: 1
[2023-02-22 16:46:40,081][15372] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-22 16:46:40,087][17928] Num visible devices: 1
[2023-02-22 16:46:40,094][17928] Starting seed is not provided
[2023-02-22 16:46:40,095][17928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 16:46:40,096][17928] Initializing actor-critic model on device cuda:0
[2023-02-22 16:46:40,097][17928] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 16:46:40,098][15372] Heartbeat connected on Batcher_0
[2023-02-22 16:46:40,104][17928] RunningMeanStd input shape: (1,)
[2023-02-22 16:46:40,124][17928] ConvEncoder: input_channels=3
[2023-02-22 16:46:40,465][17928] Conv encoder output size: 512
[2023-02-22 16:46:40,465][17928] Policy head output size: 512
[2023-02-22 16:46:40,528][17928] Created Actor Critic model with architecture:
[2023-02-22 16:46:40,528][17928] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-22 16:46:47,576][17928] Using optimizer
[2023-02-22 16:46:47,578][17928] No checkpoints found
[2023-02-22 16:46:47,578][17928] Did not load from checkpoint, starting from scratch!
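The log confirms a (3, 72, 128) observation flattening to a 512-dim conv encoder output, but not the kernel sizes. A minimal sketch of the shape arithmetic, assuming Sample Factory's default Atari-style kernel/stride pairs (8/4, 4/2, 3/2) and 128 channels in the last conv layer; those values are assumptions, not printed above.

```python
# Shape arithmetic for a 72x128 observation through three conv layers.
# Kernel/stride pairs and the final channel count are ASSUMED defaults;
# the log above only confirms the 512-dim encoder output.

def conv_out(size, kernel, stride, padding=0):
    """Output spatial size of one conv layer (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel) // stride + 1

def encoder_shapes(h=72, w=128, layers=((8, 4), (4, 2), (3, 2))):
    """Spatial (h, w) after each assumed conv layer, starting from the input."""
    shapes = [(h, w)]
    for kernel, stride in layers:
        h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)
        shapes.append((h, w))
    return shapes

shapes = encoder_shapes()
print(shapes)                              # [(72, 128), (17, 31), (7, 14), (3, 6)]
flat = 128 * shapes[-1][0] * shapes[-1][1]  # channels * h * w before the Linear -> 512
print(flat)                                # 2304
```

Under these assumptions the flattened 2304 features would feed the single Linear+ELU pair in `mlp_layers` to produce the 512-dim output the log reports.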
[2023-02-22 16:46:47,579][17928] Initialized policy 0 weights for model version 0
[2023-02-22 16:46:47,582][17928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 16:46:47,590][17928] LearnerWorker_p0 finished initialization!
[2023-02-22 16:46:47,591][15372] Heartbeat connected on LearnerWorker_p0
[2023-02-22 16:46:47,839][17949] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 16:46:47,840][17949] RunningMeanStd input shape: (1,)
[2023-02-22 16:46:47,852][17949] ConvEncoder: input_channels=3
[2023-02-22 16:46:47,952][17949] Conv encoder output size: 512
[2023-02-22 16:46:47,952][17949] Policy head output size: 512
[2023-02-22 16:46:49,555][15372] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 16:46:50,266][15372] Inference worker 0-0 is ready!
[2023-02-22 16:46:50,268][15372] All inference workers are ready! Signal rollout workers to start!
[2023-02-22 16:46:50,518][17957] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,512][17953] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,529][17952] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,561][17961] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,577][17967] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,593][17970] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,606][17955] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,605][17950] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,615][17959] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,607][17960] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,601][17954] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,615][17968] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,624][17962] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,629][17958] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,636][17956] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:50,633][17951] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 16:46:51,443][17956] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-22 16:46:51,442][17951] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-22 16:46:51,449][17951] EvtLoop [rollout_proc2_evt_loop, process=rollout_proc2] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:46:51,457][17951] Unhandled exception in evt loop rollout_proc2_evt_loop
[2023-02-22 16:46:51,447][17956] EvtLoop [rollout_proc6_evt_loop, process=rollout_proc6] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:46:51,462][17956] Unhandled exception in evt loop rollout_proc6_evt_loop
[2023-02-22 16:46:53,083][17950] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,325][17960] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,513][17957] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,526][17967] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,529][17961] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,503][17952] Decorrelating experience for 0 frames...
[2023-02-22 16:46:53,630][17955] Decorrelating experience for 0 frames...
[2023-02-22 16:46:54,558][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 16:46:54,834][17954] Decorrelating experience for 0 frames...
[2023-02-22 16:46:54,849][17950] Decorrelating experience for 32 frames...
[2023-02-22 16:46:54,953][17962] Decorrelating experience for 0 frames...
[2023-02-22 16:46:55,105][17958] Decorrelating experience for 0 frames...
[2023-02-22 16:46:55,917][17952] Decorrelating experience for 32 frames...
[2023-02-22 16:46:55,919][17967] Decorrelating experience for 32 frames...
[2023-02-22 16:46:55,921][17961] Decorrelating experience for 32 frames...
[2023-02-22 16:46:56,025][17957] Decorrelating experience for 32 frames...
[2023-02-22 16:46:56,909][17962] Decorrelating experience for 32 frames...
[2023-02-22 16:46:56,915][17968] Decorrelating experience for 0 frames...
[2023-02-22 16:46:56,960][17955] Decorrelating experience for 32 frames...
[2023-02-22 16:46:57,266][17950] Decorrelating experience for 64 frames...
[2023-02-22 16:46:57,902][17961] Decorrelating experience for 64 frames...
[2023-02-22 16:46:57,906][17967] Decorrelating experience for 64 frames...
[2023-02-22 16:46:57,939][17960] Decorrelating experience for 32 frames...
[2023-02-22 16:46:58,735][17968] Decorrelating experience for 32 frames...
[2023-02-22 16:46:58,769][17958] Decorrelating experience for 32 frames...
[2023-02-22 16:46:58,846][17962] Decorrelating experience for 64 frames...
[2023-02-22 16:46:59,254][17955] Decorrelating experience for 64 frames...
[2023-02-22 16:46:59,555][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 16:47:00,096][17960] Decorrelating experience for 64 frames...
[2023-02-22 16:47:00,191][17970] Decorrelating experience for 0 frames...
[2023-02-22 16:47:00,235][17953] Decorrelating experience for 0 frames...
[2023-02-22 16:47:00,320][17967] Decorrelating experience for 96 frames...
[2023-02-22 16:47:00,323][17961] Decorrelating experience for 96 frames...
[2023-02-22 16:47:00,846][17952] Decorrelating experience for 64 frames...
[2023-02-22 16:47:00,892][17968] Decorrelating experience for 64 frames...
[2023-02-22 16:47:01,572][17953] Decorrelating experience for 32 frames...
[2023-02-22 16:47:01,647][17954] Decorrelating experience for 32 frames...
[2023-02-22 16:47:01,745][17958] Decorrelating experience for 64 frames...
[2023-02-22 16:47:01,750][17962] Decorrelating experience for 96 frames...
[2023-02-22 16:47:02,382][17950] Decorrelating experience for 96 frames...
[2023-02-22 16:47:02,382][17961] Decorrelating experience for 128 frames...
[2023-02-22 16:47:02,413][17957] Decorrelating experience for 64 frames...
[2023-02-22 16:47:03,078][17954] Decorrelating experience for 64 frames...
[2023-02-22 16:47:03,226][17958] Decorrelating experience for 96 frames...
[2023-02-22 16:47:03,611][17968] Decorrelating experience for 96 frames...
[2023-02-22 16:47:03,838][17952] Decorrelating experience for 96 frames...
[2023-02-22 16:47:03,900][17953] Decorrelating experience for 64 frames...
[2023-02-22 16:47:03,972][17959] Decorrelating experience for 0 frames...
[2023-02-22 16:47:04,117][17967] Decorrelating experience for 128 frames...
[2023-02-22 16:47:04,127][17955] Decorrelating experience for 96 frames...
[2023-02-22 16:47:04,174][17954] Decorrelating experience for 96 frames...
[2023-02-22 16:47:04,555][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 16:47:04,973][17968] Decorrelating experience for 128 frames...
[2023-02-22 16:47:05,093][17961] Decorrelating experience for 160 frames...
[2023-02-22 16:47:05,156][17950] Decorrelating experience for 128 frames...
[2023-02-22 16:47:05,517][17958] Decorrelating experience for 128 frames...
[2023-02-22 16:47:05,863][17962] Decorrelating experience for 128 frames...
[2023-02-22 16:47:06,110][17959] Decorrelating experience for 32 frames...
[2023-02-22 16:47:06,245][17970] Decorrelating experience for 32 frames...
[2023-02-22 16:47:06,248][17957] Decorrelating experience for 96 frames...
[2023-02-22 16:47:06,302][17958] Decorrelating experience for 160 frames...
[2023-02-22 16:47:07,051][17962] Decorrelating experience for 160 frames...
[2023-02-22 16:47:07,083][17967] Decorrelating experience for 160 frames...
[2023-02-22 16:47:07,273][17955] Decorrelating experience for 128 frames...
[2023-02-22 16:47:07,290][17958] Decorrelating experience for 192 frames...
[2023-02-22 16:47:08,012][17968] Decorrelating experience for 160 frames...
[2023-02-22 16:47:08,188][17950] Decorrelating experience for 160 frames...
[2023-02-22 16:47:08,379][17959] Decorrelating experience for 64 frames...
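The "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)" lines report throughput over sliding time windows, showing `nan` until enough samples exist. A minimal sketch of that kind of windowed rate, assuming a deque of (timestamp, cumulative frame count) samples; this is illustrative, not the Runner's actual bookkeeping.

```python
from collections import deque

class WindowedFps:
    """Report FPS over a sliding window of (timestamp, total_frames) samples."""

    def __init__(self, window_sec=10.0):
        self.window_sec = window_sec
        self.samples = deque()

    def record(self, now, total_frames):
        self.samples.append((now, total_frames))
        # Drop samples older than the window, keeping one boundary sample.
        while len(self.samples) > 1 and now - self.samples[1][0] > self.window_sec:
            self.samples.popleft()

    def fps(self):
        if len(self.samples) < 2:
            return float("nan")  # mirrors the "Fps is (10 sec: nan, ...)" line
        (t0, f0), (t1, f1) = self.samples[0], self.samples[-1]
        return (f1 - f0) / (t1 - t0)

# Usage: sample once per second at a steady 100 frames/sec.
meter = WindowedFps(window_sec=10.0)
for t in range(6):
    meter.record(float(t), t * 100)
print(meter.fps())  # 100.0
```

In this run all three windows read 0.0 throughout, because the rollout workers spent the whole session in init/decorrelation and never delivered trajectories to the learner.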
[2023-02-22 16:47:08,435][17952] Decorrelating experience for 128 frames...
[2023-02-22 16:47:08,440][17953] Decorrelating experience for 96 frames...
[2023-02-22 16:47:09,121][17970] Decorrelating experience for 64 frames...
[2023-02-22 16:47:09,407][17968] Decorrelating experience for 192 frames...
[2023-02-22 16:47:09,520][17960] Decorrelating experience for 96 frames...
[2023-02-22 16:47:09,555][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 16:47:09,582][17955] Decorrelating experience for 160 frames...
[2023-02-22 16:47:09,590][17954] Decorrelating experience for 128 frames...
[2023-02-22 16:47:10,043][17967] Decorrelating experience for 192 frames...
[2023-02-22 16:47:10,498][17962] Decorrelating experience for 192 frames...
[2023-02-22 16:47:11,226][17953] Decorrelating experience for 128 frames...
[2023-02-22 16:47:11,286][17968] Decorrelating experience for 224 frames...
[2023-02-22 16:47:11,462][17957] Decorrelating experience for 128 frames...
[2023-02-22 16:47:11,766][17970] Decorrelating experience for 96 frames...
[2023-02-22 16:47:12,156][17959] Decorrelating experience for 96 frames...
[2023-02-22 16:47:12,901][17958] Decorrelating experience for 224 frames...
[2023-02-22 16:47:13,181][17962] Decorrelating experience for 224 frames...
[2023-02-22 16:47:13,313][17955] Decorrelating experience for 192 frames...
[2023-02-22 16:47:13,441][17950] Decorrelating experience for 192 frames...
[2023-02-22 16:47:14,102][17960] Decorrelating experience for 128 frames...
[2023-02-22 16:47:14,529][17960] VizDoom game.init() threw an exception SignalException('Signal SIGINT received. ViZDoom instance has been closed.'). Terminate process...
[2023-02-22 16:47:14,530][17960] EvtLoop [rollout_proc10_evt_loop, process=rollout_proc10] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:47:14,542][17960] Unhandled exception in evt loop rollout_proc10_evt_loop
[2023-02-22 16:47:14,490][15372] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 15372], exiting...
[2023-02-22 16:47:14,553][17928] Stopping Batcher_0...
[2023-02-22 16:47:14,554][17928] Loop batcher_evt_loop terminating...
[2023-02-22 16:47:14,555][17928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000000_0.pth...
[2023-02-22 16:47:14,552][15372] Runner profile tree view:
main_loop: 55.8620
[2023-02-22 16:47:14,565][15372] Collected {0: 0}, FPS: 0.0
[2023-02-22 16:47:14,589][17928] Stopping LearnerWorker_p0...
[2023-02-22 16:47:14,590][17928] Loop learner_proc0_evt_loop terminating...
[2023-02-22 16:47:14,793][17959] VizDoom game.init() threw an exception SignalException('Signal SIGINT received. ViZDoom instance has been closed.'). Terminate process...
[2023-02-22 16:47:14,853][17967] VizDoom game.init() threw an exception SignalException('Signal SIGINT received. ViZDoom instance has been closed.'). Terminate process...
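The Ctrl+C at 16:47:14 reached the workers mid-init as a raw SIGINT, so each one surfaced a SignalException traceback, while the main process still shut down cleanly and saved checkpoint_000000000_0.pth. The usual pattern for the clean side of that behavior is to convert SIGINT into a cooperative shutdown flag checked at safe points; a minimal sketch with hypothetical names, not Sample Factory's actual handler.

```python
import signal

shutdown_requested = False

def _request_shutdown(signum, frame):
    """Record the interrupt; the main loop checks the flag at a safe point."""
    global shutdown_requested
    shutdown_requested = True

signal.signal(signal.SIGINT, _request_shutdown)

# Simulate Ctrl+C arriving while work is in progress.
signal.raise_signal(signal.SIGINT)

if shutdown_requested:
    # A real runner would save a checkpoint here, then stop its workers.
    print("shutdown requested; saving checkpoint before exit")
```

Child processes that do not install such a handler (as the rollout workers here, interrupted inside `game.init()`) instead see the signal propagate into whatever call is running, which is why the log ends in a cascade of identical tracebacks.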
[2023-02-22 16:47:14,883][17961] VizDoom game.init() threw an exception SignalException('Signal SIGINT received. ViZDoom instance has been closed.'). Terminate process... [2023-02-22 16:47:14,957][17953] VizDoom game.init() threw an exception SignalException('Signal SIGINT received. ViZDoom instance has been closed.'). Terminate process... [2023-02-22 16:47:14,794][17959] EvtLoop [rollout_proc9_evt_loop, process=rollout_proc9] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=() Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init self.game.init() vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed. During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init env_runner.init(self.timing) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init self._reset() File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0 File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset return self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset obs, info = self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset obs, info = self.env.reset(**kwargs) File 
"/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset return self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset obs, info = self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset obs, info = self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset return self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset return self.env.reset(**kwargs) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset self._ensure_initialized() File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized self.initialize() File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize self._game_init() File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init raise EnvCriticalError() sample_factory.envs.env_utils.EnvCriticalError [2023-02-22 16:47:14,978][17959] Unhandled exception in evt loop rollout_proc9_evt_loop [2023-02-22 16:47:14,862][17967] EvtLoop [rollout_proc13_evt_loop, process=rollout_proc13] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=() Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init self.game.init() vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed. 
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:47:14,982][17967] Unhandled exception in evt loop rollout_proc13_evt_loop
[2023-02-22 16:47:14,914][17955] EvtLoop [rollout_proc5_evt_loop, process=rollout_proc5] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 439, in _reset
    observations, rew, terminated, truncated, info = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-22 16:47:15,011][17955] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc5_evt_loop
[2023-02-22 16:47:15,064][17958] Stopping RolloutWorker_w8...
[2023-02-22 16:47:15,065][17958] Loop rollout_proc8_evt_loop terminating...
[2023-02-22 16:47:15,081][17968] Stopping RolloutWorker_w14...
[2023-02-22 16:47:15,081][17968] Loop rollout_proc14_evt_loop terminating...
[2023-02-22 16:47:14,920][17961] EvtLoop [rollout_proc11_evt_loop, process=rollout_proc11] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:47:15,094][17962] Stopping RolloutWorker_w12...
[2023-02-22 16:47:15,108][17962] Loop rollout_proc12_evt_loop terminating...
[2023-02-22 16:47:15,094][17961] Unhandled exception in evt loop rollout_proc11_evt_loop
[2023-02-22 16:47:14,961][17953] EvtLoop [rollout_proc1_evt_loop, process=rollout_proc1] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed)  # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-22 16:47:15,178][17953] Unhandled exception in evt loop rollout_proc1_evt_loop
[2023-02-22 16:47:17,862][17957] Decorrelating experience for 160 frames...
[2023-02-22 16:47:17,865][17970] Decorrelating experience for 128 frames...
[2023-02-22 16:47:19,322][17952] Decorrelating experience for 160 frames...
[2023-02-22 16:47:19,532][17970] Decorrelating experience for 160 frames...
[2023-02-22 16:47:19,602][17957] Decorrelating experience for 192 frames...
[2023-02-22 16:47:20,489][17949] Weights refcount: 2 0
[2023-02-22 16:47:20,519][17970] Decorrelating experience for 192 frames...
[2023-02-22 16:47:20,594][17957] Decorrelating experience for 224 frames...
[2023-02-22 16:47:20,695][17949] Stopping InferenceWorker_p0-w0...
[2023-02-22 16:47:20,698][17949] Loop inference_proc0-0_evt_loop terminating...
[2023-02-22 16:47:21,002][17954] Decorrelating experience for 160 frames...
[2023-02-22 16:47:21,120][17957] Stopping RolloutWorker_w7...
[2023-02-22 16:47:21,121][17957] Loop rollout_proc7_evt_loop terminating...
[2023-02-22 16:47:21,853][17950] Decorrelating experience for 224 frames...
[2023-02-22 16:47:22,272][17950] Stopping RolloutWorker_w0...
[2023-02-22 16:47:22,274][17950] Loop rollout_proc0_evt_loop terminating...
[2023-02-22 16:47:22,297][17970] Decorrelating experience for 224 frames...
[2023-02-22 16:47:22,563][17954] Decorrelating experience for 192 frames...
[2023-02-22 16:47:22,611][17970] Stopping RolloutWorker_w15...
[2023-02-22 16:47:22,613][17970] Loop rollout_proc15_evt_loop terminating...
[2023-02-22 16:47:22,930][17952] Decorrelating experience for 192 frames... [2023-02-22 16:47:23,218][17954] Decorrelating experience for 224 frames... [2023-02-22 16:47:23,398][17954] Stopping RolloutWorker_w4... [2023-02-22 16:47:23,399][17954] Loop rollout_proc4_evt_loop terminating... [2023-02-22 16:47:23,593][17952] Decorrelating experience for 224 frames... [2023-02-22 16:47:23,813][17952] Stopping RolloutWorker_w3... [2023-02-22 16:47:23,815][17952] Loop rollout_proc3_evt_loop terminating... [2023-02-22 17:04:32,652][15372] Environment doom_basic already registered, overwriting... [2023-02-22 17:04:32,655][15372] Environment doom_two_colors_easy already registered, overwriting... [2023-02-22 17:04:32,656][15372] Environment doom_two_colors_hard already registered, overwriting... [2023-02-22 17:04:32,657][15372] Environment doom_dm already registered, overwriting... [2023-02-22 17:04:32,659][15372] Environment doom_dwango5 already registered, overwriting... [2023-02-22 17:04:32,662][15372] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-22 17:04:32,663][15372] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-22 17:04:32,664][15372] Environment doom_my_way_home already registered, overwriting... [2023-02-22 17:04:32,666][15372] Environment doom_deadly_corridor already registered, overwriting... [2023-02-22 17:04:32,667][15372] Environment doom_defend_the_center already registered, overwriting... [2023-02-22 17:04:32,668][15372] Environment doom_defend_the_line already registered, overwriting... [2023-02-22 17:04:32,670][15372] Environment doom_health_gathering already registered, overwriting... [2023-02-22 17:04:32,671][15372] Environment doom_health_gathering_supreme already registered, overwriting... [2023-02-22 17:04:32,672][15372] Environment doom_battle already registered, overwriting... 
[2023-02-22 17:04:32,673][15372] Environment doom_battle2 already registered, overwriting... [2023-02-22 17:04:32,675][15372] Environment doom_duel_bots already registered, overwriting... [2023-02-22 17:04:32,676][15372] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-22 17:04:32,677][15372] Environment doom_duel already registered, overwriting... [2023-02-22 17:04:32,679][15372] Environment doom_deathmatch_full already registered, overwriting... [2023-02-22 17:04:32,680][15372] Environment doom_benchmark already registered, overwriting... [2023-02-22 17:04:32,685][15372] register_encoder_factory: [2023-02-22 17:05:41,593][15372] Environment doom_basic already registered, overwriting... [2023-02-22 17:05:41,596][15372] Environment doom_two_colors_easy already registered, overwriting... [2023-02-22 17:05:41,601][15372] Environment doom_two_colors_hard already registered, overwriting... [2023-02-22 17:05:41,604][15372] Environment doom_dm already registered, overwriting... [2023-02-22 17:05:41,605][15372] Environment doom_dwango5 already registered, overwriting... [2023-02-22 17:05:41,606][15372] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-22 17:05:41,609][15372] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-22 17:05:41,611][15372] Environment doom_my_way_home already registered, overwriting... [2023-02-22 17:05:41,615][15372] Environment doom_deadly_corridor already registered, overwriting... [2023-02-22 17:05:41,616][15372] Environment doom_defend_the_center already registered, overwriting... [2023-02-22 17:05:41,617][15372] Environment doom_defend_the_line already registered, overwriting... [2023-02-22 17:05:41,618][15372] Environment doom_health_gathering already registered, overwriting... [2023-02-22 17:05:41,619][15372] Environment doom_health_gathering_supreme already registered, overwriting... 
[2023-02-22 17:05:41,623][15372] Environment doom_battle already registered, overwriting... [2023-02-22 17:05:41,624][15372] Environment doom_battle2 already registered, overwriting... [2023-02-22 17:05:41,626][15372] Environment doom_duel_bots already registered, overwriting... [2023-02-22 17:05:41,628][15372] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-22 17:05:41,630][15372] Environment doom_duel already registered, overwriting... [2023-02-22 17:05:41,631][15372] Environment doom_deathmatch_full already registered, overwriting... [2023-02-22 17:05:41,634][15372] Environment doom_benchmark already registered, overwriting... [2023-02-22 17:05:41,635][15372] register_encoder_factory: [2023-02-22 17:05:41,666][15372] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-22 17:05:41,667][15372] Overriding arg 'num_workers' with value 8 passed from command line [2023-02-22 17:05:41,669][15372] Overriding arg 'num_envs_per_worker' with value 4 passed from command line [2023-02-22 17:05:41,670][15372] Overriding arg 'num_epochs' with value 2 passed from command line [2023-02-22 17:05:41,671][15372] Overriding arg 'learning_rate' with value 0.0003 passed from command line [2023-02-22 17:05:41,677][15372] Experiment dir /content/train_dir/default_experiment already exists! [2023-02-22 17:05:41,680][15372] Resuming existing experiment from /content/train_dir/default_experiment... 
[2023-02-22 17:05:41,681][15372] Weights and Biases integration disabled [2023-02-22 17:05:41,684][15372] Environment var CUDA_VISIBLE_DEVICES is 0 [2023-02-22 17:05:43,760][15372] Starting experiment with the following configuration: help=False algo=APPO env=doom_health_gathering_supreme experiment=default_experiment train_dir=/content/train_dir restart_behavior=resume device=gpu seed=None num_policies=1 async_rl=True serial_mode=False batched_sampling=False num_batches_to_accumulate=2 worker_num_splits=2 policy_workers_per_policy=1 max_policy_lag=1000 num_workers=8 num_envs_per_worker=4 batch_size=1024 num_batches_per_epoch=1 num_epochs=2 rollout=32 recurrence=32 shuffle_minibatches=False gamma=0.99 reward_scale=1.0 reward_clip=1000.0 value_bootstrap=False normalize_returns=True exploration_loss_coeff=0.001 value_loss_coeff=0.5 kl_loss_coeff=0.0 exploration_loss=symmetric_kl gae_lambda=0.95 ppo_clip_ratio=0.1 ppo_clip_value=0.2 with_vtrace=False vtrace_rho=1.0 vtrace_c=1.0 optimizer=adam adam_eps=1e-06 adam_beta1=0.9 adam_beta2=0.999 max_grad_norm=4.0 learning_rate=0.0003 lr_schedule=constant lr_schedule_kl_threshold=0.008 lr_adaptive_min=1e-06 lr_adaptive_max=0.01 obs_subtract_mean=0.0 obs_scale=255.0 normalize_input=True normalize_input_keys=None decorrelate_experience_max_seconds=0 decorrelate_envs_on_one_worker=True actor_worker_gpus=[] set_workers_cpu_affinity=True force_envs_single_thread=False default_niceness=0 log_to_file=True experiment_summaries_interval=10 flush_summaries_interval=30 stats_avg=100 summaries_use_frameskip=True heartbeat_interval=20 heartbeat_reporting_interval=600 train_for_env_steps=12000000 train_for_seconds=10000000000 save_every_sec=120 keep_checkpoints=2 load_checkpoint_kind=latest save_milestones_sec=-1 save_best_every_sec=5 save_best_metric=reward save_best_after=100000 benchmark=False encoder_mlp_layers=[512, 512] encoder_conv_architecture=convnet_simple encoder_conv_mlp_layers=[512] use_rnn=True rnn_size=512 rnn_type=gru 
rnn_num_layers=1 decoder_mlp_layers=[] nonlinearity=elu policy_initialization=orthogonal policy_init_gain=1.0 actor_critic_share_weights=True adaptive_stddev=True continuous_tanh_scale=0.0 initial_stddev=1.0 use_env_info_cache=False env_gpu_actions=False env_gpu_observations=True env_frameskip=4 env_framestack=1 pixel_format=CHW use_record_episode_statistics=False with_wandb=False wandb_user=None wandb_project=sample_factory wandb_group=None wandb_job_type=SF wandb_tags=[] with_pbt=False pbt_mix_policies_in_one_env=True pbt_period_env_steps=5000000 pbt_start_mutation=20000000 pbt_replace_fraction=0.3 pbt_mutation_rate=0.15 pbt_replace_reward_gap=0.1 pbt_replace_reward_gap_absolute=1e-06 pbt_optimize_gamma=False pbt_target_objective=true_objective pbt_perturb_min=1.1 pbt_perturb_max=1.5 num_agents=-1 num_humans=0 num_bots=-1 start_bot_difficulty=None timelimit=None res_w=128 res_h=72 wide_aspect_ratio=False eval_env_frameskip=1 fps=35 command_line=--env=doom_health_gathering_supreme --num_workers=16 --num_envs_per_worker=8 --train_for_env_steps=12000000 cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 16, 'num_envs_per_worker': 8, 'train_for_env_steps': 12000000} git_hash=unknown git_repo_name=not a git repository [2023-02-22 17:05:43,762][15372] Saving configuration to /content/train_dir/default_experiment/config.json... 
[2023-02-22 17:05:43,769][15372] Rollout worker 0 uses device cpu [2023-02-22 17:05:43,770][15372] Rollout worker 1 uses device cpu [2023-02-22 17:05:43,772][15372] Rollout worker 2 uses device cpu [2023-02-22 17:05:43,775][15372] Rollout worker 3 uses device cpu [2023-02-22 17:05:43,778][15372] Rollout worker 4 uses device cpu [2023-02-22 17:05:43,780][15372] Rollout worker 5 uses device cpu [2023-02-22 17:05:43,782][15372] Rollout worker 6 uses device cpu [2023-02-22 17:05:43,783][15372] Rollout worker 7 uses device cpu [2023-02-22 17:05:43,928][15372] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-22 17:05:43,930][15372] InferenceWorker_p0-w0: min num requests: 2 [2023-02-22 17:05:43,967][15372] Starting all processes... [2023-02-22 17:05:43,972][15372] Starting process learner_proc0 [2023-02-22 17:05:44,029][15372] Starting all processes... [2023-02-22 17:05:44,038][15372] Starting process inference_proc0-0 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc0 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc1 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc2 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc3 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc4 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc5 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc6 [2023-02-22 17:05:44,039][15372] Starting process rollout_proc7 [2023-02-22 17:05:53,839][33579] Worker 0 uses CPU cores [0] [2023-02-22 17:05:53,876][33564] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-22 17:05:53,880][33564] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-02-22 17:05:54,085][33585] Worker 6 uses CPU cores [0] [2023-02-22 17:05:54,139][33582] Worker 1 uses CPU cores [1] [2023-02-22 17:05:54,173][33584] Worker 4 uses CPU cores [0] [2023-02-22 17:05:54,376][33578] Using GPUs [0] for process 0 (actually maps to GPUs 
[0]) [2023-02-22 17:05:54,376][33578] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-02-22 17:05:54,521][33580] Worker 2 uses CPU cores [0] [2023-02-22 17:05:54,543][33581] Worker 3 uses CPU cores [1] [2023-02-22 17:05:54,590][33586] Worker 7 uses CPU cores [1] [2023-02-22 17:05:54,601][33583] Worker 5 uses CPU cores [1] [2023-02-22 17:05:54,940][33578] Num visible devices: 1 [2023-02-22 17:05:54,942][33564] Num visible devices: 1 [2023-02-22 17:05:54,958][33564] Starting seed is not provided [2023-02-22 17:05:54,959][33564] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-22 17:05:54,960][33564] Initializing actor-critic model on device cuda:0 [2023-02-22 17:05:54,960][33564] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 17:05:54,962][33564] RunningMeanStd input shape: (1,) [2023-02-22 17:05:54,981][33564] ConvEncoder: input_channels=3 [2023-02-22 17:05:55,168][33564] Conv encoder output size: 512 [2023-02-22 17:05:55,169][33564] Policy head output size: 512 [2023-02-22 17:05:55,193][33564] Created Actor Critic model with architecture: [2023-02-22 17:05:55,194][33564] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) 
(1): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) [2023-02-22 17:05:58,001][33564] Using optimizer [2023-02-22 17:05:58,003][33564] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000000_0.pth... [2023-02-22 17:05:58,015][33564] Loading model from checkpoint [2023-02-22 17:05:58,017][33564] Loaded experiment state at self.train_step=0, self.env_steps=0 [2023-02-22 17:05:58,017][33564] Initialized policy 0 weights for model version 0 [2023-02-22 17:05:58,022][33564] LearnerWorker_p0 finished initialization! [2023-02-22 17:05:58,024][33564] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-22 17:05:58,224][33578] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 17:05:58,225][33578] RunningMeanStd input shape: (1,) [2023-02-22 17:05:58,237][33578] ConvEncoder: input_channels=3 [2023-02-22 17:05:58,331][33578] Conv encoder output size: 512 [2023-02-22 17:05:58,331][33578] Policy head output size: 512 [2023-02-22 17:06:00,445][15372] Inference worker 0-0 is ready! [2023-02-22 17:06:00,447][15372] All inference workers are ready! Signal rollout workers to start! 
[2023-02-22 17:06:00,545][33579] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,547][33584] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,551][33585] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,555][33586] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,548][33580] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,557][33582] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,556][33581] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:00,563][33583] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 17:06:01,671][33584] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,676][33580] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,670][33583] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,674][33585] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,673][33586] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,676][33582] Decorrelating experience for 0 frames... [2023-02-22 17:06:01,684][15372] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 17:06:02,726][33583] Decorrelating experience for 32 frames... [2023-02-22 17:06:02,735][33581] Decorrelating experience for 0 frames... [2023-02-22 17:06:02,739][33586] Decorrelating experience for 32 frames... [2023-02-22 17:06:03,023][33584] Decorrelating experience for 32 frames... [2023-02-22 17:06:03,031][33580] Decorrelating experience for 32 frames... [2023-02-22 17:06:03,031][33585] Decorrelating experience for 32 frames... [2023-02-22 17:06:03,119][33579] Decorrelating experience for 0 frames... 
[2023-02-22 17:06:03,920][15372] Heartbeat connected on Batcher_0 [2023-02-22 17:06:03,927][15372] Heartbeat connected on LearnerWorker_p0 [2023-02-22 17:06:03,963][15372] Heartbeat connected on InferenceWorker_p0-w0 [2023-02-22 17:06:04,066][33582] Decorrelating experience for 32 frames... [2023-02-22 17:06:04,075][33581] Decorrelating experience for 32 frames... [2023-02-22 17:06:04,273][33584] Decorrelating experience for 64 frames... [2023-02-22 17:06:04,275][33580] Decorrelating experience for 64 frames... [2023-02-22 17:06:04,277][33585] Decorrelating experience for 64 frames... [2023-02-22 17:06:04,332][33586] Decorrelating experience for 64 frames... [2023-02-22 17:06:04,466][33583] Decorrelating experience for 64 frames... [2023-02-22 17:06:05,067][33579] Decorrelating experience for 32 frames... [2023-02-22 17:06:05,196][33584] Decorrelating experience for 96 frames... [2023-02-22 17:06:05,307][15372] Heartbeat connected on RolloutWorker_w4 [2023-02-22 17:06:05,390][33582] Decorrelating experience for 64 frames... [2023-02-22 17:06:05,525][33586] Decorrelating experience for 96 frames... [2023-02-22 17:06:05,681][33583] Decorrelating experience for 96 frames... [2023-02-22 17:06:05,756][15372] Heartbeat connected on RolloutWorker_w7 [2023-02-22 17:06:05,894][15372] Heartbeat connected on RolloutWorker_w5 [2023-02-22 17:06:06,086][33580] Decorrelating experience for 96 frames... [2023-02-22 17:06:06,239][15372] Heartbeat connected on RolloutWorker_w2 [2023-02-22 17:06:06,309][33581] Decorrelating experience for 64 frames... [2023-02-22 17:06:06,684][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 17:06:07,095][33585] Decorrelating experience for 96 frames... [2023-02-22 17:06:07,590][15372] Heartbeat connected on RolloutWorker_w6 [2023-02-22 17:06:07,743][33581] Decorrelating experience for 96 frames... 
[2023-02-22 17:06:08,123][15372] Heartbeat connected on RolloutWorker_w3 [2023-02-22 17:06:10,693][33582] Decorrelating experience for 96 frames... [2023-02-22 17:06:11,444][15372] Heartbeat connected on RolloutWorker_w1 [2023-02-22 17:06:11,652][33564] Signal inference workers to stop experience collection... [2023-02-22 17:06:11,684][15372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 153.4. Samples: 1534. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 17:06:11,690][15372] Avg episode reward: [(0, '2.642')] [2023-02-22 17:06:11,712][33578] InferenceWorker_p0-w0: stopping experience collection [2023-02-22 17:06:12,084][33579] Decorrelating experience for 64 frames... [2023-02-22 17:06:12,660][33579] Decorrelating experience for 96 frames... [2023-02-22 17:06:12,740][15372] Heartbeat connected on RolloutWorker_w0 [2023-02-22 17:06:13,989][33564] Signal inference workers to resume experience collection... [2023-02-22 17:06:13,992][33578] InferenceWorker_p0-w0: resuming experience collection [2023-02-22 17:06:16,684][15372] Fps is (10 sec: 1228.8, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 12288. Throughput: 0: 192.7. Samples: 2890. Policy #0 lag: (min: 3.0, avg: 3.5, max: 5.0) [2023-02-22 17:06:16,693][15372] Avg episode reward: [(0, '2.914')] [2023-02-22 17:06:18,692][33578] Updated weights for policy 0, policy_version 11 (0.0543) [2023-02-22 17:06:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 1843.2, 300 sec: 1843.2). Total num frames: 36864. Throughput: 0: 489.4. Samples: 9788. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 17:06:21,691][15372] Avg episode reward: [(0, '3.806')] [2023-02-22 17:06:22,970][33578] Updated weights for policy 0, policy_version 21 (0.0013) [2023-02-22 17:06:26,684][15372] Fps is (10 sec: 4505.7, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 57344. Throughput: 0: 539.7. Samples: 13492. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:06:26,687][15372] Avg episode reward: [(0, '4.473')] [2023-02-22 17:06:28,784][33578] Updated weights for policy 0, policy_version 31 (0.0016) [2023-02-22 17:06:31,691][15372] Fps is (10 sec: 3684.0, 60 sec: 2457.1, 300 sec: 2457.1). Total num frames: 73728. Throughput: 0: 611.5. Samples: 18348. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:06:31,693][15372] Avg episode reward: [(0, '4.456')] [2023-02-22 17:06:35,170][33578] Updated weights for policy 0, policy_version 41 (0.0034) [2023-02-22 17:06:36,684][15372] Fps is (10 sec: 3276.8, 60 sec: 2574.6, 300 sec: 2574.6). Total num frames: 90112. Throughput: 0: 676.4. Samples: 23674. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:06:36,691][15372] Avg episode reward: [(0, '4.332')] [2023-02-22 17:06:39,215][33578] Updated weights for policy 0, policy_version 51 (0.0017) [2023-02-22 17:06:41,684][15372] Fps is (10 sec: 4098.7, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 114688. Throughput: 0: 684.9. Samples: 27398. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:06:41,690][15372] Avg episode reward: [(0, '4.407')] [2023-02-22 17:06:41,699][33564] Saving new best policy, reward=4.407! [2023-02-22 17:06:43,963][33578] Updated weights for policy 0, policy_version 61 (0.0011) [2023-02-22 17:06:46,684][15372] Fps is (10 sec: 4505.5, 60 sec: 3003.7, 300 sec: 3003.7). Total num frames: 135168. Throughput: 0: 749.2. Samples: 33716. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:06:46,689][15372] Avg episode reward: [(0, '4.645')] [2023-02-22 17:06:46,696][33564] Saving new best policy, reward=4.645! [2023-02-22 17:06:50,405][33578] Updated weights for policy 0, policy_version 71 (0.0022) [2023-02-22 17:06:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 2949.1, 300 sec: 2949.1). Total num frames: 147456. Throughput: 0: 843.3. Samples: 37948. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:06:51,694][15372] Avg episode reward: [(0, '4.604')]
[2023-02-22 17:06:56,544][33578] Updated weights for policy 0, policy_version 81 (0.0025)
[2023-02-22 17:06:56,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3053.4, 300 sec: 3053.4). Total num frames: 167936. Throughput: 0: 851.2. Samples: 39840. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
[2023-02-22 17:06:56,690][15372] Avg episode reward: [(0, '4.557')]
[2023-02-22 17:07:00,799][33578] Updated weights for policy 0, policy_version 91 (0.0018)
[2023-02-22 17:07:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3208.5, 300 sec: 3208.5). Total num frames: 192512. Throughput: 0: 983.0. Samples: 47124. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:07:01,687][15372] Avg episode reward: [(0, '4.409')]
[2023-02-22 17:07:05,510][33578] Updated weights for policy 0, policy_version 101 (0.0016)
[2023-02-22 17:07:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3213.8). Total num frames: 208896. Throughput: 0: 974.3. Samples: 53630. Policy #0 lag: (min: 1.0, avg: 2.5, max: 5.0)
[2023-02-22 17:07:06,690][15372] Avg episode reward: [(0, '4.537')]
[2023-02-22 17:07:11,690][15372] Fps is (10 sec: 3275.0, 60 sec: 3754.3, 300 sec: 3218.0). Total num frames: 225280. Throughput: 0: 941.5. Samples: 55866. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:07:11,704][15372] Avg episode reward: [(0, '4.498')]
[2023-02-22 17:07:12,076][33578] Updated weights for policy 0, policy_version 111 (0.0021)
[2023-02-22 17:07:16,684][15372] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3276.8). Total num frames: 245760. Throughput: 0: 950.9. Samples: 61134.
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:07:16,689][15372] Avg episode reward: [(0, '4.551')]
[2023-02-22 17:07:16,945][33578] Updated weights for policy 0, policy_version 121 (0.0021)
[2023-02-22 17:07:21,178][33578] Updated weights for policy 0, policy_version 131 (0.0013)
[2023-02-22 17:07:21,684][15372] Fps is (10 sec: 4508.0, 60 sec: 3891.2, 300 sec: 3379.2). Total num frames: 270336. Throughput: 0: 995.9. Samples: 68490. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:07:21,692][15372] Avg episode reward: [(0, '5.048')]
[2023-02-22 17:07:21,700][33564] Saving new best policy, reward=5.048!
[2023-02-22 17:07:26,610][33578] Updated weights for policy 0, policy_version 141 (0.0023)
[2023-02-22 17:07:26,689][15372] Fps is (10 sec: 4094.3, 60 sec: 3822.6, 300 sec: 3373.0). Total num frames: 286720. Throughput: 0: 988.5. Samples: 71884. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:07:26,691][15372] Avg episode reward: [(0, '5.004')]
[2023-02-22 17:07:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3823.4, 300 sec: 3367.8). Total num frames: 303104. Throughput: 0: 945.9. Samples: 76280. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:07:31,695][15372] Avg episode reward: [(0, '5.091')]
[2023-02-22 17:07:31,711][33564] Saving new best policy, reward=5.091!
[2023-02-22 17:07:33,331][33578] Updated weights for policy 0, policy_version 151 (0.0019)
[2023-02-22 17:07:36,684][15372] Fps is (10 sec: 3688.1, 60 sec: 3891.2, 300 sec: 3406.1). Total num frames: 323584. Throughput: 0: 984.9. Samples: 82268. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:07:36,691][15372] Avg episode reward: [(0, '5.520')]
[2023-02-22 17:07:36,784][33564] Saving new best policy, reward=5.520!
[2023-02-22 17:07:37,655][33578] Updated weights for policy 0, policy_version 161 (0.0016)
[2023-02-22 17:07:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3481.6). Total num frames: 348160. Throughput: 0: 1023.7.
Samples: 85908. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:07:41,688][15372] Avg episode reward: [(0, '5.807')]
[2023-02-22 17:07:41,701][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000170_348160.pth...
[2023-02-22 17:07:41,866][33564] Saving new best policy, reward=5.807!
[2023-02-22 17:07:42,189][33578] Updated weights for policy 0, policy_version 171 (0.0011)
[2023-02-22 17:07:46,690][15372] Fps is (10 sec: 4503.1, 60 sec: 3890.9, 300 sec: 3510.7). Total num frames: 368640. Throughput: 0: 1000.5. Samples: 92150. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:07:46,692][15372] Avg episode reward: [(0, '5.897')]
[2023-02-22 17:07:46,696][33564] Saving new best policy, reward=5.897!
[2023-02-22 17:07:47,668][33578] Updated weights for policy 0, policy_version 181 (0.0012)
[2023-02-22 17:07:51,686][15372] Fps is (10 sec: 3276.2, 60 sec: 3891.1, 300 sec: 3462.9). Total num frames: 380928. Throughput: 0: 958.1. Samples: 96746. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:07:51,691][15372] Avg episode reward: [(0, '5.891')]
[2023-02-22 17:07:53,900][33578] Updated weights for policy 0, policy_version 191 (0.0026)
[2023-02-22 17:07:56,684][15372] Fps is (10 sec: 3688.4, 60 sec: 3959.5, 300 sec: 3526.1). Total num frames: 405504. Throughput: 0: 973.6. Samples: 99672. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:07:56,687][15372] Avg episode reward: [(0, '6.260')]
[2023-02-22 17:07:56,691][33564] Saving new best policy, reward=6.260!
[2023-02-22 17:07:58,068][33578] Updated weights for policy 0, policy_version 201 (0.0017)
[2023-02-22 17:08:01,684][15372] Fps is (10 sec: 4916.2, 60 sec: 3959.5, 300 sec: 3584.0). Total num frames: 430080. Throughput: 0: 1019.7. Samples: 107022. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:08:01,691][15372] Avg episode reward: [(0, '6.733')]
[2023-02-22 17:08:01,702][33564] Saving new best policy, reward=6.733!
[2023-02-22 17:08:02,071][33578] Updated weights for policy 0, policy_version 211 (0.0031)
[2023-02-22 17:08:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3571.7). Total num frames: 446464. Throughput: 0: 986.2. Samples: 112868. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:08:06,687][15372] Avg episode reward: [(0, '7.263')]
[2023-02-22 17:08:06,689][33564] Saving new best policy, reward=7.263!
[2023-02-22 17:08:08,524][33578] Updated weights for policy 0, policy_version 222 (0.0033)
[2023-02-22 17:08:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.8, 300 sec: 3560.4). Total num frames: 462848. Throughput: 0: 960.8. Samples: 115116. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:08:11,691][15372] Avg episode reward: [(0, '7.364')]
[2023-02-22 17:08:11,702][33564] Saving new best policy, reward=7.364!
[2023-02-22 17:08:14,261][33578] Updated weights for policy 0, policy_version 232 (0.0024)
[2023-02-22 17:08:16,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3580.2). Total num frames: 483328. Throughput: 0: 993.6. Samples: 120992. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:08:16,689][15372] Avg episode reward: [(0, '7.116')]
[2023-02-22 17:08:18,282][33578] Updated weights for policy 0, policy_version 242 (0.0011)
[2023-02-22 17:08:21,684][15372] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3657.1). Total num frames: 512000. Throughput: 0: 1024.7. Samples: 128380. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:08:21,687][15372] Avg episode reward: [(0, '7.606')]
[2023-02-22 17:08:21,696][33564] Saving new best policy, reward=7.606!
[2023-02-22 17:08:22,469][33578] Updated weights for policy 0, policy_version 252 (0.0023)
[2023-02-22 17:08:26,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3959.7, 300 sec: 3615.8). Total num frames: 524288. Throughput: 0: 1005.0. Samples: 131134.
Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:08:26,690][15372] Avg episode reward: [(0, '7.962')]
[2023-02-22 17:08:26,809][33564] Saving new best policy, reward=7.962!
[2023-02-22 17:08:29,548][33578] Updated weights for policy 0, policy_version 262 (0.0025)
[2023-02-22 17:08:31,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3959.5, 300 sec: 3604.5). Total num frames: 540672. Throughput: 0: 964.9. Samples: 135566. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:08:31,687][15372] Avg episode reward: [(0, '8.406')]
[2023-02-22 17:08:31,695][33564] Saving new best policy, reward=8.406!
[2023-02-22 17:08:34,564][33578] Updated weights for policy 0, policy_version 272 (0.0026)
[2023-02-22 17:08:36,684][15372] Fps is (10 sec: 4096.2, 60 sec: 4027.7, 300 sec: 3646.8). Total num frames: 565248. Throughput: 0: 1006.2. Samples: 142024. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:08:36,688][15372] Avg episode reward: [(0, '9.005')]
[2023-02-22 17:08:36,691][33564] Saving new best policy, reward=9.005!
[2023-02-22 17:08:38,907][33578] Updated weights for policy 0, policy_version 282 (0.0019)
[2023-02-22 17:08:41,684][15372] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3686.4). Total num frames: 589824. Throughput: 0: 1022.9. Samples: 145702. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:08:41,687][15372] Avg episode reward: [(0, '9.033')]
[2023-02-22 17:08:41,695][33564] Saving new best policy, reward=9.033!
[2023-02-22 17:08:43,960][33578] Updated weights for policy 0, policy_version 292 (0.0011)
[2023-02-22 17:08:46,687][15372] Fps is (10 sec: 3685.6, 60 sec: 3891.4, 300 sec: 3649.1). Total num frames: 602112. Throughput: 0: 985.3. Samples: 151362.
Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:08:46,694][15372] Avg episode reward: [(0, '9.024')]
[2023-02-22 17:08:50,663][33578] Updated weights for policy 0, policy_version 302 (0.0025)
[2023-02-22 17:08:51,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3959.6, 300 sec: 3638.2). Total num frames: 618496. Throughput: 0: 958.6. Samples: 156004. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:08:51,691][15372] Avg episode reward: [(0, '10.025')]
[2023-02-22 17:08:51,813][33564] Saving new best policy, reward=10.025!
[2023-02-22 17:08:55,084][33578] Updated weights for policy 0, policy_version 312 (0.0013)
[2023-02-22 17:08:56,684][15372] Fps is (10 sec: 4096.9, 60 sec: 3959.5, 300 sec: 3674.7). Total num frames: 643072. Throughput: 0: 986.5. Samples: 159508. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:08:56,687][15372] Avg episode reward: [(0, '10.348')]
[2023-02-22 17:08:56,752][33564] Saving new best policy, reward=10.348!
[2023-02-22 17:08:59,443][33578] Updated weights for policy 0, policy_version 322 (0.0022)
[2023-02-22 17:09:01,684][15372] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3709.2). Total num frames: 667648. Throughput: 0: 1017.8. Samples: 166794. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0)
[2023-02-22 17:09:01,688][15372] Avg episode reward: [(0, '11.479')]
[2023-02-22 17:09:01,703][33564] Saving new best policy, reward=11.479!
[2023-02-22 17:09:04,858][33578] Updated weights for policy 0, policy_version 332 (0.0022)
[2023-02-22 17:09:06,690][15372] Fps is (10 sec: 4093.8, 60 sec: 3959.1, 300 sec: 3697.4). Total num frames: 684032. Throughput: 0: 965.8. Samples: 171846. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:09:06,692][15372] Avg episode reward: [(0, '11.202')]
[2023-02-22 17:09:11,132][33578] Updated weights for policy 0, policy_version 342 (0.0025)
[2023-02-22 17:09:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3686.4). Total num frames: 700416.
Throughput: 0: 955.6. Samples: 174136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 4.0)
[2023-02-22 17:09:11,687][15372] Avg episode reward: [(0, '11.061')]
[2023-02-22 17:09:15,601][33578] Updated weights for policy 0, policy_version 352 (0.0016)
[2023-02-22 17:09:16,684][15372] Fps is (10 sec: 4098.2, 60 sec: 4027.7, 300 sec: 3717.9). Total num frames: 724992. Throughput: 0: 1001.5. Samples: 180634. Policy #0 lag: (min: 1.0, avg: 1.7, max: 5.0)
[2023-02-22 17:09:16,692][15372] Avg episode reward: [(0, '10.790')]
[2023-02-22 17:09:19,837][33578] Updated weights for policy 0, policy_version 362 (0.0015)
[2023-02-22 17:09:21,684][15372] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3747.8). Total num frames: 749568. Throughput: 0: 1019.1. Samples: 187882. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:09:21,691][15372] Avg episode reward: [(0, '10.790')]
[2023-02-22 17:09:25,519][33578] Updated weights for policy 0, policy_version 372 (0.0012)
[2023-02-22 17:09:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3716.4). Total num frames: 761856. Throughput: 0: 986.4. Samples: 190088. Policy #0 lag: (min: 0.0, avg: 0.8, max: 4.0)
[2023-02-22 17:09:26,687][15372] Avg episode reward: [(0, '11.258')]
[2023-02-22 17:09:31,675][33578] Updated weights for policy 0, policy_version 382 (0.0030)
[2023-02-22 17:09:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3725.4). Total num frames: 782336. Throughput: 0: 965.1. Samples: 194788. Policy #0 lag: (min: 1.0, avg: 1.7, max: 5.0)
[2023-02-22 17:09:31,686][15372] Avg episode reward: [(0, '11.153')]
[2023-02-22 17:09:35,890][33578] Updated weights for policy 0, policy_version 392 (0.0014)
[2023-02-22 17:09:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3753.1). Total num frames: 806912. Throughput: 0: 1021.8. Samples: 201984.
Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:09:36,686][15372] Avg episode reward: [(0, '13.570')]
[2023-02-22 17:09:36,695][33564] Saving new best policy, reward=13.570!
[2023-02-22 17:09:40,091][33578] Updated weights for policy 0, policy_version 402 (0.0011)
[2023-02-22 17:09:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3760.9). Total num frames: 827392. Throughput: 0: 1024.0. Samples: 205590. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:09:41,691][15372] Avg episode reward: [(0, '14.267')]
[2023-02-22 17:09:41,712][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000404_827392.pth...
[2023-02-22 17:09:41,905][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000000_0.pth
[2023-02-22 17:09:41,923][33564] Saving new best policy, reward=14.267!
[2023-02-22 17:09:46,571][33578] Updated weights for policy 0, policy_version 412 (0.0014)
[2023-02-22 17:09:46,685][15372] Fps is (10 sec: 3686.1, 60 sec: 4027.8, 300 sec: 3750.1). Total num frames: 843776. Throughput: 0: 973.0. Samples: 210578. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:09:46,687][15372] Avg episode reward: [(0, '14.177')]
[2023-02-22 17:09:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3739.8). Total num frames: 860160. Throughput: 0: 974.3. Samples: 215684. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:09:51,690][15372] Avg episode reward: [(0, '13.873')]
[2023-02-22 17:09:52,322][33578] Updated weights for policy 0, policy_version 422 (0.0011)
[2023-02-22 17:09:56,388][33578] Updated weights for policy 0, policy_version 432 (0.0011)
[2023-02-22 17:09:56,684][15372] Fps is (10 sec: 4096.2, 60 sec: 4027.7, 300 sec: 3764.8). Total num frames: 884736. Throughput: 0: 1005.7. Samples: 219394.
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:09:56,692][15372] Avg episode reward: [(0, '12.604')]
[2023-02-22 17:10:00,528][33578] Updated weights for policy 0, policy_version 442 (0.0013)
[2023-02-22 17:10:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3771.7). Total num frames: 905216. Throughput: 0: 1025.5. Samples: 226782. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:10:01,691][15372] Avg episode reward: [(0, '13.080')]
[2023-02-22 17:10:06,688][15372] Fps is (10 sec: 3684.9, 60 sec: 3959.5, 300 sec: 3761.6). Total num frames: 921600. Throughput: 0: 966.4. Samples: 231374. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:10:06,695][15372] Avg episode reward: [(0, '13.104')]
[2023-02-22 17:10:07,323][33578] Updated weights for policy 0, policy_version 452 (0.0011)
[2023-02-22 17:10:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3768.3). Total num frames: 942080. Throughput: 0: 969.2. Samples: 233700. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:10:11,695][15372] Avg episode reward: [(0, '14.213')]
[2023-02-22 17:10:12,465][33578] Updated weights for policy 0, policy_version 462 (0.0018)
[2023-02-22 17:10:16,616][33578] Updated weights for policy 0, policy_version 472 (0.0014)
[2023-02-22 17:10:16,684][15372] Fps is (10 sec: 4507.5, 60 sec: 4027.7, 300 sec: 3790.8). Total num frames: 966656. Throughput: 0: 1025.9. Samples: 240952. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:10:16,691][15372] Avg episode reward: [(0, '15.823')]
[2023-02-22 17:10:16,694][33564] Saving new best policy, reward=15.823!
[2023-02-22 17:10:21,246][33578] Updated weights for policy 0, policy_version 482 (0.0013)
[2023-02-22 17:10:21,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3796.7). Total num frames: 987136. Throughput: 0: 1013.9. Samples: 247610.
Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:10:21,687][15372] Avg episode reward: [(0, '16.600')]
[2023-02-22 17:10:21,697][33564] Saving new best policy, reward=16.600!
[2023-02-22 17:10:26,688][15372] Fps is (10 sec: 3684.9, 60 sec: 4027.5, 300 sec: 3786.8). Total num frames: 1003520. Throughput: 0: 984.2. Samples: 249882. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
[2023-02-22 17:10:26,691][15372] Avg episode reward: [(0, '16.526')]
[2023-02-22 17:10:27,995][33578] Updated weights for policy 0, policy_version 492 (0.0023)
[2023-02-22 17:10:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3777.4). Total num frames: 1019904. Throughput: 0: 982.4. Samples: 254786. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:10:31,690][15372] Avg episode reward: [(0, '15.454')]
[2023-02-22 17:10:32,690][33578] Updated weights for policy 0, policy_version 502 (0.0020)
[2023-02-22 17:10:36,684][15372] Fps is (10 sec: 4097.7, 60 sec: 3959.5, 300 sec: 3798.1). Total num frames: 1044480. Throughput: 0: 1033.8. Samples: 262204. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:10:36,689][15372] Avg episode reward: [(0, '15.464')]
[2023-02-22 17:10:36,861][33578] Updated weights for policy 0, policy_version 512 (0.0020)
[2023-02-22 17:10:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3803.4). Total num frames: 1064960. Throughput: 0: 1033.6. Samples: 265906. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:10:41,687][15372] Avg episode reward: [(0, '16.200')]
[2023-02-22 17:10:42,045][33578] Updated weights for policy 0, policy_version 522 (0.0012)
[2023-02-22 17:10:46,690][15372] Fps is (10 sec: 3684.4, 60 sec: 3959.1, 300 sec: 3794.1). Total num frames: 1081344. Throughput: 0: 975.0. Samples: 270662.
Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:10:46,692][15372] Avg episode reward: [(0, '16.389')]
[2023-02-22 17:10:48,387][33578] Updated weights for policy 0, policy_version 532 (0.0028)
[2023-02-22 17:10:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3799.4). Total num frames: 1101824. Throughput: 0: 996.6. Samples: 276216. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:10:51,690][15372] Avg episode reward: [(0, '16.637')]
[2023-02-22 17:10:51,699][33564] Saving new best policy, reward=16.637!
[2023-02-22 17:10:52,929][33578] Updated weights for policy 0, policy_version 542 (0.0018)
[2023-02-22 17:10:56,685][15372] Fps is (10 sec: 4507.9, 60 sec: 4027.7, 300 sec: 3818.3). Total num frames: 1126400. Throughput: 0: 1024.3. Samples: 279794. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:10:56,690][15372] Avg episode reward: [(0, '16.887')]
[2023-02-22 17:10:56,695][33564] Saving new best policy, reward=16.887!
[2023-02-22 17:10:57,315][33578] Updated weights for policy 0, policy_version 552 (0.0020)
[2023-02-22 17:11:01,685][15372] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 1146880. Throughput: 0: 1016.5. Samples: 286694. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:11:01,692][15372] Avg episode reward: [(0, '16.588')]
[2023-02-22 17:11:02,675][33578] Updated weights for policy 0, policy_version 562 (0.0011)
[2023-02-22 17:11:06,684][15372] Fps is (10 sec: 3686.5, 60 sec: 4028.0, 300 sec: 3943.3). Total num frames: 1163264. Throughput: 0: 972.4. Samples: 291370. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:11:06,695][15372] Avg episode reward: [(0, '15.936')]
[2023-02-22 17:11:09,150][33578] Updated weights for policy 0, policy_version 572 (0.0022)
[2023-02-22 17:11:11,684][15372] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1183744. Throughput: 0: 979.2. Samples: 293940.
Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:11:11,692][15372] Avg episode reward: [(0, '16.113')]
[2023-02-22 17:11:13,237][33578] Updated weights for policy 0, policy_version 582 (0.0013)
[2023-02-22 17:11:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1208320. Throughput: 0: 1035.4. Samples: 301378. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:11:16,692][15372] Avg episode reward: [(0, '17.125')]
[2023-02-22 17:11:16,696][33564] Saving new best policy, reward=17.125!
[2023-02-22 17:11:17,518][33578] Updated weights for policy 0, policy_version 592 (0.0033)
[2023-02-22 17:11:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1224704. Throughput: 0: 1005.3. Samples: 307442. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:11:21,687][15372] Avg episode reward: [(0, '16.858')]
[2023-02-22 17:11:23,408][33578] Updated weights for policy 0, policy_version 602 (0.0013)
[2023-02-22 17:11:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.7, 300 sec: 3957.2). Total num frames: 1241088. Throughput: 0: 975.4. Samples: 309798. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:11:26,694][15372] Avg episode reward: [(0, '16.757')]
[2023-02-22 17:11:29,341][33578] Updated weights for policy 0, policy_version 612 (0.0014)
[2023-02-22 17:11:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1261568. Throughput: 0: 993.2. Samples: 315352. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:11:31,691][15372] Avg episode reward: [(0, '18.026')]
[2023-02-22 17:11:31,704][33564] Saving new best policy, reward=18.026!
[2023-02-22 17:11:33,523][33578] Updated weights for policy 0, policy_version 622 (0.0011)
[2023-02-22 17:11:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1286144. Throughput: 0: 1034.0. Samples: 322746.
Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:11:36,686][15372] Avg episode reward: [(0, '18.198')]
[2023-02-22 17:11:36,691][33564] Saving new best policy, reward=18.198!
[2023-02-22 17:11:37,622][33578] Updated weights for policy 0, policy_version 632 (0.0016)
[2023-02-22 17:11:41,690][15372] Fps is (10 sec: 4503.2, 60 sec: 4027.4, 300 sec: 3971.0). Total num frames: 1306624. Throughput: 0: 1024.8. Samples: 325914. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:11:41,698][15372] Avg episode reward: [(0, '18.897')]
[2023-02-22 17:11:41,713][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_1306624.pth...
[2023-02-22 17:11:41,846][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000170_348160.pth
[2023-02-22 17:11:41,865][33564] Saving new best policy, reward=18.897!
[2023-02-22 17:11:43,966][33578] Updated weights for policy 0, policy_version 642 (0.0024)
[2023-02-22 17:11:46,685][15372] Fps is (10 sec: 3276.6, 60 sec: 3959.8, 300 sec: 3971.0). Total num frames: 1318912. Throughput: 0: 970.3. Samples: 330360. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:11:46,689][15372] Avg episode reward: [(0, '18.406')]
[2023-02-22 17:11:49,752][33578] Updated weights for policy 0, policy_version 652 (0.0017)
[2023-02-22 17:11:51,684][15372] Fps is (10 sec: 3688.3, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1343488. Throughput: 0: 1003.6. Samples: 336534. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:11:51,687][15372] Avg episode reward: [(0, '18.609')]
[2023-02-22 17:11:53,939][33578] Updated weights for policy 0, policy_version 662 (0.0013)
[2023-02-22 17:11:56,684][15372] Fps is (10 sec: 4505.9, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 1363968. Throughput: 0: 1024.3. Samples: 340032.
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:11:56,690][15372] Avg episode reward: [(0, '17.829')]
[2023-02-22 17:12:00,779][33578] Updated weights for policy 0, policy_version 672 (0.0057)
[2023-02-22 17:12:01,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3957.2). Total num frames: 1376256. Throughput: 0: 956.0. Samples: 344396. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:12:01,687][15372] Avg episode reward: [(0, '18.775')]
[2023-02-22 17:12:06,686][15372] Fps is (10 sec: 2457.2, 60 sec: 3754.6, 300 sec: 3943.3). Total num frames: 1388544. Throughput: 0: 905.1. Samples: 348172. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:12:06,688][15372] Avg episode reward: [(0, '18.780')]
[2023-02-22 17:12:08,407][33578] Updated weights for policy 0, policy_version 682 (0.0030)
[2023-02-22 17:12:11,685][15372] Fps is (10 sec: 3276.5, 60 sec: 3754.6, 300 sec: 3943.3). Total num frames: 1409024. Throughput: 0: 906.2. Samples: 350580. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:12:11,688][15372] Avg episode reward: [(0, '18.758')]
[2023-02-22 17:12:13,302][33578] Updated weights for policy 0, policy_version 692 (0.0016)
[2023-02-22 17:12:16,684][15372] Fps is (10 sec: 4506.2, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 1433600. Throughput: 0: 940.9. Samples: 357694. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:12:16,692][15372] Avg episode reward: [(0, '19.010')]
[2023-02-22 17:12:16,695][33564] Saving new best policy, reward=19.010!
[2023-02-22 17:12:17,434][33578] Updated weights for policy 0, policy_version 702 (0.0013)
[2023-02-22 17:12:21,684][15372] Fps is (10 sec: 4506.1, 60 sec: 3822.9, 300 sec: 3957.2). Total num frames: 1454080. Throughput: 0: 927.4. Samples: 364480.
Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:12:21,689][15372] Avg episode reward: [(0, '18.890')]
[2023-02-22 17:12:22,289][33578] Updated weights for policy 0, policy_version 712 (0.0019)
[2023-02-22 17:12:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3957.2). Total num frames: 1470464. Throughput: 0: 908.1. Samples: 366774. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:12:26,689][15372] Avg episode reward: [(0, '18.978')]
[2023-02-22 17:12:29,061][33578] Updated weights for policy 0, policy_version 722 (0.0012)
[2023-02-22 17:12:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 1486848. Throughput: 0: 919.7. Samples: 371748. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:12:31,691][15372] Avg episode reward: [(0, '18.963')]
[2023-02-22 17:12:33,464][33578] Updated weights for policy 0, policy_version 732 (0.0011)
[2023-02-22 17:12:36,686][15372] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3943.2). Total num frames: 1511424. Throughput: 0: 947.8. Samples: 379186. Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0)
[2023-02-22 17:12:36,688][15372] Avg episode reward: [(0, '18.823')]
[2023-02-22 17:12:37,664][33578] Updated weights for policy 0, policy_version 742 (0.0016)
[2023-02-22 17:12:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3755.0, 300 sec: 3943.3). Total num frames: 1531904. Throughput: 0: 950.4. Samples: 382800. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:12:41,692][15372] Avg episode reward: [(0, '18.223')]
[2023-02-22 17:12:43,226][33578] Updated weights for policy 0, policy_version 752 (0.0026)
[2023-02-22 17:12:46,687][15372] Fps is (10 sec: 3686.0, 60 sec: 3822.8, 300 sec: 3957.1). Total num frames: 1548288. Throughput: 0: 957.5. Samples: 387484.
Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0)
[2023-02-22 17:12:46,689][15372] Avg episode reward: [(0, '18.361')]
[2023-02-22 17:12:49,691][33578] Updated weights for policy 0, policy_version 762 (0.0018)
[2023-02-22 17:12:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 1568768. Throughput: 0: 998.0. Samples: 393082. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:12:51,692][15372] Avg episode reward: [(0, '17.312')]
[2023-02-22 17:12:53,818][33578] Updated weights for policy 0, policy_version 772 (0.0015)
[2023-02-22 17:12:56,684][15372] Fps is (10 sec: 4506.8, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 1593344. Throughput: 0: 1026.6. Samples: 396774. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:12:56,689][15372] Avg episode reward: [(0, '19.149')]
[2023-02-22 17:12:56,691][33564] Saving new best policy, reward=19.149!
[2023-02-22 17:12:57,922][33578] Updated weights for policy 0, policy_version 782 (0.0012)
[2023-02-22 17:13:01,687][15372] Fps is (10 sec: 4504.2, 60 sec: 3959.3, 300 sec: 3957.1). Total num frames: 1613824. Throughput: 0: 1018.0. Samples: 403508. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:13:01,690][15372] Avg episode reward: [(0, '20.361')]
[2023-02-22 17:13:01,701][33564] Saving new best policy, reward=20.361!
[2023-02-22 17:13:03,934][33578] Updated weights for policy 0, policy_version 792 (0.0019)
[2023-02-22 17:13:06,688][15372] Fps is (10 sec: 3685.0, 60 sec: 4027.6, 300 sec: 3957.1). Total num frames: 1630208. Throughput: 0: 968.5. Samples: 408068. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:13:06,699][15372] Avg episode reward: [(0, '21.555')]
[2023-02-22 17:13:06,706][33564] Saving new best policy, reward=21.555!
[2023-02-22 17:13:10,067][33578] Updated weights for policy 0, policy_version 802 (0.0016)
[2023-02-22 17:13:11,684][15372] Fps is (10 sec: 3277.8, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1646592.
Throughput: 0: 971.0. Samples: 410468. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:13:11,687][15372] Avg episode reward: [(0, '21.031')]
[2023-02-22 17:13:14,380][33578] Updated weights for policy 0, policy_version 812 (0.0011)
[2023-02-22 17:13:16,684][15372] Fps is (10 sec: 4097.5, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 1671168. Throughput: 0: 1023.1. Samples: 417788. Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0)
[2023-02-22 17:13:16,686][15372] Avg episode reward: [(0, '19.696')]
[2023-02-22 17:13:18,556][33578] Updated weights for policy 0, policy_version 822 (0.0011)
[2023-02-22 17:13:21,685][15372] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1691648. Throughput: 0: 997.5. Samples: 424074. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:13:21,687][15372] Avg episode reward: [(0, '19.688')]
[2023-02-22 17:13:24,510][33578] Updated weights for policy 0, policy_version 832 (0.0011)
[2023-02-22 17:13:26,684][15372] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 1708032. Throughput: 0: 968.5. Samples: 426384. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:13:26,693][15372] Avg episode reward: [(0, '19.578')]
[2023-02-22 17:13:30,336][33578] Updated weights for policy 0, policy_version 842 (0.0030)
[2023-02-22 17:13:31,684][15372] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 1728512. Throughput: 0: 985.3. Samples: 431822. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:13:31,690][15372] Avg episode reward: [(0, '20.592')]
[2023-02-22 17:13:34,480][33578] Updated weights for policy 0, policy_version 852 (0.0012)
[2023-02-22 17:13:36,684][15372] Fps is (10 sec: 4505.7, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 1753088. Throughput: 0: 1026.4. Samples: 439272.
Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:13:36,687][15372] Avg episode reward: [(0, '22.971')]
[2023-02-22 17:13:36,695][33564] Saving new best policy, reward=22.971!
[2023-02-22 17:13:38,679][33578] Updated weights for policy 0, policy_version 862 (0.0016)
[2023-02-22 17:13:41,686][15372] Fps is (10 sec: 4504.9, 60 sec: 4027.6, 300 sec: 3971.0). Total num frames: 1773568. Throughput: 0: 1016.5. Samples: 442516. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:13:41,692][15372] Avg episode reward: [(0, '23.735')]
[2023-02-22 17:13:41,704][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000866_1773568.pth...
[2023-02-22 17:13:41,836][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000404_827392.pth
[2023-02-22 17:13:41,849][33564] Saving new best policy, reward=23.735!
[2023-02-22 17:13:45,372][33578] Updated weights for policy 0, policy_version 872 (0.0013)
[2023-02-22 17:13:46,689][15372] Fps is (10 sec: 3275.3, 60 sec: 3959.3, 300 sec: 3957.1). Total num frames: 1785856. Throughput: 0: 967.2. Samples: 447034. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0)
[2023-02-22 17:13:46,691][15372] Avg episode reward: [(0, '22.771')]
[2023-02-22 17:13:50,601][33578] Updated weights for policy 0, policy_version 882 (0.0027)
[2023-02-22 17:13:51,684][15372] Fps is (10 sec: 3686.9, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1810432. Throughput: 0: 1003.5. Samples: 453220. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:13:51,687][15372] Avg episode reward: [(0, '21.068')]
[2023-02-22 17:13:54,742][33578] Updated weights for policy 0, policy_version 892 (0.0024)
[2023-02-22 17:13:56,684][15372] Fps is (10 sec: 4917.5, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1835008. Throughput: 0: 1033.8. Samples: 456990.
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:13:56,692][15372] Avg episode reward: [(0, '18.500')] [2023-02-22 17:13:59,297][33578] Updated weights for policy 0, policy_version 902 (0.0024) [2023-02-22 17:14:01,692][15372] Fps is (10 sec: 4092.9, 60 sec: 3959.2, 300 sec: 3957.1). Total num frames: 1851392. Throughput: 0: 1012.8. Samples: 463372. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:14:01,698][15372] Avg episode reward: [(0, '18.703')] [2023-02-22 17:14:05,936][33578] Updated weights for policy 0, policy_version 912 (0.0032) [2023-02-22 17:14:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.7, 300 sec: 3957.2). Total num frames: 1867776. Throughput: 0: 975.2. Samples: 467960. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:14:06,694][15372] Avg episode reward: [(0, '18.816')] [2023-02-22 17:14:10,824][33578] Updated weights for policy 0, policy_version 922 (0.0031) [2023-02-22 17:14:11,684][15372] Fps is (10 sec: 4099.1, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1892352. Throughput: 0: 994.0. Samples: 471112. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0) [2023-02-22 17:14:11,693][15372] Avg episode reward: [(0, '21.283')] [2023-02-22 17:14:15,047][33578] Updated weights for policy 0, policy_version 932 (0.0016) [2023-02-22 17:14:16,684][15372] Fps is (10 sec: 4915.3, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1916928. Throughput: 0: 1034.9. Samples: 478392. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:14:16,686][15372] Avg episode reward: [(0, '21.564')] [2023-02-22 17:14:19,788][33578] Updated weights for policy 0, policy_version 942 (0.0018) [2023-02-22 17:14:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1933312. Throughput: 0: 994.0. Samples: 484002. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:14:21,687][15372] Avg episode reward: [(0, '22.339')] [2023-02-22 17:14:26,685][15372] Fps is (10 sec: 2866.9, 60 sec: 3959.4, 300 sec: 3943.3). Total num frames: 1945600. Throughput: 0: 970.7. Samples: 486198. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-22 17:14:26,690][15372] Avg episode reward: [(0, '22.581')] [2023-02-22 17:14:26,830][33578] Updated weights for policy 0, policy_version 952 (0.0022) [2023-02-22 17:14:31,355][33578] Updated weights for policy 0, policy_version 962 (0.0011) [2023-02-22 17:14:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 1970176. Throughput: 0: 1000.7. Samples: 492062. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:14:31,688][15372] Avg episode reward: [(0, '22.724')] [2023-02-22 17:14:35,542][33578] Updated weights for policy 0, policy_version 972 (0.0019) [2023-02-22 17:14:36,684][15372] Fps is (10 sec: 4915.7, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1994752. Throughput: 0: 1029.1. Samples: 499530. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0) [2023-02-22 17:14:36,686][15372] Avg episode reward: [(0, '21.050')] [2023-02-22 17:14:40,861][33578] Updated weights for policy 0, policy_version 982 (0.0011) [2023-02-22 17:14:41,692][15372] Fps is (10 sec: 4092.9, 60 sec: 3959.1, 300 sec: 3957.1). Total num frames: 2011136. Throughput: 0: 1006.8. Samples: 502304. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:14:41,699][15372] Avg episode reward: [(0, '22.087')] [2023-02-22 17:14:46,685][15372] Fps is (10 sec: 3276.6, 60 sec: 4028.0, 300 sec: 3957.1). Total num frames: 2027520. Throughput: 0: 965.7. Samples: 506824. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-22 17:14:46,690][15372] Avg episode reward: [(0, '22.835')] [2023-02-22 17:14:47,573][33578] Updated weights for policy 0, policy_version 992 (0.0031) [2023-02-22 17:14:51,684][15372] Fps is (10 sec: 3689.2, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 2048000. Throughput: 0: 1007.5. Samples: 513296. Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0) [2023-02-22 17:14:51,694][15372] Avg episode reward: [(0, '23.456')] [2023-02-22 17:14:51,763][33578] Updated weights for policy 0, policy_version 1002 (0.0023) [2023-02-22 17:14:55,891][33578] Updated weights for policy 0, policy_version 1012 (0.0015) [2023-02-22 17:14:56,684][15372] Fps is (10 sec: 4505.9, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2072576. Throughput: 0: 1017.2. Samples: 516888. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:14:56,689][15372] Avg episode reward: [(0, '22.993')] [2023-02-22 17:15:01,633][33578] Updated weights for policy 0, policy_version 1022 (0.0026) [2023-02-22 17:15:01,686][15372] Fps is (10 sec: 4504.9, 60 sec: 4028.1, 300 sec: 3971.1). Total num frames: 2093056. Throughput: 0: 989.4. Samples: 522916. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:15:01,693][15372] Avg episode reward: [(0, '22.790')] [2023-02-22 17:15:06,686][15372] Fps is (10 sec: 3276.1, 60 sec: 3959.3, 300 sec: 3943.2). Total num frames: 2105344. Throughput: 0: 969.9. Samples: 527648. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:15:06,691][15372] Avg episode reward: [(0, '24.057')] [2023-02-22 17:15:06,771][33564] Saving new best policy, reward=24.057! [2023-02-22 17:15:07,741][33578] Updated weights for policy 0, policy_version 1032 (0.0039) [2023-02-22 17:15:11,684][15372] Fps is (10 sec: 3687.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 2129920. Throughput: 0: 996.6. Samples: 531046. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:15:11,691][15372] Avg episode reward: [(0, '22.827')] [2023-02-22 17:15:11,853][33578] Updated weights for policy 0, policy_version 1042 (0.0011) [2023-02-22 17:15:15,998][33578] Updated weights for policy 0, policy_version 1052 (0.0012) [2023-02-22 17:15:16,684][15372] Fps is (10 sec: 4916.3, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2154496. Throughput: 0: 1034.0. Samples: 538594. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:15:16,687][15372] Avg episode reward: [(0, '22.388')] [2023-02-22 17:15:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2170880. Throughput: 0: 987.1. Samples: 543950. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0) [2023-02-22 17:15:21,691][15372] Avg episode reward: [(0, '22.954')] [2023-02-22 17:15:22,001][33578] Updated weights for policy 0, policy_version 1062 (0.0012) [2023-02-22 17:15:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 4027.8, 300 sec: 3957.2). Total num frames: 2187264. Throughput: 0: 977.9. Samples: 546302. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:15:26,687][15372] Avg episode reward: [(0, '23.099')] [2023-02-22 17:15:27,889][33578] Updated weights for policy 0, policy_version 1072 (0.0026) [2023-02-22 17:15:31,684][15372] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 2211840. Throughput: 0: 1020.4. Samples: 552742. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:15:31,687][15372] Avg episode reward: [(0, '23.342')] [2023-02-22 17:15:31,963][33578] Updated weights for policy 0, policy_version 1082 (0.0013) [2023-02-22 17:15:36,053][33578] Updated weights for policy 0, policy_version 1092 (0.0011) [2023-02-22 17:15:36,684][15372] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2236416. Throughput: 0: 1043.2. Samples: 560238. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:15:36,687][15372] Avg episode reward: [(0, '23.238')] [2023-02-22 17:15:41,684][15372] Fps is (10 sec: 4095.9, 60 sec: 4028.2, 300 sec: 3971.1). Total num frames: 2252800. Throughput: 0: 1016.1. Samples: 562614. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:15:41,687][15372] Avg episode reward: [(0, '23.569')] [2023-02-22 17:15:41,709][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_2252800.pth... [2023-02-22 17:15:41,846][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_1306624.pth [2023-02-22 17:15:42,192][33578] Updated weights for policy 0, policy_version 1102 (0.0015) [2023-02-22 17:15:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 4027.8, 300 sec: 3957.2). Total num frames: 2269184. Throughput: 0: 981.9. Samples: 567100. Policy #0 lag: (min: 1.0, avg: 1.7, max: 3.0) [2023-02-22 17:15:46,692][15372] Avg episode reward: [(0, '24.314')] [2023-02-22 17:15:46,694][33564] Saving new best policy, reward=24.314! [2023-02-22 17:15:48,075][33578] Updated weights for policy 0, policy_version 1112 (0.0027) [2023-02-22 17:15:51,684][15372] Fps is (10 sec: 4096.1, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 2293760. Throughput: 0: 1030.1. Samples: 573998. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:15:51,687][15372] Avg episode reward: [(0, '25.187')] [2023-02-22 17:15:51,696][33564] Saving new best policy, reward=25.187! [2023-02-22 17:15:52,360][33578] Updated weights for policy 0, policy_version 1122 (0.0013) [2023-02-22 17:15:56,686][15372] Fps is (10 sec: 4504.9, 60 sec: 4027.6, 300 sec: 3957.1). Total num frames: 2314240. Throughput: 0: 1032.7. Samples: 577518. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:15:56,690][15372] Avg episode reward: [(0, '25.070')] [2023-02-22 17:15:56,755][33578] Updated weights for policy 0, policy_version 1132 (0.0012) [2023-02-22 17:16:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3957.2). Total num frames: 2330624. Throughput: 0: 984.8. Samples: 582912. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:16:01,688][15372] Avg episode reward: [(0, '24.964')] [2023-02-22 17:16:03,588][33578] Updated weights for policy 0, policy_version 1142 (0.0025) [2023-02-22 17:16:06,684][15372] Fps is (10 sec: 3277.3, 60 sec: 4027.9, 300 sec: 3943.3). Total num frames: 2347008. Throughput: 0: 976.7. Samples: 587902. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:16:06,691][15372] Avg episode reward: [(0, '25.985')] [2023-02-22 17:16:06,695][33564] Saving new best policy, reward=25.985! [2023-02-22 17:16:08,628][33578] Updated weights for policy 0, policy_version 1152 (0.0032) [2023-02-22 17:16:11,684][15372] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 2371584. Throughput: 0: 1004.5. Samples: 591504. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:16:11,690][15372] Avg episode reward: [(0, '27.201')] [2023-02-22 17:16:11,700][33564] Saving new best policy, reward=27.201! [2023-02-22 17:16:12,760][33578] Updated weights for policy 0, policy_version 1162 (0.0027) [2023-02-22 17:16:16,684][15372] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2396160. Throughput: 0: 1026.0. Samples: 598912. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:16:16,692][15372] Avg episode reward: [(0, '26.881')] [2023-02-22 17:16:17,765][33578] Updated weights for policy 0, policy_version 1172 (0.0014) [2023-02-22 17:16:21,685][15372] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3957.1). Total num frames: 2408448. Throughput: 0: 963.1. Samples: 603580. 
Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:16:21,692][15372] Avg episode reward: [(0, '26.640')] [2023-02-22 17:16:24,298][33578] Updated weights for policy 0, policy_version 1182 (0.0015) [2023-02-22 17:16:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 2428928. Throughput: 0: 960.0. Samples: 605812. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:16:26,693][15372] Avg episode reward: [(0, '26.469')] [2023-02-22 17:16:28,868][33578] Updated weights for policy 0, policy_version 1192 (0.0030) [2023-02-22 17:16:31,685][15372] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3957.1). Total num frames: 2453504. Throughput: 0: 1015.8. Samples: 612812. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:16:31,687][15372] Avg episode reward: [(0, '24.476')] [2023-02-22 17:16:33,049][33578] Updated weights for policy 0, policy_version 1202 (0.0018) [2023-02-22 17:16:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2473984. Throughput: 0: 1018.0. Samples: 619808. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:16:36,688][15372] Avg episode reward: [(0, '23.306')] [2023-02-22 17:16:38,202][33578] Updated weights for policy 0, policy_version 1212 (0.0016) [2023-02-22 17:16:41,684][15372] Fps is (10 sec: 3686.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2490368. Throughput: 0: 990.3. Samples: 622080. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:16:41,687][15372] Avg episode reward: [(0, '21.766')] [2023-02-22 17:16:44,910][33578] Updated weights for policy 0, policy_version 1222 (0.0015) [2023-02-22 17:16:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 2506752. Throughput: 0: 976.0. Samples: 626830. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:16:46,687][15372] Avg episode reward: [(0, '21.502')] [2023-02-22 17:16:49,303][33578] Updated weights for policy 0, policy_version 1232 (0.0022) [2023-02-22 17:16:51,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2531328. Throughput: 0: 1029.8. Samples: 634244. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) [2023-02-22 17:16:51,687][15372] Avg episode reward: [(0, '21.624')] [2023-02-22 17:16:53,491][33578] Updated weights for policy 0, policy_version 1242 (0.0014) [2023-02-22 17:16:56,690][15372] Fps is (10 sec: 4912.5, 60 sec: 4027.5, 300 sec: 3998.7). Total num frames: 2555904. Throughput: 0: 1029.9. Samples: 637854. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:16:56,693][15372] Avg episode reward: [(0, '22.747')] [2023-02-22 17:16:58,865][33578] Updated weights for policy 0, policy_version 1252 (0.0033) [2023-02-22 17:17:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2572288. Throughput: 0: 975.2. Samples: 642796. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0) [2023-02-22 17:17:01,688][15372] Avg episode reward: [(0, '22.664')] [2023-02-22 17:17:05,277][33578] Updated weights for policy 0, policy_version 1262 (0.0030) [2023-02-22 17:17:06,684][15372] Fps is (10 sec: 3278.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2588672. Throughput: 0: 994.4. Samples: 648328. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:17:06,692][15372] Avg episode reward: [(0, '23.825')] [2023-02-22 17:17:09,433][33578] Updated weights for policy 0, policy_version 1272 (0.0011) [2023-02-22 17:17:11,684][15372] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2613248. Throughput: 0: 1025.1. Samples: 651942. 
Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:17:11,687][15372] Avg episode reward: [(0, '24.858')] [2023-02-22 17:17:13,590][33578] Updated weights for policy 0, policy_version 1282 (0.0014) [2023-02-22 17:17:16,686][15372] Fps is (10 sec: 4504.9, 60 sec: 3959.4, 300 sec: 3998.8). Total num frames: 2633728. Throughput: 0: 1024.2. Samples: 658900. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-22 17:17:16,689][15372] Avg episode reward: [(0, '23.450')] [2023-02-22 17:17:19,661][33578] Updated weights for policy 0, policy_version 1292 (0.0030) [2023-02-22 17:17:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 3998.8). Total num frames: 2650112. Throughput: 0: 972.1. Samples: 663554. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0) [2023-02-22 17:17:21,688][15372] Avg episode reward: [(0, '22.368')] [2023-02-22 17:17:25,423][33578] Updated weights for policy 0, policy_version 1302 (0.0033) [2023-02-22 17:17:26,684][15372] Fps is (10 sec: 3687.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2670592. Throughput: 0: 978.0. Samples: 666092. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:17:26,689][15372] Avg episode reward: [(0, '23.762')] [2023-02-22 17:17:29,627][33578] Updated weights for policy 0, policy_version 1312 (0.0019) [2023-02-22 17:17:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 4012.7). Total num frames: 2695168. Throughput: 0: 1035.6. Samples: 673434. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:17:31,689][15372] Avg episode reward: [(0, '23.113')] [2023-02-22 17:17:33,755][33578] Updated weights for policy 0, policy_version 1322 (0.0028) [2023-02-22 17:17:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2715648. Throughput: 0: 1011.7. Samples: 679772. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:17:36,687][15372] Avg episode reward: [(0, '22.241')] [2023-02-22 17:17:39,964][33578] Updated weights for policy 0, policy_version 1332 (0.0017) [2023-02-22 17:17:41,685][15372] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2732032. Throughput: 0: 984.0. Samples: 682130. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:17:41,691][15372] Avg episode reward: [(0, '22.336')] [2023-02-22 17:17:41,704][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001334_2732032.pth... [2023-02-22 17:17:41,826][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000866_1773568.pth [2023-02-22 17:17:45,605][33578] Updated weights for policy 0, policy_version 1342 (0.0012) [2023-02-22 17:17:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4012.7). Total num frames: 2752512. Throughput: 0: 994.8. Samples: 687564. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:17:46,688][15372] Avg episode reward: [(0, '22.634')] [2023-02-22 17:17:49,649][33578] Updated weights for policy 0, policy_version 1352 (0.0018) [2023-02-22 17:17:51,684][15372] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 4012.7). Total num frames: 2777088. Throughput: 0: 1037.4. Samples: 695012. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:17:51,687][15372] Avg episode reward: [(0, '22.482')] [2023-02-22 17:17:53,836][33578] Updated weights for policy 0, policy_version 1362 (0.0011) [2023-02-22 17:17:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.8, 300 sec: 3998.9). Total num frames: 2793472. Throughput: 0: 1031.2. Samples: 698348. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:17:56,688][15372] Avg episode reward: [(0, '22.018')] [2023-02-22 17:18:00,888][33578] Updated weights for policy 0, policy_version 1372 (0.0017) [2023-02-22 17:18:01,685][15372] Fps is (10 sec: 3276.5, 60 sec: 3959.4, 300 sec: 3998.8). Total num frames: 2809856. Throughput: 0: 975.3. Samples: 702786. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:18:01,693][15372] Avg episode reward: [(0, '21.401')] [2023-02-22 17:18:06,284][33578] Updated weights for policy 0, policy_version 1382 (0.0022) [2023-02-22 17:18:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2830336. Throughput: 0: 996.0. Samples: 708374. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:18:06,691][15372] Avg episode reward: [(0, '22.251')] [2023-02-22 17:18:10,552][33578] Updated weights for policy 0, policy_version 1392 (0.0016) [2023-02-22 17:18:11,684][15372] Fps is (10 sec: 4506.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2854912. Throughput: 0: 1020.6. Samples: 712020. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:18:11,691][15372] Avg episode reward: [(0, '22.710')] [2023-02-22 17:18:15,932][33578] Updated weights for policy 0, policy_version 1402 (0.0029) [2023-02-22 17:18:16,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3998.8). Total num frames: 2871296. Throughput: 0: 994.0. Samples: 718162. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:18:16,690][15372] Avg episode reward: [(0, '22.645')] [2023-02-22 17:18:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3984.9). Total num frames: 2883584. Throughput: 0: 929.2. Samples: 721586. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:18:21,690][15372] Avg episode reward: [(0, '22.425')] [2023-02-22 17:18:24,967][33578] Updated weights for policy 0, policy_version 1412 (0.0028) [2023-02-22 17:18:26,684][15372] Fps is (10 sec: 2048.0, 60 sec: 3686.4, 300 sec: 3943.3). Total num frames: 2891776. Throughput: 0: 913.8. Samples: 723252. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:18:26,690][15372] Avg episode reward: [(0, '23.640')] [2023-02-22 17:18:31,684][15372] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3915.5). Total num frames: 2908160. Throughput: 0: 878.1. Samples: 727078. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:18:31,690][15372] Avg episode reward: [(0, '24.139')] [2023-02-22 17:18:31,874][33578] Updated weights for policy 0, policy_version 1422 (0.0021) [2023-02-22 17:18:36,430][33578] Updated weights for policy 0, policy_version 1432 (0.0039) [2023-02-22 17:18:36,684][15372] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3929.4). Total num frames: 2932736. Throughput: 0: 861.1. Samples: 733760. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-22 17:18:36,689][15372] Avg episode reward: [(0, '24.985')] [2023-02-22 17:18:41,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3618.1, 300 sec: 3943.3). Total num frames: 2949120. Throughput: 0: 860.6. Samples: 737074. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0) [2023-02-22 17:18:41,688][15372] Avg episode reward: [(0, '23.824')] [2023-02-22 17:18:41,916][33578] Updated weights for policy 0, policy_version 1442 (0.0019) [2023-02-22 17:18:46,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3915.5). Total num frames: 2965504. Throughput: 0: 855.7. Samples: 741292. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:18:46,687][15372] Avg episode reward: [(0, '24.265')] [2023-02-22 17:18:49,571][33578] Updated weights for policy 0, policy_version 1452 (0.0022) [2023-02-22 17:18:51,684][15372] Fps is (10 sec: 3277.0, 60 sec: 3413.3, 300 sec: 3887.7). Total num frames: 2981888. Throughput: 0: 838.0. Samples: 746086. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:18:51,692][15372] Avg episode reward: [(0, '24.066')] [2023-02-22 17:18:54,473][33578] Updated weights for policy 0, policy_version 1462 (0.0015) [2023-02-22 17:18:56,684][15372] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3901.7). Total num frames: 3002368. Throughput: 0: 825.5. Samples: 749168. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:18:56,687][15372] Avg episode reward: [(0, '23.976')] [2023-02-22 17:18:59,275][33578] Updated weights for policy 0, policy_version 1472 (0.0011) [2023-02-22 17:19:01,685][15372] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3901.6). Total num frames: 3018752. Throughput: 0: 824.0. Samples: 755244. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:19:01,692][15372] Avg episode reward: [(0, '24.608')] [2023-02-22 17:19:06,249][33578] Updated weights for policy 0, policy_version 1482 (0.0011) [2023-02-22 17:19:06,688][15372] Fps is (10 sec: 3275.7, 60 sec: 3413.1, 300 sec: 3873.8). Total num frames: 3035136. Throughput: 0: 843.2. Samples: 759532. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:19:06,690][15372] Avg episode reward: [(0, '24.737')] [2023-02-22 17:19:11,612][33578] Updated weights for policy 0, policy_version 1492 (0.0014) [2023-02-22 17:19:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3860.0). Total num frames: 3055616. Throughput: 0: 858.5. Samples: 761886. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2023-02-22 17:19:11,687][15372] Avg episode reward: [(0, '25.048')] [2023-02-22 17:19:16,014][33578] Updated weights for policy 0, policy_version 1502 (0.0022) [2023-02-22 17:19:16,684][15372] Fps is (10 sec: 4097.4, 60 sec: 3413.3, 300 sec: 3873.8). Total num frames: 3076096. Throughput: 0: 928.8. Samples: 768876. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:19:16,686][15372] Avg episode reward: [(0, '25.006')] [2023-02-22 17:19:21,025][33578] Updated weights for policy 0, policy_version 1512 (0.0016) [2023-02-22 17:19:21,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3549.8, 300 sec: 3901.6). Total num frames: 3096576. Throughput: 0: 917.1. Samples: 775028. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:19:21,689][15372] Avg episode reward: [(0, '25.792')] [2023-02-22 17:19:26,599][33564] Early stopping after 2 epochs (2 sgd steps), loss delta 0.0000009 [2023-02-22 17:19:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3873.8). Total num frames: 3112960. Throughput: 0: 891.3. Samples: 777180. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0) [2023-02-22 17:19:26,689][15372] Avg episode reward: [(0, '25.796')] [2023-02-22 17:19:28,136][33578] Updated weights for policy 0, policy_version 1522 (0.0016) [2023-02-22 17:19:31,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 3129344. Throughput: 0: 906.0. Samples: 782060. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:19:31,690][15372] Avg episode reward: [(0, '25.659')] [2023-02-22 17:19:32,971][33578] Updated weights for policy 0, policy_version 1532 (0.0022) [2023-02-22 17:19:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3873.9). Total num frames: 3153920. Throughput: 0: 959.4. Samples: 789260. 
Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:19:36,692][15372] Avg episode reward: [(0, '26.129')] [2023-02-22 17:19:37,276][33578] Updated weights for policy 0, policy_version 1542 (0.0024) [2023-02-22 17:19:41,685][15372] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 3174400. Throughput: 0: 967.3. Samples: 792698. Policy #0 lag: (min: 1.0, avg: 2.3, max: 3.0) [2023-02-22 17:19:41,691][15372] Avg episode reward: [(0, '25.447')] [2023-02-22 17:19:41,706][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001550_3174400.pth... [2023-02-22 17:19:41,830][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001100_2252800.pth [2023-02-22 17:19:42,970][33578] Updated weights for policy 0, policy_version 1552 (0.0030) [2023-02-22 17:19:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3860.0). Total num frames: 3186688. Throughput: 0: 930.8. Samples: 797132. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:19:46,688][15372] Avg episode reward: [(0, '25.577')] [2023-02-22 17:19:49,699][33578] Updated weights for policy 0, policy_version 1562 (0.0037) [2023-02-22 17:19:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3207168. Throughput: 0: 955.5. Samples: 802524. Policy #0 lag: (min: 1.0, avg: 2.3, max: 3.0) [2023-02-22 17:19:51,687][15372] Avg episode reward: [(0, '25.707')] [2023-02-22 17:19:54,059][33578] Updated weights for policy 0, policy_version 1572 (0.0015) [2023-02-22 17:19:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3231744. Throughput: 0: 980.0. Samples: 805984. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:19:56,687][15372] Avg episode reward: [(0, '24.183')] [2023-02-22 17:19:58,537][33578] Updated weights for policy 0, policy_version 1582 (0.0016) [2023-02-22 17:20:01,687][15372] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3873.8). Total num frames: 3248128. Throughput: 0: 965.3. Samples: 812318. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:20:01,689][15372] Avg episode reward: [(0, '24.046')] [2023-02-22 17:20:05,052][33578] Updated weights for policy 0, policy_version 1592 (0.0023) [2023-02-22 17:20:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3823.2, 300 sec: 3846.1). Total num frames: 3264512. Throughput: 0: 924.1. Samples: 816614. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0) [2023-02-22 17:20:06,688][15372] Avg episode reward: [(0, '24.785')] [2023-02-22 17:20:10,983][33578] Updated weights for policy 0, policy_version 1602 (0.0026) [2023-02-22 17:20:11,684][15372] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3280896. Throughput: 0: 930.3. Samples: 819044. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:20:11,687][15372] Avg episode reward: [(0, '25.066')] [2023-02-22 17:20:15,421][33578] Updated weights for policy 0, policy_version 1612 (0.0019) [2023-02-22 17:20:16,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3305472. Throughput: 0: 976.6. Samples: 826006. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:20:16,687][15372] Avg episode reward: [(0, '25.951')] [2023-02-22 17:20:20,409][33578] Updated weights for policy 0, policy_version 1622 (0.0029) [2023-02-22 17:20:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3321856. Throughput: 0: 946.2. Samples: 831840. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:20:21,687][15372] Avg episode reward: [(0, '25.906')]
[2023-02-22 17:20:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3338240. Throughput: 0: 918.2. Samples: 834018. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:20:26,690][15372] Avg episode reward: [(0, '26.377')]
[2023-02-22 17:20:27,604][33578] Updated weights for policy 0, policy_version 1632 (0.0041)
[2023-02-22 17:20:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3358720. Throughput: 0: 932.0. Samples: 839070. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:20:31,690][15372] Avg episode reward: [(0, '26.738')]
[2023-02-22 17:20:32,472][33578] Updated weights for policy 0, policy_version 1642 (0.0017)
[2023-02-22 17:20:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3379200. Throughput: 0: 966.6. Samples: 846022. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:20:36,688][15372] Avg episode reward: [(0, '25.367')]
[2023-02-22 17:20:36,806][33578] Updated weights for policy 0, policy_version 1652 (0.0012)
[2023-02-22 17:20:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3399680. Throughput: 0: 962.2. Samples: 849284. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:20:41,688][15372] Avg episode reward: [(0, '25.343')]
[2023-02-22 17:20:42,789][33578] Updated weights for policy 0, policy_version 1662 (0.0027)
[2023-02-22 17:20:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3411968. Throughput: 0: 915.2. Samples: 853502. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:20:46,692][15372] Avg episode reward: [(0, '25.246')]
[2023-02-22 17:20:49,348][33578] Updated weights for policy 0, policy_version 1672 (0.0018)
[2023-02-22 17:20:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 3432448. Throughput: 0: 948.0. Samples: 859276. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:20:51,691][15372] Avg episode reward: [(0, '24.799')]
[2023-02-22 17:20:53,722][33578] Updated weights for policy 0, policy_version 1682 (0.0012)
[2023-02-22 17:20:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3457024. Throughput: 0: 971.6. Samples: 862766. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:20:56,687][15372] Avg episode reward: [(0, '24.270')]
[2023-02-22 17:20:58,050][33578] Updated weights for policy 0, policy_version 1692 (0.0016)
[2023-02-22 17:21:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3818.3). Total num frames: 3473408. Throughput: 0: 954.8. Samples: 868974. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:21:01,687][15372] Avg episode reward: [(0, '23.982')]
[2023-02-22 17:21:04,864][33578] Updated weights for policy 0, policy_version 1702 (0.0027)
[2023-02-22 17:21:06,685][15372] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 3489792. Throughput: 0: 921.0. Samples: 873286. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:21:06,694][15372] Avg episode reward: [(0, '24.228')]
[2023-02-22 17:21:10,404][33578] Updated weights for policy 0, policy_version 1712 (0.0022)
[2023-02-22 17:21:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 3510272. Throughput: 0: 933.3. Samples: 876018. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:21:11,687][15372] Avg episode reward: [(0, '22.694')]
[2023-02-22 17:21:14,818][33578] Updated weights for policy 0, policy_version 1722 (0.0023)
[2023-02-22 17:21:16,684][15372] Fps is (10 sec: 4506.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3534848. Throughput: 0: 976.8. Samples: 883024. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:21:16,688][15372] Avg episode reward: [(0, '23.563')]
[2023-02-22 17:21:19,688][33578] Updated weights for policy 0, policy_version 1732 (0.0011)
[2023-02-22 17:21:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3551232. Throughput: 0: 946.7. Samples: 888624. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:21:21,687][15372] Avg episode reward: [(0, '23.884')]
[2023-02-22 17:21:26,609][33578] Updated weights for policy 0, policy_version 1742 (0.0022)
[2023-02-22 17:21:26,685][15372] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3567616. Throughput: 0: 924.2. Samples: 890872. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0)
[2023-02-22 17:21:26,690][15372] Avg episode reward: [(0, '23.812')]
[2023-02-22 17:21:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3584000. Throughput: 0: 948.3. Samples: 896176. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:21:31,687][15372] Avg episode reward: [(0, '24.462')]
[2023-02-22 17:21:31,779][33578] Updated weights for policy 0, policy_version 1752 (0.0013)
[2023-02-22 17:21:36,101][33578] Updated weights for policy 0, policy_version 1762 (0.0013)
[2023-02-22 17:21:36,684][15372] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3608576. Throughput: 0: 978.2. Samples: 903294. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:21:36,686][15372] Avg episode reward: [(0, '25.394')]
[2023-02-22 17:21:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3624960. Throughput: 0: 967.0. Samples: 906282. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:21:41,688][15372] Avg episode reward: [(0, '25.717')]
[2023-02-22 17:21:41,723][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001772_3629056.pth...
[2023-02-22 17:21:41,731][33578] Updated weights for policy 0, policy_version 1772 (0.0014)
[2023-02-22 17:21:41,867][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001334_2732032.pth
[2023-02-22 17:21:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3641344. Throughput: 0: 921.5. Samples: 910440. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:21:46,686][15372] Avg episode reward: [(0, '26.905')]
[2023-02-22 17:21:48,631][33578] Updated weights for policy 0, policy_version 1782 (0.0014)
[2023-02-22 17:21:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3749.0). Total num frames: 3661824. Throughput: 0: 953.4. Samples: 916190. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:21:51,687][15372] Avg episode reward: [(0, '27.647')]
[2023-02-22 17:21:51,700][33564] Saving new best policy, reward=27.647!
[2023-02-22 17:21:53,322][33578] Updated weights for policy 0, policy_version 1792 (0.0012)
[2023-02-22 17:21:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3682304. Throughput: 0: 966.4. Samples: 919506. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:21:56,688][15372] Avg episode reward: [(0, '28.720')]
[2023-02-22 17:21:56,692][33564] Saving new best policy, reward=28.720!
[2023-02-22 17:21:57,725][33578] Updated weights for policy 0, policy_version 1802 (0.0012)
[2023-02-22 17:22:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3698688. Throughput: 0: 937.2. Samples: 925198. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:22:01,687][15372] Avg episode reward: [(0, '29.615')]
[2023-02-22 17:22:01,708][33564] Saving new best policy, reward=29.615!
[2023-02-22 17:22:04,866][33578] Updated weights for policy 0, policy_version 1812 (0.0019)
[2023-02-22 17:22:06,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 3715072. Throughput: 0: 904.6. Samples: 929330. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:22:06,692][15372] Avg episode reward: [(0, '29.416')]
[2023-02-22 17:22:10,436][33578] Updated weights for policy 0, policy_version 1822 (0.0018)
[2023-02-22 17:22:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 3735552. Throughput: 0: 920.0. Samples: 932270. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:22:11,687][15372] Avg episode reward: [(0, '27.865')]
[2023-02-22 17:22:15,033][33578] Updated weights for policy 0, policy_version 1832 (0.0016)
[2023-02-22 17:22:16,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 3756032. Throughput: 0: 956.6. Samples: 939222. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0)
[2023-02-22 17:22:16,687][15372] Avg episode reward: [(0, '27.128')]
[2023-02-22 17:22:20,506][33578] Updated weights for policy 0, policy_version 1842 (0.0027)
[2023-02-22 17:22:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 3772416. Throughput: 0: 915.8. Samples: 944504. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:22:21,692][15372] Avg episode reward: [(0, '27.646')]
[2023-02-22 17:22:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 3788800. Throughput: 0: 897.2. Samples: 946658. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:22:26,691][15372] Avg episode reward: [(0, '27.291')]
[2023-02-22 17:22:27,359][33578] Updated weights for policy 0, policy_version 1852 (0.0016)
[2023-02-22 17:22:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 3809280. Throughput: 0: 929.7. Samples: 952278. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0)
[2023-02-22 17:22:31,686][15372] Avg episode reward: [(0, '26.788')]
[2023-02-22 17:22:31,908][33578] Updated weights for policy 0, policy_version 1862 (0.0014)
[2023-02-22 17:22:36,330][33578] Updated weights for policy 0, policy_version 1872 (0.0013)
[2023-02-22 17:22:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 3833856. Throughput: 0: 957.0. Samples: 959256. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:22:36,692][15372] Avg episode reward: [(0, '27.143')]
[2023-02-22 17:22:41,690][15372] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3721.0). Total num frames: 3850240. Throughput: 0: 944.9. Samples: 962032. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:22:41,692][15372] Avg episode reward: [(0, '27.911')]
[2023-02-22 17:22:42,523][33578] Updated weights for policy 0, policy_version 1882 (0.0024)
[2023-02-22 17:22:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3866624. Throughput: 0: 913.3. Samples: 966298. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:22:46,689][15372] Avg episode reward: [(0, '27.326')]
[2023-02-22 17:22:48,823][33578] Updated weights for policy 0, policy_version 1892 (0.0035)
[2023-02-22 17:22:51,684][15372] Fps is (10 sec: 3688.4, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 3887104. Throughput: 0: 955.2. Samples: 972314. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:22:51,686][15372] Avg episode reward: [(0, '25.814')]
[2023-02-22 17:22:53,311][33578] Updated weights for policy 0, policy_version 1902 (0.0012)
[2023-02-22 17:22:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 3907584. Throughput: 0: 963.7. Samples: 975638. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:22:56,694][15372] Avg episode reward: [(0, '26.353')]
[2023-02-22 17:22:58,430][33578] Updated weights for policy 0, policy_version 1912 (0.0019)
[2023-02-22 17:23:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 3923968. Throughput: 0: 930.7. Samples: 981102. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:23:01,692][15372] Avg episode reward: [(0, '25.670')]
[2023-02-22 17:23:05,228][33578] Updated weights for policy 0, policy_version 1922 (0.0016)
[2023-02-22 17:23:06,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3679.5). Total num frames: 3940352. Throughput: 0: 910.4. Samples: 985472. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:23:06,690][15372] Avg episode reward: [(0, '24.492')]
[2023-02-22 17:23:10,575][33578] Updated weights for policy 0, policy_version 1932 (0.0011)
[2023-02-22 17:23:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3960832. Throughput: 0: 927.9. Samples: 988412. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:23:11,690][15372] Avg episode reward: [(0, '25.193')]
[2023-02-22 17:23:15,250][33578] Updated weights for policy 0, policy_version 1942 (0.0021)
[2023-02-22 17:23:16,684][15372] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 3981312. Throughput: 0: 956.0. Samples: 995296. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:23:16,691][15372] Avg episode reward: [(0, '25.771')]
[2023-02-22 17:23:20,543][33578] Updated weights for policy 0, policy_version 1952 (0.0011)
[2023-02-22 17:23:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3997696. Throughput: 0: 920.3. Samples: 1000670. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:23:21,689][15372] Avg episode reward: [(0, '27.050')]
[2023-02-22 17:23:26,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 4014080. Throughput: 0: 906.8. Samples: 1002834. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:23:26,691][15372] Avg episode reward: [(0, '26.777')]
[2023-02-22 17:23:27,475][33578] Updated weights for policy 0, policy_version 1962 (0.0025)
[2023-02-22 17:23:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 4034560. Throughput: 0: 940.2. Samples: 1008606. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:23:31,687][15372] Avg episode reward: [(0, '26.154')]
[2023-02-22 17:23:31,863][33578] Updated weights for policy 0, policy_version 1972 (0.0011)
[2023-02-22 17:23:36,170][33578] Updated weights for policy 0, policy_version 1982 (0.0014)
[2023-02-22 17:23:36,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 4059136. Throughput: 0: 965.0. Samples: 1015740. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:23:36,687][15372] Avg episode reward: [(0, '26.164')]
[2023-02-22 17:23:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3755.0, 300 sec: 3762.8). Total num frames: 4075520. Throughput: 0: 946.3. Samples: 1018220. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:23:41,693][15372] Avg episode reward: [(0, '25.232')]
[2023-02-22 17:23:41,707][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001990_4075520.pth...
[2023-02-22 17:23:41,873][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001550_3174400.pth
[2023-02-22 17:23:42,885][33578] Updated weights for policy 0, policy_version 1992 (0.0011)
[2023-02-22 17:23:46,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 4087808. Throughput: 0: 919.4. Samples: 1022474. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:23:46,692][15372] Avg episode reward: [(0, '25.905')]
[2023-02-22 17:23:48,859][33578] Updated weights for policy 0, policy_version 2002 (0.0012)
[2023-02-22 17:23:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 4112384. Throughput: 0: 956.9. Samples: 1028532. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:23:51,687][15372] Avg episode reward: [(0, '24.952')]
[2023-02-22 17:23:53,389][33578] Updated weights for policy 0, policy_version 2012 (0.0012)
[2023-02-22 17:23:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 4132864. Throughput: 0: 970.4. Samples: 1032078. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:23:56,690][15372] Avg episode reward: [(0, '25.769')]
[2023-02-22 17:23:58,300][33578] Updated weights for policy 0, policy_version 2022 (0.0018)
[2023-02-22 17:24:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 4149248. Throughput: 0: 935.9. Samples: 1037410. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:24:01,691][15372] Avg episode reward: [(0, '26.959')]
[2023-02-22 17:24:05,567][33578] Updated weights for policy 0, policy_version 2032 (0.0025)
[2023-02-22 17:24:06,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 4161536. Throughput: 0: 911.2. Samples: 1041672. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:24:06,693][15372] Avg episode reward: [(0, '26.180')]
[2023-02-22 17:24:10,660][33578] Updated weights for policy 0, policy_version 2042 (0.0020)
[2023-02-22 17:24:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 4186112. Throughput: 0: 932.8. Samples: 1044810. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:24:11,692][15372] Avg episode reward: [(0, '24.667')]
[2023-02-22 17:24:15,435][33578] Updated weights for policy 0, policy_version 2052 (0.0012)
[2023-02-22 17:24:16,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 4206592. Throughput: 0: 949.1. Samples: 1051314. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:24:16,687][15372] Avg episode reward: [(0, '24.110')]
[2023-02-22 17:24:21,691][15372] Fps is (10 sec: 3274.7, 60 sec: 3686.0, 300 sec: 3748.8). Total num frames: 4218880. Throughput: 0: 893.9. Samples: 1055970. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:24:21,693][15372] Avg episode reward: [(0, '23.200')]
[2023-02-22 17:24:22,067][33578] Updated weights for policy 0, policy_version 2062 (0.0026)
[2023-02-22 17:24:26,686][15372] Fps is (10 sec: 2457.3, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 4231168. Throughput: 0: 876.6. Samples: 1057668. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:24:26,688][15372] Avg episode reward: [(0, '22.881')]
[2023-02-22 17:24:31,178][33578] Updated weights for policy 0, policy_version 2072 (0.0046)
[2023-02-22 17:24:31,684][15372] Fps is (10 sec: 2459.2, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 4243456. Throughput: 0: 855.4. Samples: 1060966. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:24:31,689][15372] Avg episode reward: [(0, '21.914')]
[2023-02-22 17:24:36,092][33578] Updated weights for policy 0, policy_version 2082 (0.0020)
[2023-02-22 17:24:36,684][15372] Fps is (10 sec: 3277.2, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 4263936. Throughput: 0: 855.0. Samples: 1067008. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:24:36,687][15372] Avg episode reward: [(0, '22.298')]
[2023-02-22 17:24:40,403][33578] Updated weights for policy 0, policy_version 2092 (0.0023)
[2023-02-22 17:24:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 4288512. Throughput: 0: 856.0. Samples: 1070600. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:24:41,689][15372] Avg episode reward: [(0, '23.097')]
[2023-02-22 17:24:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 4300800. Throughput: 0: 845.9. Samples: 1075474. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:24:46,687][15372] Avg episode reward: [(0, '23.881')]
[2023-02-22 17:24:47,328][33578] Updated weights for policy 0, policy_version 2102 (0.0052)
[2023-02-22 17:24:51,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3679.5). Total num frames: 4317184. Throughput: 0: 857.9. Samples: 1080278. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:24:51,687][15372] Avg episode reward: [(0, '23.214')]
[2023-02-22 17:24:52,770][33578] Updated weights for policy 0, policy_version 2112 (0.0021)
[2023-02-22 17:24:56,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3481.6, 300 sec: 3707.3). Total num frames: 4341760. Throughput: 0: 865.5. Samples: 1083760. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0)
[2023-02-22 17:24:56,687][15372] Avg episode reward: [(0, '23.893')]
[2023-02-22 17:24:57,144][33578] Updated weights for policy 0, policy_version 2122 (0.0012)
[2023-02-22 17:25:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3721.1). Total num frames: 4362240. Throughput: 0: 876.1. Samples: 1090740. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
[2023-02-22 17:25:01,689][15372] Avg episode reward: [(0, '24.957')]
[2023-02-22 17:25:02,421][33578] Updated weights for policy 0, policy_version 2132 (0.0017)
[2023-02-22 17:25:06,684][15372] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 4378624. Throughput: 0: 870.5. Samples: 1095136. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:25:06,693][15372] Avg episode reward: [(0, '25.507')]
[2023-02-22 17:25:09,524][33578] Updated weights for policy 0, policy_version 2142 (0.0022)
[2023-02-22 17:25:11,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 4395008. Throughput: 0: 880.5. Samples: 1097290. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:25:11,687][15372] Avg episode reward: [(0, '24.778')]
[2023-02-22 17:25:14,447][33578] Updated weights for policy 0, policy_version 2152 (0.0012)
[2023-02-22 17:25:16,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 4415488. Throughput: 0: 946.0. Samples: 1103536. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:25:16,687][15372] Avg episode reward: [(0, '24.289')]
[2023-02-22 17:25:18,935][33578] Updated weights for policy 0, policy_version 2162 (0.0018)
[2023-02-22 17:25:21,689][15372] Fps is (10 sec: 4094.2, 60 sec: 3618.2, 300 sec: 3721.1). Total num frames: 4435968. Throughput: 0: 953.8. Samples: 1109932. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:25:21,691][15372] Avg episode reward: [(0, '23.905')]
[2023-02-22 17:25:25,250][33578] Updated weights for policy 0, policy_version 2172 (0.0036)
[2023-02-22 17:25:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3707.2). Total num frames: 4452352. Throughput: 0: 921.3. Samples: 1112060. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:25:26,687][15372] Avg episode reward: [(0, '24.234')]
[2023-02-22 17:25:31,629][33578] Updated weights for policy 0, policy_version 2182 (0.0040)
[2023-02-22 17:25:31,684][15372] Fps is (10 sec: 3278.3, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 4468736. Throughput: 0: 911.5. Samples: 1116490. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:25:31,687][15372] Avg episode reward: [(0, '23.575')]
[2023-02-22 17:25:35,917][33578] Updated weights for policy 0, policy_version 2192 (0.0013)
[2023-02-22 17:25:36,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 4489216. Throughput: 0: 959.5. Samples: 1123454. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:25:36,690][15372] Avg episode reward: [(0, '24.429')]
[2023-02-22 17:25:40,349][33578] Updated weights for policy 0, policy_version 2202 (0.0011)
[2023-02-22 17:25:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 4513792. Throughput: 0: 959.0. Samples: 1126914. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:25:41,688][15372] Avg episode reward: [(0, '24.585')]
[2023-02-22 17:25:41,703][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002204_4513792.pth...
[2023-02-22 17:25:41,850][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001772_3629056.pth
[2023-02-22 17:25:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 4526080. Throughput: 0: 910.0. Samples: 1131690. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0)
[2023-02-22 17:25:46,691][15372] Avg episode reward: [(0, '25.496')]
[2023-02-22 17:25:47,448][33578] Updated weights for policy 0, policy_version 2212 (0.0022)
[2023-02-22 17:25:51,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 4542464. Throughput: 0: 920.7. Samples: 1136568. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0)
[2023-02-22 17:25:51,690][15372] Avg episode reward: [(0, '26.631')]
[2023-02-22 17:25:52,804][33578] Updated weights for policy 0, policy_version 2222 (0.0012)
[2023-02-22 17:25:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 4567040. Throughput: 0: 951.7. Samples: 1140116. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
[2023-02-22 17:25:56,691][15372] Avg episode reward: [(0, '27.061')]
[2023-02-22 17:25:57,201][33578] Updated weights for policy 0, policy_version 2232 (0.0015)
[2023-02-22 17:26:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4587520. Throughput: 0: 965.9. Samples: 1147002. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:26:01,693][15372] Avg episode reward: [(0, '26.593')]
[2023-02-22 17:26:02,708][33578] Updated weights for policy 0, policy_version 2242 (0.0021)
[2023-02-22 17:26:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 4599808. Throughput: 0: 919.9. Samples: 1151322. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:26:06,692][15372] Avg episode reward: [(0, '27.164')]
[2023-02-22 17:26:09,718][33578] Updated weights for policy 0, policy_version 2252 (0.0024)
[2023-02-22 17:26:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 4620288. Throughput: 0: 922.5. Samples: 1153572. Policy #0 lag: (min: 0.0, avg: 1.2, max: 2.0)
[2023-02-22 17:26:11,693][15372] Avg episode reward: [(0, '28.312')]
[2023-02-22 17:26:14,307][33578] Updated weights for policy 0, policy_version 2262 (0.0018)
[2023-02-22 17:26:16,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 4640768. Throughput: 0: 971.2. Samples: 1160192. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:26:16,694][15372] Avg episode reward: [(0, '28.845')]
[2023-02-22 17:26:18,731][33578] Updated weights for policy 0, policy_version 2272 (0.0011)
[2023-02-22 17:26:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3755.0, 300 sec: 3707.2). Total num frames: 4661248. Throughput: 0: 959.9. Samples: 1166650. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:26:21,690][15372] Avg episode reward: [(0, '29.390')]
[2023-02-22 17:26:24,779][33578] Updated weights for policy 0, policy_version 2282 (0.0021)
[2023-02-22 17:26:26,685][15372] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3707.2). Total num frames: 4677632. Throughput: 0: 931.6. Samples: 1168836. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0)
[2023-02-22 17:26:26,691][15372] Avg episode reward: [(0, '29.674')]
[2023-02-22 17:26:26,701][33564] Saving new best policy, reward=29.674!
[2023-02-22 17:26:31,063][33578] Updated weights for policy 0, policy_version 2292 (0.0021)
[2023-02-22 17:26:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 4694016. Throughput: 0: 926.0. Samples: 1173362. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:26:31,693][15372] Avg episode reward: [(0, '28.220')]
[2023-02-22 17:26:35,564][33578] Updated weights for policy 0, policy_version 2302 (0.0012)
[2023-02-22 17:26:36,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 4718592. Throughput: 0: 970.7. Samples: 1180248. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0)
[2023-02-22 17:26:36,687][15372] Avg episode reward: [(0, '26.743')]
[2023-02-22 17:26:39,920][33578] Updated weights for policy 0, policy_version 2312 (0.0032)
[2023-02-22 17:26:41,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4739072. Throughput: 0: 970.6. Samples: 1183794. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:26:41,688][15372] Avg episode reward: [(0, '26.328')]
[2023-02-22 17:26:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 4751360. Throughput: 0: 915.3. Samples: 1188190. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:26:46,689][15372] Avg episode reward: [(0, '25.831')]
[2023-02-22 17:26:47,194][33578] Updated weights for policy 0, policy_version 2322 (0.0024)
[2023-02-22 17:26:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 4771840. Throughput: 0: 937.1. Samples: 1193490. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:26:51,686][15372] Avg episode reward: [(0, '24.803')]
[2023-02-22 17:26:52,421][33578] Updated weights for policy 0, policy_version 2332 (0.0013)
[2023-02-22 17:26:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 4792320. Throughput: 0: 964.6. Samples: 1196978. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:26:56,689][15372] Avg episode reward: [(0, '24.763')]
[2023-02-22 17:26:56,913][33578] Updated weights for policy 0, policy_version 2342 (0.0017)
[2023-02-22 17:27:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4812800. Throughput: 0: 966.3. Samples: 1203676. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:27:01,687][15372] Avg episode reward: [(0, '26.055')]
[2023-02-22 17:27:02,254][33578] Updated weights for policy 0, policy_version 2352 (0.0012)
[2023-02-22 17:27:06,689][15372] Fps is (10 sec: 3684.5, 60 sec: 3822.6, 300 sec: 3707.2). Total num frames: 4829184. Throughput: 0: 918.8. Samples: 1208000. Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0)
[2023-02-22 17:27:06,696][15372] Avg episode reward: [(0, '27.480')]
[2023-02-22 17:27:09,176][33578] Updated weights for policy 0, policy_version 2362 (0.0026)
[2023-02-22 17:27:11,686][15372] Fps is (10 sec: 3276.2, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 4845568. Throughput: 0: 919.4. Samples: 1210212. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:27:11,691][15372] Avg episode reward: [(0, '26.941')]
[2023-02-22 17:27:13,647][33578] Updated weights for policy 0, policy_version 2372 (0.0027)
[2023-02-22 17:27:16,684][15372] Fps is (10 sec: 4098.1, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 4870144. Throughput: 0: 976.5. Samples: 1217304. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:27:16,687][15372] Avg episode reward: [(0, '28.036')]
[2023-02-22 17:27:17,955][33578] Updated weights for policy 0, policy_version 2382 (0.0015)
[2023-02-22 17:27:21,684][15372] Fps is (10 sec: 4506.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 4890624. Throughput: 0: 959.6. Samples: 1223432. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
[2023-02-22 17:27:21,696][15372] Avg episode reward: [(0, '28.176')]
[2023-02-22 17:27:23,797][33578] Updated weights for policy 0, policy_version 2392 (0.0026)
[2023-02-22 17:27:26,684][15372] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 4907008. Throughput: 0: 929.8. Samples: 1225634. Policy #0 lag: (min: 1.0, avg: 1.7, max: 4.0)
[2023-02-22 17:27:26,698][15372] Avg episode reward: [(0, '28.466')]
[2023-02-22 17:27:30,345][33578] Updated weights for policy 0, policy_version 2402 (0.0042)
[2023-02-22 17:27:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 4923392. Throughput: 0: 938.7. Samples: 1230430. Policy #0 lag: (min: 1.0, avg: 1.8, max: 4.0)
[2023-02-22 17:27:31,686][15372] Avg episode reward: [(0, '27.592')]
[2023-02-22 17:27:34,690][33578] Updated weights for policy 0, policy_version 2412 (0.0015)
[2023-02-22 17:27:36,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3721.2). Total num frames: 4947968. Throughput: 0: 977.0. Samples: 1237456. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:27:36,687][15372] Avg episode reward: [(0, '25.578')]
[2023-02-22 17:27:39,038][33578] Updated weights for policy 0, policy_version 2422 (0.0017)
[2023-02-22 17:27:41,685][15372] Fps is (10 sec: 4505.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 4968448. Throughput: 0: 978.1. Samples: 1240992. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:27:41,688][15372] Avg episode reward: [(0, '26.025')]
[2023-02-22 17:27:41,701][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002426_4968448.pth...
[2023-02-22 17:27:41,839][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001990_4075520.pth
[2023-02-22 17:27:45,967][33578] Updated weights for policy 0, policy_version 2432 (0.0033)
[2023-02-22 17:27:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 4980736. Throughput: 0: 923.3. Samples: 1245224. Policy #0 lag: (min: 0.0, avg: 0.8, max: 4.0)
[2023-02-22 17:27:46,693][15372] Avg episode reward: [(0, '25.455')]
[2023-02-22 17:27:51,685][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 4997120. Throughput: 0: 942.6. Samples: 1250414. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:27:51,687][15372] Avg episode reward: [(0, '25.951')]
[2023-02-22 17:27:51,863][33578] Updated weights for policy 0, policy_version 2442 (0.0026)
[2023-02-22 17:27:56,629][33578] Updated weights for policy 0, policy_version 2452 (0.0011)
[2023-02-22 17:27:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 5021696. Throughput: 0: 968.0. Samples: 1253772. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:27:56,690][15372] Avg episode reward: [(0, '26.549')]
[2023-02-22 17:28:01,686][15372] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3721.1). Total num frames: 5038080. Throughput: 0: 946.8. Samples: 1259912. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:28:01,689][15372] Avg episode reward: [(0, '27.483')]
[2023-02-22 17:28:02,388][33578] Updated weights for policy 0, policy_version 2462 (0.0021)
[2023-02-22 17:28:06,690][15372] Fps is (10 sec: 2865.6, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 5050368. Throughput: 0: 900.0. Samples: 1263936. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:28:06,692][15372] Avg episode reward: [(0, '27.634')]
[2023-02-22 17:28:09,553][33578] Updated weights for policy 0, policy_version 2472 (0.0027)
[2023-02-22 17:28:11,690][15372] Fps is (10 sec: 3275.6, 60 sec: 3754.4, 300 sec: 3693.3). Total num frames: 5070848. Throughput: 0: 903.4. Samples: 1266292. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:28:11,693][15372] Avg episode reward: [(0, '27.909')]
[2023-02-22 17:28:14,027][33578] Updated weights for policy 0, policy_version 2482 (0.0022)
[2023-02-22 17:28:16,684][15372] Fps is (10 sec: 4508.1, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 5095424. Throughput: 0: 948.4. Samples: 1273108. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:28:16,686][15372] Avg episode reward: [(0, '29.971')]
[2023-02-22 17:28:16,696][33564] Saving new best policy, reward=29.971!
[2023-02-22 17:28:18,438][33578] Updated weights for policy 0, policy_version 2492 (0.0029)
[2023-02-22 17:28:21,687][15372] Fps is (10 sec: 4097.2, 60 sec: 3686.2, 300 sec: 3721.1). Total num frames: 5111808. Throughput: 0: 922.1. Samples: 1278952. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:28:21,694][15372] Avg episode reward: [(0, '29.913')]
[2023-02-22 17:28:25,051][33578] Updated weights for policy 0, policy_version 2502 (0.0020)
[2023-02-22 17:28:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 5128192. Throughput: 0: 892.5. Samples: 1281152. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:28:26,688][15372] Avg episode reward: [(0, '29.735')]
[2023-02-22 17:28:31,037][33578] Updated weights for policy 0, policy_version 2512 (0.0021)
[2023-02-22 17:28:31,684][15372] Fps is (10 sec: 3277.6, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 5144576. Throughput: 0: 907.2. Samples: 1286050. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:28:31,690][15372] Avg episode reward: [(0, '28.606')]
[2023-02-22 17:28:35,825][33578] Updated weights for policy 0, policy_version 2522 (0.0012)
[2023-02-22 17:28:36,685][15372] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 5165056. Throughput: 0: 936.9. Samples: 1292574. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:28:36,687][15372] Avg episode reward: [(0, '29.544')]
[2023-02-22 17:28:41,116][33578] Updated weights for policy 0, policy_version 2532 (0.0025)
[2023-02-22 17:28:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3721.1). Total num frames: 5185536. Throughput: 0: 933.9. Samples: 1295796. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:28:41,688][15372] Avg episode reward: [(0, '29.386')]
[2023-02-22 17:28:46,685][15372] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3679.4). Total num frames: 5197824. Throughput: 0: 890.6. Samples: 1299990. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:28:46,691][15372] Avg episode reward: [(0, '29.498')]
[2023-02-22 17:28:48,282][33578] Updated weights for policy 0, policy_version 2542 (0.0033)
[2023-02-22 17:28:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 5218304. Throughput: 0: 926.2. Samples: 1305612. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:28:51,690][15372] Avg episode reward: [(0, '28.306')]
[2023-02-22 17:28:52,940][33578] Updated weights for policy 0, policy_version 2552 (0.0022)
[2023-02-22 17:28:56,684][15372] Fps is (10 sec: 4506.1, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 5242880. Throughput: 0: 951.8. Samples: 1309116. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:28:56,686][15372] Avg episode reward: [(0, '28.341')]
[2023-02-22 17:28:57,248][33578] Updated weights for policy 0, policy_version 2562 (0.0023)
[2023-02-22 17:29:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.5, 300 sec: 3721.1). Total num frames: 5259264. Throughput: 0: 939.0. Samples: 1315362. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:29:01,689][15372] Avg episode reward: [(0, '28.251')]
[2023-02-22 17:29:03,138][33578] Updated weights for policy 0, policy_version 2572 (0.0037)
[2023-02-22 17:29:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3755.0, 300 sec: 3693.3). Total num frames: 5275648. Throughput: 0: 905.7. Samples: 1319706. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:29:06,690][15372] Avg episode reward: [(0, '27.045')]
[2023-02-22 17:29:09,687][33578] Updated weights for policy 0, policy_version 2582 (0.0020)
[2023-02-22 17:29:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3755.0, 300 sec: 3693.3). Total num frames: 5296128. Throughput: 0: 916.2. Samples: 1322382. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:29:11,691][15372] Avg episode reward: [(0, '26.719')]
[2023-02-22 17:29:14,109][33578] Updated weights for policy 0, policy_version 2592 (0.0014)
[2023-02-22 17:29:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.1). Total num frames: 5320704. Throughput: 0: 961.3. Samples: 1329310. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:29:16,688][15372] Avg episode reward: [(0, '26.380')]
[2023-02-22 17:29:18,478][33578] Updated weights for policy 0, policy_version 2602 (0.0012)
[2023-02-22 17:29:21,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3754.8, 300 sec: 3748.9). Total num frames: 5337088. Throughput: 0: 944.3. Samples: 1335070. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:29:21,689][15372] Avg episode reward: [(0, '25.350')]
[2023-02-22 17:29:25,466][33578] Updated weights for policy 0, policy_version 2612 (0.0030)
[2023-02-22 17:29:26,685][15372] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 5349376. Throughput: 0: 921.5. Samples: 1337262. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:29:26,689][15372] Avg episode reward: [(0, '24.518')]
[2023-02-22 17:29:31,080][33578] Updated weights for policy 0, policy_version 2622 (0.0031)
[2023-02-22 17:29:31,684][15372] Fps is (10 sec: 3277.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 5369856. Throughput: 0: 942.9. Samples: 1342418.
Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:29:31,691][15372] Avg episode reward: [(0, '25.056')] [2023-02-22 17:29:35,488][33578] Updated weights for policy 0, policy_version 2632 (0.0023) [2023-02-22 17:29:36,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 5394432. Throughput: 0: 971.0. Samples: 1349308. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:29:36,690][15372] Avg episode reward: [(0, '26.536')] [2023-02-22 17:29:40,773][33578] Updated weights for policy 0, policy_version 2642 (0.0013) [2023-02-22 17:29:41,687][15372] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 5410816. Throughput: 0: 959.0. Samples: 1352272. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:29:41,690][15372] Avg episode reward: [(0, '27.350')] [2023-02-22 17:29:41,702][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002642_5410816.pth... [2023-02-22 17:29:41,863][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002204_4513792.pth [2023-02-22 17:29:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 5427200. Throughput: 0: 917.7. Samples: 1356660. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:29:46,686][15372] Avg episode reward: [(0, '27.654')] [2023-02-22 17:29:47,883][33578] Updated weights for policy 0, policy_version 2652 (0.0024) [2023-02-22 17:29:51,684][15372] Fps is (10 sec: 3687.3, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 5447680. Throughput: 0: 949.3. Samples: 1362424. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:29:51,690][15372] Avg episode reward: [(0, '27.622')] [2023-02-22 17:29:52,485][33578] Updated weights for policy 0, policy_version 2662 (0.0012) [2023-02-22 17:29:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 5468160. Throughput: 0: 966.0. Samples: 1365852. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:29:56,687][15372] Avg episode reward: [(0, '29.558')] [2023-02-22 17:29:56,750][33578] Updated weights for policy 0, policy_version 2672 (0.0021) [2023-02-22 17:30:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 5488640. Throughput: 0: 948.5. Samples: 1371994. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:30:01,690][15372] Avg episode reward: [(0, '30.931')] [2023-02-22 17:30:01,706][33564] Saving new best policy, reward=30.931! [2023-02-22 17:30:03,095][33578] Updated weights for policy 0, policy_version 2682 (0.0024) [2023-02-22 17:30:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 5500928. Throughput: 0: 906.9. Samples: 1375880. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:30:06,690][15372] Avg episode reward: [(0, '30.492')] [2023-02-22 17:30:11,292][33578] Updated weights for policy 0, policy_version 2692 (0.0024) [2023-02-22 17:30:11,684][15372] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 5513216. Throughput: 0: 898.9. Samples: 1377710. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:30:11,691][15372] Avg episode reward: [(0, '30.719')] [2023-02-22 17:30:16,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3707.3). Total num frames: 5529600. Throughput: 0: 880.4. Samples: 1382034. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-22 17:30:16,693][15372] Avg episode reward: [(0, '29.851')] [2023-02-22 17:30:17,310][33578] Updated weights for policy 0, policy_version 2702 (0.0018) [2023-02-22 17:30:21,688][15372] Fps is (10 sec: 3685.1, 60 sec: 3549.7, 300 sec: 3721.1). Total num frames: 5550080. Throughput: 0: 871.6. Samples: 1388534. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0) [2023-02-22 17:30:21,695][15372] Avg episode reward: [(0, '29.767')] [2023-02-22 17:30:22,396][33578] Updated weights for policy 0, policy_version 2712 (0.0012) [2023-02-22 17:30:26,687][15372] Fps is (10 sec: 3276.0, 60 sec: 3549.7, 300 sec: 3707.2). Total num frames: 5562368. Throughput: 0: 853.6. Samples: 1390682. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:30:26,688][15372] Avg episode reward: [(0, '29.671')] [2023-02-22 17:30:29,569][33578] Updated weights for policy 0, policy_version 2722 (0.0018) [2023-02-22 17:30:31,684][15372] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 5582848. Throughput: 0: 852.8. Samples: 1395036. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:30:31,692][15372] Avg episode reward: [(0, '30.584')] [2023-02-22 17:30:34,227][33578] Updated weights for policy 0, policy_version 2732 (0.0012) [2023-02-22 17:30:36,685][15372] Fps is (10 sec: 4096.9, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 5603328. Throughput: 0: 880.8. Samples: 1402062. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0) [2023-02-22 17:30:36,694][15372] Avg episode reward: [(0, '30.041')] [2023-02-22 17:30:38,550][33578] Updated weights for policy 0, policy_version 2742 (0.0020) [2023-02-22 17:30:41,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3721.1). Total num frames: 5623808. Throughput: 0: 880.8. Samples: 1405490. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:30:41,696][15372] Avg episode reward: [(0, '29.161')] [2023-02-22 17:30:44,761][33578] Updated weights for policy 0, policy_version 2752 (0.0023) [2023-02-22 17:30:46,690][15372] Fps is (10 sec: 3684.5, 60 sec: 3549.5, 300 sec: 3721.0). Total num frames: 5640192. Throughput: 0: 847.2. Samples: 1410124. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0) [2023-02-22 17:30:46,694][15372] Avg episode reward: [(0, '28.582')] [2023-02-22 17:30:51,235][33578] Updated weights for policy 0, policy_version 2762 (0.0025) [2023-02-22 17:30:51,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 5656576. Throughput: 0: 872.0. Samples: 1415122. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:30:51,689][15372] Avg episode reward: [(0, '27.904')] [2023-02-22 17:30:55,502][33578] Updated weights for policy 0, policy_version 2772 (0.0025) [2023-02-22 17:30:56,684][15372] Fps is (10 sec: 4098.3, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 5681152. Throughput: 0: 910.8. Samples: 1418698. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:30:56,687][15372] Avg episode reward: [(0, '26.622')] [2023-02-22 17:30:59,723][33578] Updated weights for policy 0, policy_version 2782 (0.0020) [2023-02-22 17:31:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 5701632. Throughput: 0: 969.6. Samples: 1425664. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:31:01,690][15372] Avg episode reward: [(0, '25.835')] [2023-02-22 17:31:06,685][15372] Fps is (10 sec: 3276.5, 60 sec: 3549.8, 300 sec: 3707.2). Total num frames: 5713920. Throughput: 0: 917.1. Samples: 1429800. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:31:06,689][15372] Avg episode reward: [(0, '26.069')] [2023-02-22 17:31:06,779][33578] Updated weights for policy 0, policy_version 2792 (0.0050) [2023-02-22 17:31:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 5734400. Throughput: 0: 920.1. Samples: 1432084. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:31:11,687][15372] Avg episode reward: [(0, '25.715')] [2023-02-22 17:31:12,364][33578] Updated weights for policy 0, policy_version 2802 (0.0036) [2023-02-22 17:31:16,684][15372] Fps is (10 sec: 4096.4, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 5754880. Throughput: 0: 970.0. Samples: 1438688. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:31:16,687][15372] Avg episode reward: [(0, '26.535')] [2023-02-22 17:31:17,050][33578] Updated weights for policy 0, policy_version 2812 (0.0018) [2023-02-22 17:31:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3721.1). Total num frames: 5775360. Throughput: 0: 944.9. Samples: 1444584. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:31:21,689][15372] Avg episode reward: [(0, '26.661')] [2023-02-22 17:31:22,863][33578] Updated weights for policy 0, policy_version 2822 (0.0023) [2023-02-22 17:31:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3707.2). Total num frames: 5787648. Throughput: 0: 912.4. Samples: 1446548. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:31:26,692][15372] Avg episode reward: [(0, '27.393')] [2023-02-22 17:31:30,119][33578] Updated weights for policy 0, policy_version 2832 (0.0052) [2023-02-22 17:31:31,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 5804032. Throughput: 0: 911.0. Samples: 1451116. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:31:31,692][15372] Avg episode reward: [(0, '25.633')] [2023-02-22 17:31:34,550][33578] Updated weights for policy 0, policy_version 2842 (0.0017) [2023-02-22 17:31:36,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 5828608. Throughput: 0: 946.5. Samples: 1457716. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0) [2023-02-22 17:31:36,689][15372] Avg episode reward: [(0, '24.689')] [2023-02-22 17:31:39,320][33578] Updated weights for policy 0, policy_version 2852 (0.0012) [2023-02-22 17:31:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 5844992. Throughput: 0: 938.0. Samples: 1460906. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:31:41,688][15372] Avg episode reward: [(0, '24.727')] [2023-02-22 17:31:41,711][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002854_5844992.pth... [2023-02-22 17:31:41,883][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002426_4968448.pth [2023-02-22 17:31:46,372][33578] Updated weights for policy 0, policy_version 2862 (0.0028) [2023-02-22 17:31:46,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3686.7, 300 sec: 3693.3). Total num frames: 5861376. Throughput: 0: 879.0. Samples: 1465218. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:31:46,691][15372] Avg episode reward: [(0, '24.838')] [2023-02-22 17:31:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 5877760. Throughput: 0: 906.4. Samples: 1470586. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:31:51,693][15372] Avg episode reward: [(0, '25.015')] [2023-02-22 17:31:52,023][33578] Updated weights for policy 0, policy_version 2872 (0.0011) [2023-02-22 17:31:56,319][33578] Updated weights for policy 0, policy_version 2882 (0.0012) [2023-02-22 17:31:56,684][15372] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 5902336. Throughput: 0: 934.3. Samples: 1474126. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0) [2023-02-22 17:31:56,687][15372] Avg episode reward: [(0, '26.381')] [2023-02-22 17:32:01,186][33578] Updated weights for policy 0, policy_version 2892 (0.0013) [2023-02-22 17:32:01,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3707.3). Total num frames: 5922816. Throughput: 0: 933.1. Samples: 1480678. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:32:01,695][15372] Avg episode reward: [(0, '26.741')] [2023-02-22 17:32:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3693.4). Total num frames: 5935104. Throughput: 0: 899.5. Samples: 1485062. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:32:06,687][15372] Avg episode reward: [(0, '27.473')] [2023-02-22 17:32:08,394][33578] Updated weights for policy 0, policy_version 2902 (0.0027) [2023-02-22 17:32:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 5955584. Throughput: 0: 905.8. Samples: 1487310. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:32:11,692][15372] Avg episode reward: [(0, '27.444')] [2023-02-22 17:32:13,112][33578] Updated weights for policy 0, policy_version 2912 (0.0018) [2023-02-22 17:32:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 5980160. Throughput: 0: 961.3. Samples: 1494376. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:32:16,691][15372] Avg episode reward: [(0, '27.697')] [2023-02-22 17:32:17,351][33578] Updated weights for policy 0, policy_version 2922 (0.0011) [2023-02-22 17:32:21,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 5996544. Throughput: 0: 952.2. Samples: 1500566. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:32:21,694][15372] Avg episode reward: [(0, '28.655')] [2023-02-22 17:32:23,170][33578] Updated weights for policy 0, policy_version 2932 (0.0021) [2023-02-22 17:32:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 6012928. Throughput: 0: 929.4. Samples: 1502730. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:32:26,689][15372] Avg episode reward: [(0, '27.118')] [2023-02-22 17:32:29,587][33578] Updated weights for policy 0, policy_version 2942 (0.0026) [2023-02-22 17:32:31,684][15372] Fps is (10 sec: 3686.6, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 6033408. Throughput: 0: 946.1. Samples: 1507792. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0) [2023-02-22 17:32:31,687][15372] Avg episode reward: [(0, '27.519')] [2023-02-22 17:32:33,979][33578] Updated weights for policy 0, policy_version 2952 (0.0024) [2023-02-22 17:32:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3693.3). Total num frames: 6057984. Throughput: 0: 985.0. Samples: 1514910. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:32:36,686][15372] Avg episode reward: [(0, '27.416')] [2023-02-22 17:32:38,344][33578] Updated weights for policy 0, policy_version 2962 (0.0017) [2023-02-22 17:32:41,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 6074368. Throughput: 0: 980.1. Samples: 1518230. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:32:41,690][15372] Avg episode reward: [(0, '27.561')] [2023-02-22 17:32:45,087][33578] Updated weights for policy 0, policy_version 2972 (0.0015) [2023-02-22 17:32:46,686][15372] Fps is (10 sec: 3276.3, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 6090752. Throughput: 0: 928.5. Samples: 1522462. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:32:46,691][15372] Avg episode reward: [(0, '27.891')] [2023-02-22 17:32:51,307][33578] Updated weights for policy 0, policy_version 2982 (0.0026) [2023-02-22 17:32:51,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 6107136. Throughput: 0: 944.9. Samples: 1527584. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:32:51,686][15372] Avg episode reward: [(0, '28.026')] [2023-02-22 17:32:56,068][33578] Updated weights for policy 0, policy_version 2992 (0.0022) [2023-02-22 17:32:56,684][15372] Fps is (10 sec: 3687.0, 60 sec: 3754.7, 300 sec: 3693.4). Total num frames: 6127616. Throughput: 0: 966.6. Samples: 1530806. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:32:56,687][15372] Avg episode reward: [(0, '28.263')] [2023-02-22 17:33:01,687][15372] Fps is (10 sec: 3685.5, 60 sec: 3686.2, 300 sec: 3707.3). Total num frames: 6144000. Throughput: 0: 940.0. Samples: 1536680. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:33:01,690][15372] Avg episode reward: [(0, '29.457')] [2023-02-22 17:33:01,969][33578] Updated weights for policy 0, policy_version 3002 (0.0023) [2023-02-22 17:33:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.4). Total num frames: 6160384. Throughput: 0: 891.5. Samples: 1540684. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:33:06,695][15372] Avg episode reward: [(0, '28.566')] [2023-02-22 17:33:09,163][33578] Updated weights for policy 0, policy_version 3012 (0.0014) [2023-02-22 17:33:11,684][15372] Fps is (10 sec: 3277.6, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 6176768. Throughput: 0: 896.4. Samples: 1543068. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:33:11,693][15372] Avg episode reward: [(0, '26.700')] [2023-02-22 17:33:13,890][33578] Updated weights for policy 0, policy_version 3022 (0.0020) [2023-02-22 17:33:16,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3686.4, 300 sec: 3693.4). Total num frames: 6201344. Throughput: 0: 930.8. Samples: 1549678. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:33:16,690][15372] Avg episode reward: [(0, '25.856')] [2023-02-22 17:33:18,538][33578] Updated weights for policy 0, policy_version 3032 (0.0019) [2023-02-22 17:33:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 6217728. Throughput: 0: 891.4. Samples: 1555022. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:33:21,687][15372] Avg episode reward: [(0, '25.761')] [2023-02-22 17:33:25,936][33578] Updated weights for policy 0, policy_version 3042 (0.0017) [2023-02-22 17:33:26,686][15372] Fps is (10 sec: 2866.8, 60 sec: 3618.0, 300 sec: 3679.4). Total num frames: 6230016. Throughput: 0: 862.0. Samples: 1557020. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:33:26,691][15372] Avg episode reward: [(0, '26.225')] [2023-02-22 17:33:31,470][33578] Updated weights for policy 0, policy_version 3052 (0.0012) [2023-02-22 17:33:31,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 6250496. Throughput: 0: 879.1. Samples: 1562018. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:33:31,693][15372] Avg episode reward: [(0, '24.938')] [2023-02-22 17:33:35,894][33578] Updated weights for policy 0, policy_version 3062 (0.0017) [2023-02-22 17:33:36,684][15372] Fps is (10 sec: 4506.2, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 6275072. Throughput: 0: 923.5. Samples: 1569142. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:33:36,687][15372] Avg episode reward: [(0, '24.957')] [2023-02-22 17:33:41,189][33578] Updated weights for policy 0, policy_version 3072 (0.0013) [2023-02-22 17:33:41,686][15372] Fps is (10 sec: 4095.2, 60 sec: 3618.0, 300 sec: 3707.2). Total num frames: 6291456. Throughput: 0: 921.6. Samples: 1572282. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:33:41,695][15372] Avg episode reward: [(0, '25.454')] [2023-02-22 17:33:41,715][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003072_6291456.pth... [2023-02-22 17:33:41,872][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002642_5410816.pth [2023-02-22 17:33:46,684][15372] Fps is (10 sec: 2867.3, 60 sec: 3550.0, 300 sec: 3679.5). Total num frames: 6303744. Throughput: 0: 886.8. Samples: 1576582. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:33:46,691][15372] Avg episode reward: [(0, '25.956')] [2023-02-22 17:33:48,255][33578] Updated weights for policy 0, policy_version 3082 (0.0011) [2023-02-22 17:33:51,684][15372] Fps is (10 sec: 3277.5, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 6324224. Throughput: 0: 926.4. Samples: 1582370. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:33:51,687][15372] Avg episode reward: [(0, '27.697')] [2023-02-22 17:33:52,692][33578] Updated weights for policy 0, policy_version 3092 (0.0023) [2023-02-22 17:33:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 6348800. Throughput: 0: 952.0. Samples: 1585910. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0) [2023-02-22 17:33:56,691][15372] Avg episode reward: [(0, '27.486')] [2023-02-22 17:33:56,946][33578] Updated weights for policy 0, policy_version 3102 (0.0025) [2023-02-22 17:34:01,684][15372] Fps is (10 sec: 4505.5, 60 sec: 3754.8, 300 sec: 3707.2). Total num frames: 6369280. Throughput: 0: 945.4. Samples: 1592222. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:34:01,687][15372] Avg episode reward: [(0, '28.180')] [2023-02-22 17:34:02,905][33578] Updated weights for policy 0, policy_version 3112 (0.0032) [2023-02-22 17:34:06,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 6381568. Throughput: 0: 925.6. Samples: 1596674. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:34:06,689][15372] Avg episode reward: [(0, '27.917')] [2023-02-22 17:34:09,255][33578] Updated weights for policy 0, policy_version 3122 (0.0012) [2023-02-22 17:34:11,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 6402048. Throughput: 0: 943.1. Samples: 1599458. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0) [2023-02-22 17:34:11,686][15372] Avg episode reward: [(0, '29.683')] [2023-02-22 17:34:13,474][33578] Updated weights for policy 0, policy_version 3132 (0.0011) [2023-02-22 17:34:16,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 6426624. Throughput: 0: 988.1. Samples: 1606484. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-02-22 17:34:16,691][15372] Avg episode reward: [(0, '29.033')] [2023-02-22 17:34:17,761][33578] Updated weights for policy 0, policy_version 3142 (0.0029) [2023-02-22 17:34:21,689][15372] Fps is (10 sec: 4503.6, 60 sec: 3822.6, 300 sec: 3721.1). Total num frames: 6447104. Throughput: 0: 956.6. Samples: 1612194. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:34:21,696][15372] Avg episode reward: [(0, '26.683')] [2023-02-22 17:34:24,401][33578] Updated weights for policy 0, policy_version 3152 (0.0016) [2023-02-22 17:34:26,685][15372] Fps is (10 sec: 3276.6, 60 sec: 3823.0, 300 sec: 3693.3). Total num frames: 6459392. Throughput: 0: 934.7. Samples: 1614342. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 17:34:26,687][15372] Avg episode reward: [(0, '25.811')] [2023-02-22 17:34:30,519][33578] Updated weights for policy 0, policy_version 3162 (0.0020) [2023-02-22 17:34:31,685][15372] Fps is (10 sec: 3278.2, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 6479872. Throughput: 0: 956.4. Samples: 1619620. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0) [2023-02-22 17:34:31,688][15372] Avg episode reward: [(0, '25.119')] [2023-02-22 17:34:34,803][33578] Updated weights for policy 0, policy_version 3172 (0.0015) [2023-02-22 17:34:36,684][15372] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 6504448. Throughput: 0: 983.4. Samples: 1626622. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:34:36,687][15372] Avg episode reward: [(0, '26.316')] [2023-02-22 17:34:39,729][33578] Updated weights for policy 0, policy_version 3182 (0.0012) [2023-02-22 17:34:41,689][15372] Fps is (10 sec: 4094.4, 60 sec: 3822.8, 300 sec: 3707.2). Total num frames: 6520832. Throughput: 0: 967.6. Samples: 1629454. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:34:41,694][15372] Avg episode reward: [(0, '25.632')] [2023-02-22 17:34:46,687][15372] Fps is (10 sec: 2866.4, 60 sec: 3822.8, 300 sec: 3679.4). Total num frames: 6533120. Throughput: 0: 924.0. Samples: 1633806. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:34:46,695][15372] Avg episode reward: [(0, '26.124')] [2023-02-22 17:34:47,056][33578] Updated weights for policy 0, policy_version 3192 (0.0047) [2023-02-22 17:34:51,685][15372] Fps is (10 sec: 3278.1, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 6553600. Throughput: 0: 955.7. Samples: 1639682. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:34:51,687][15372] Avg episode reward: [(0, '27.494')] [2023-02-22 17:34:51,790][33578] Updated weights for policy 0, policy_version 3202 (0.0014) [2023-02-22 17:34:56,128][33578] Updated weights for policy 0, policy_version 3212 (0.0014) [2023-02-22 17:34:56,684][15372] Fps is (10 sec: 4506.8, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 6578176. Throughput: 0: 971.5. Samples: 1643174. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:34:56,686][15372] Avg episode reward: [(0, '29.514')] [2023-02-22 17:35:01,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 6594560. Throughput: 0: 948.6. Samples: 1649170. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-22 17:35:01,689][15372] Avg episode reward: [(0, '29.744')] [2023-02-22 17:35:01,937][33578] Updated weights for policy 0, policy_version 3222 (0.0020) [2023-02-22 17:35:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 6610944. Throughput: 0: 920.6. Samples: 1653618. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0) [2023-02-22 17:35:06,687][15372] Avg episode reward: [(0, '27.986')] [2023-02-22 17:35:08,304][33578] Updated weights for policy 0, policy_version 3232 (0.0047) [2023-02-22 17:35:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 6631424. Throughput: 0: 941.1. Samples: 1656692. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:35:11,687][15372] Avg episode reward: [(0, '28.355')] [2023-02-22 17:35:12,772][33578] Updated weights for policy 0, policy_version 3242 (0.0020) [2023-02-22 17:35:16,685][15372] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 6656000. Throughput: 0: 978.4. Samples: 1663646. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:35:16,687][15372] Avg episode reward: [(0, '27.493')]
[2023-02-22 17:35:17,095][33578] Updated weights for policy 0, policy_version 3252 (0.0015)
[2023-02-22 17:35:21,686][15372] Fps is (10 sec: 4095.4, 60 sec: 3754.9, 300 sec: 3762.8). Total num frames: 6672384. Throughput: 0: 942.8. Samples: 1669050. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:35:21,688][15372] Avg episode reward: [(0, '27.261')]
[2023-02-22 17:35:23,792][33578] Updated weights for policy 0, policy_version 3262 (0.0025)
[2023-02-22 17:35:26,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 6688768. Throughput: 0: 929.5. Samples: 1671278. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:35:26,687][15372] Avg episode reward: [(0, '27.560')]
[2023-02-22 17:35:29,604][33578] Updated weights for policy 0, policy_version 3272 (0.0012)
[2023-02-22 17:35:31,684][15372] Fps is (10 sec: 3686.9, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 6709248. Throughput: 0: 957.5. Samples: 1676892. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:35:31,687][15372] Avg episode reward: [(0, '28.177')]
[2023-02-22 17:35:34,082][33578] Updated weights for policy 0, policy_version 3282 (0.0021)
[2023-02-22 17:35:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 6733824. Throughput: 0: 983.6. Samples: 1683942. Policy #0 lag: (min: 1.0, avg: 2.3, max: 3.0)
[2023-02-22 17:35:36,686][15372] Avg episode reward: [(0, '28.834')]
[2023-02-22 17:35:38,839][33578] Updated weights for policy 0, policy_version 3292 (0.0021)
[2023-02-22 17:35:41,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3823.2, 300 sec: 3762.8). Total num frames: 6750208. Throughput: 0: 964.4. Samples: 1686570. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:35:41,692][15372] Avg episode reward: [(0, '28.449')]
[2023-02-22 17:35:41,706][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003296_6750208.pth...
[2023-02-22 17:35:41,859][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002854_5844992.pth
[2023-02-22 17:35:46,480][33578] Updated weights for policy 0, policy_version 3302 (0.0011)
[2023-02-22 17:35:46,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3823.1, 300 sec: 3748.9). Total num frames: 6762496. Throughput: 0: 921.2. Samples: 1690622. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:35:46,687][15372] Avg episode reward: [(0, '28.723')]
[2023-02-22 17:35:51,686][15372] Fps is (10 sec: 2457.2, 60 sec: 3686.3, 300 sec: 3707.2). Total num frames: 6774784. Throughput: 0: 910.2. Samples: 1694580. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:35:51,690][15372] Avg episode reward: [(0, '28.904')]
[2023-02-22 17:35:53,933][33578] Updated weights for policy 0, policy_version 3312 (0.0042)
[2023-02-22 17:35:56,684][15372] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 6791168. Throughput: 0: 892.4. Samples: 1696850. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:35:56,690][15372] Avg episode reward: [(0, '30.088')]
[2023-02-22 17:35:58,901][33578] Updated weights for policy 0, policy_version 3322 (0.0019)
[2023-02-22 17:36:01,684][15372] Fps is (10 sec: 3687.1, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 6811648. Throughput: 0: 865.6. Samples: 1702600. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:36:01,692][15372] Avg episode reward: [(0, '29.960')]
[2023-02-22 17:36:06,019][33578] Updated weights for policy 0, policy_version 3332 (0.0049)
[2023-02-22 17:36:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 6823936. Throughput: 0: 843.5. Samples: 1707008. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:36:06,690][15372] Avg episode reward: [(0, '29.953')]
[2023-02-22 17:36:11,115][33578] Updated weights for policy 0, policy_version 3342 (0.0013)
[2023-02-22 17:36:11,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 6844416. Throughput: 0: 856.5. Samples: 1709822. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:36:11,691][15372] Avg episode reward: [(0, '30.040')]
[2023-02-22 17:36:15,425][33578] Updated weights for policy 0, policy_version 3352 (0.0011)
[2023-02-22 17:36:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 6868992. Throughput: 0: 889.7. Samples: 1716930. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:36:16,687][15372] Avg episode reward: [(0, '31.134')]
[2023-02-22 17:36:16,695][33564] Saving new best policy, reward=31.134!
[2023-02-22 17:36:20,525][33578] Updated weights for policy 0, policy_version 3362 (0.0042)
[2023-02-22 17:36:21,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3721.1). Total num frames: 6885376. Throughput: 0: 858.4. Samples: 1722570. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:36:21,690][15372] Avg episode reward: [(0, '31.252')]
[2023-02-22 17:36:21,710][33564] Saving new best policy, reward=31.252!
[2023-02-22 17:36:26,685][15372] Fps is (10 sec: 3276.5, 60 sec: 3549.8, 300 sec: 3721.1). Total num frames: 6901760. Throughput: 0: 848.6. Samples: 1724758. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:36:26,690][15372] Avg episode reward: [(0, '30.657')]
[2023-02-22 17:36:27,545][33578] Updated weights for policy 0, policy_version 3372 (0.0028)
[2023-02-22 17:36:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 6922240. Throughput: 0: 881.0. Samples: 1730266. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:36:31,687][15372] Avg episode reward: [(0, '28.469')]
[2023-02-22 17:36:32,099][33578] Updated weights for policy 0, policy_version 3382 (0.0011)
[2023-02-22 17:36:36,447][33578] Updated weights for policy 0, policy_version 3392 (0.0012)
[2023-02-22 17:36:36,684][15372] Fps is (10 sec: 4506.0, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 6946816. Throughput: 0: 951.4. Samples: 1737392. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:36:36,690][15372] Avg episode reward: [(0, '28.780')]
[2023-02-22 17:36:41,688][15372] Fps is (10 sec: 4094.5, 60 sec: 3549.7, 300 sec: 3735.0). Total num frames: 6963200. Throughput: 0: 966.1. Samples: 1740326. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:36:41,691][15372] Avg episode reward: [(0, '28.544')]
[2023-02-22 17:36:42,343][33578] Updated weights for policy 0, policy_version 3402 (0.0013)
[2023-02-22 17:36:46,687][15372] Fps is (10 sec: 3276.0, 60 sec: 3618.0, 300 sec: 3735.0). Total num frames: 6979584. Throughput: 0: 934.8. Samples: 1744670. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:36:46,689][15372] Avg episode reward: [(0, '28.189')]
[2023-02-22 17:36:48,776][33578] Updated weights for policy 0, policy_version 3412 (0.0025)
[2023-02-22 17:36:51,684][15372] Fps is (10 sec: 3687.7, 60 sec: 3754.8, 300 sec: 3721.1). Total num frames: 7000064. Throughput: 0: 975.4. Samples: 1750902. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:36:51,687][15372] Avg episode reward: [(0, '28.059')]
[2023-02-22 17:36:52,961][33578] Updated weights for policy 0, policy_version 3422 (0.0024)
[2023-02-22 17:36:56,684][15372] Fps is (10 sec: 4506.7, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 7024640. Throughput: 0: 993.6. Samples: 1754536. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:36:56,687][15372] Avg episode reward: [(0, '29.539')]
[2023-02-22 17:36:57,306][33578] Updated weights for policy 0, policy_version 3432 (0.0018)
[2023-02-22 17:37:01,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 7041024. Throughput: 0: 968.4. Samples: 1760510. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:37:01,687][15372] Avg episode reward: [(0, '29.973')]
[2023-02-22 17:37:03,678][33578] Updated weights for policy 0, policy_version 3442 (0.0020)
[2023-02-22 17:37:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 7057408. Throughput: 0: 942.2. Samples: 1764970. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:37:06,687][15372] Avg episode reward: [(0, '29.432')]
[2023-02-22 17:37:09,441][33578] Updated weights for policy 0, policy_version 3452 (0.0023)
[2023-02-22 17:37:11,684][15372] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 7077888. Throughput: 0: 962.3. Samples: 1768060. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:37:11,686][15372] Avg episode reward: [(0, '29.765')]
[2023-02-22 17:37:13,713][33578] Updated weights for policy 0, policy_version 3462 (0.0012)
[2023-02-22 17:37:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 7102464. Throughput: 0: 999.2. Samples: 1775230. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:37:16,687][15372] Avg episode reward: [(0, '29.884')]
[2023-02-22 17:37:18,333][33578] Updated weights for policy 0, policy_version 3472 (0.0015)
[2023-02-22 17:37:21,686][15372] Fps is (10 sec: 4095.2, 60 sec: 3891.1, 300 sec: 3748.9). Total num frames: 7118848. Throughput: 0: 958.4. Samples: 1780522. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:37:21,693][15372] Avg episode reward: [(0, '28.966')]
[2023-02-22 17:37:25,199][33578] Updated weights for policy 0, policy_version 3482 (0.0031)
[2023-02-22 17:37:26,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 7135232. Throughput: 0: 942.6. Samples: 1782740. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0)
[2023-02-22 17:37:26,690][15372] Avg episode reward: [(0, '29.578')]
[2023-02-22 17:37:30,394][33578] Updated weights for policy 0, policy_version 3492 (0.0022)
[2023-02-22 17:37:31,690][15372] Fps is (10 sec: 3685.0, 60 sec: 3890.8, 300 sec: 3721.0). Total num frames: 7155712. Throughput: 0: 973.7. Samples: 1788488. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:37:31,695][15372] Avg episode reward: [(0, '29.079')]
[2023-02-22 17:37:34,843][33578] Updated weights for policy 0, policy_version 3502 (0.0016)
[2023-02-22 17:37:36,685][15372] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 7180288. Throughput: 0: 992.6. Samples: 1795570. Policy #0 lag: (min: 0.0, avg: 1.2, max: 2.0)
[2023-02-22 17:37:36,693][15372] Avg episode reward: [(0, '30.847')]
[2023-02-22 17:37:40,393][33578] Updated weights for policy 0, policy_version 3512 (0.0025)
[2023-02-22 17:37:41,688][15372] Fps is (10 sec: 3687.2, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 7192576. Throughput: 0: 968.9. Samples: 1798140. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:37:41,691][15372] Avg episode reward: [(0, '29.890')]
[2023-02-22 17:37:41,807][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003514_7196672.pth...
[2023-02-22 17:37:41,956][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003072_6291456.pth
[2023-02-22 17:37:46,684][15372] Fps is (10 sec: 2867.3, 60 sec: 3823.1, 300 sec: 3735.0). Total num frames: 7208960. Throughput: 0: 932.0. Samples: 1802452. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:37:46,692][15372] Avg episode reward: [(0, '29.233')]
[2023-02-22 17:37:47,436][33578] Updated weights for policy 0, policy_version 3522 (0.0042)
[2023-02-22 17:37:51,684][15372] Fps is (10 sec: 3687.7, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 7229440. Throughput: 0: 970.6. Samples: 1808646. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:37:51,689][15372] Avg episode reward: [(0, '29.832')]
[2023-02-22 17:37:51,800][33578] Updated weights for policy 0, policy_version 3532 (0.0014)
[2023-02-22 17:37:56,130][33578] Updated weights for policy 0, policy_version 3542 (0.0019)
[2023-02-22 17:37:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 7254016. Throughput: 0: 980.3. Samples: 1812174. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:37:56,688][15372] Avg episode reward: [(0, '28.656')]
[2023-02-22 17:38:01,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 7270400. Throughput: 0: 944.5. Samples: 1817732. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:38:01,687][15372] Avg episode reward: [(0, '30.488')]
[2023-02-22 17:38:02,502][33578] Updated weights for policy 0, policy_version 3552 (0.0024)
[2023-02-22 17:38:06,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 7286784. Throughput: 0: 924.0. Samples: 1822102. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:38:06,687][15372] Avg episode reward: [(0, '31.378')]
[2023-02-22 17:38:06,690][33564] Saving new best policy, reward=31.378!
[2023-02-22 17:38:08,530][33578] Updated weights for policy 0, policy_version 3562 (0.0013)
[2023-02-22 17:38:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 7307264. Throughput: 0: 944.6. Samples: 1825246. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:38:11,687][15372] Avg episode reward: [(0, '31.547')]
[2023-02-22 17:38:11,702][33564] Saving new best policy, reward=31.547!
[2023-02-22 17:38:13,076][33578] Updated weights for policy 0, policy_version 3572 (0.0011)
[2023-02-22 17:38:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 7331840. Throughput: 0: 972.2. Samples: 1832232. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:38:16,687][15372] Avg episode reward: [(0, '32.192')]
[2023-02-22 17:38:16,696][33564] Saving new best policy, reward=32.192!
[2023-02-22 17:38:18,097][33578] Updated weights for policy 0, policy_version 3582 (0.0031)
[2023-02-22 17:38:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3776.7). Total num frames: 7344128. Throughput: 0: 924.4. Samples: 1837170. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:38:21,691][15372] Avg episode reward: [(0, '31.513')]
[2023-02-22 17:38:25,132][33578] Updated weights for policy 0, policy_version 3592 (0.0030)
[2023-02-22 17:38:26,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 7360512. Throughput: 0: 915.5. Samples: 1839334. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:38:26,687][15372] Avg episode reward: [(0, '30.595')]
[2023-02-22 17:38:30,141][33578] Updated weights for policy 0, policy_version 3602 (0.0030)
[2023-02-22 17:38:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3755.0, 300 sec: 3748.9). Total num frames: 7380992. Throughput: 0: 952.0. Samples: 1845294. Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0)
[2023-02-22 17:38:31,693][15372] Avg episode reward: [(0, '31.536')]
[2023-02-22 17:38:34,531][33578] Updated weights for policy 0, policy_version 3612 (0.0018)
[2023-02-22 17:38:36,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 7405568. Throughput: 0: 968.6. Samples: 1852234. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:38:36,689][15372] Avg episode reward: [(0, '30.267')]
[2023-02-22 17:38:40,478][33578] Updated weights for policy 0, policy_version 3622 (0.0017)
[2023-02-22 17:38:41,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3776.6). Total num frames: 7417856. Throughput: 0: 938.8. Samples: 1854420. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:38:41,693][15372] Avg episode reward: [(0, '29.103')]
[2023-02-22 17:38:46,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 7434240. Throughput: 0: 910.9. Samples: 1858722. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:38:46,686][15372] Avg episode reward: [(0, '29.902')]
[2023-02-22 17:38:47,277][33578] Updated weights for policy 0, policy_version 3632 (0.0023)
[2023-02-22 17:38:51,614][33578] Updated weights for policy 0, policy_version 3642 (0.0013)
[2023-02-22 17:38:51,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 7458816. Throughput: 0: 958.9. Samples: 1865252. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:38:51,688][15372] Avg episode reward: [(0, '28.431')]
[2023-02-22 17:38:56,091][33578] Updated weights for policy 0, policy_version 3652 (0.0013)
[2023-02-22 17:38:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 7479296. Throughput: 0: 964.0. Samples: 1868624. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:38:56,687][15372] Avg episode reward: [(0, '28.510')]
[2023-02-22 17:39:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 7495680. Throughput: 0: 926.0. Samples: 1873900. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:39:01,686][15372] Avg episode reward: [(0, '27.928')]
[2023-02-22 17:39:02,780][33578] Updated weights for policy 0, policy_version 3662 (0.0019)
[2023-02-22 17:39:06,685][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 7512064. Throughput: 0: 917.0. Samples: 1878436. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:39:06,687][15372] Avg episode reward: [(0, '28.052')]
[2023-02-22 17:39:08,460][33578] Updated weights for policy 0, policy_version 3672 (0.0021)
[2023-02-22 17:39:11,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 7532544. Throughput: 0: 944.9. Samples: 1881856. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:39:11,694][15372] Avg episode reward: [(0, '29.758')]
[2023-02-22 17:39:12,870][33578] Updated weights for policy 0, policy_version 3682 (0.0015)
[2023-02-22 17:39:16,688][15372] Fps is (10 sec: 4504.2, 60 sec: 3754.5, 300 sec: 3762.8). Total num frames: 7557120. Throughput: 0: 966.3. Samples: 1888782. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:39:16,694][15372] Avg episode reward: [(0, '28.815')]
[2023-02-22 17:39:17,698][33578] Updated weights for policy 0, policy_version 3692 (0.0013)
[2023-02-22 17:39:21,685][15372] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 7569408. Throughput: 0: 917.4. Samples: 1893518. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:39:21,688][15372] Avg episode reward: [(0, '29.524')]
[2023-02-22 17:39:25,076][33578] Updated weights for policy 0, policy_version 3702 (0.0027)
[2023-02-22 17:39:26,684][15372] Fps is (10 sec: 2868.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 7585792. Throughput: 0: 914.0. Samples: 1895552. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:39:26,694][15372] Avg episode reward: [(0, '29.480')]
[2023-02-22 17:39:29,918][33578] Updated weights for policy 0, policy_version 3712 (0.0011)
[2023-02-22 17:39:31,684][15372] Fps is (10 sec: 3686.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 7606272. Throughput: 0: 953.3. Samples: 1901620. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-22 17:39:31,692][15372] Avg episode reward: [(0, '30.727')]
[2023-02-22 17:39:34,620][33578] Updated weights for policy 0, policy_version 3722 (0.0025)
[2023-02-22 17:39:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 7626752. Throughput: 0: 951.2. Samples: 1908056. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:39:36,691][15372] Avg episode reward: [(0, '31.008')]
[2023-02-22 17:39:41,330][33578] Updated weights for policy 0, policy_version 3732 (0.0026)
[2023-02-22 17:39:41,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 7643136. Throughput: 0: 922.1. Samples: 1910118. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:39:41,690][15372] Avg episode reward: [(0, '31.770')]
[2023-02-22 17:39:41,706][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003732_7643136.pth...
[2023-02-22 17:39:41,841][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003296_6750208.pth
[2023-02-22 17:39:46,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 7655424. Throughput: 0: 898.7. Samples: 1914342. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:39:46,687][15372] Avg episode reward: [(0, '30.180')]
[2023-02-22 17:39:47,574][33578] Updated weights for policy 0, policy_version 3742 (0.0055)
[2023-02-22 17:39:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 7680000. Throughput: 0: 944.8. Samples: 1920952. Policy #0 lag: (min: 1.0, avg: 2.3, max: 3.0)
[2023-02-22 17:39:51,690][15372] Avg episode reward: [(0, '30.262')]
[2023-02-22 17:39:52,152][33578] Updated weights for policy 0, policy_version 3752 (0.0021)
[2023-02-22 17:39:56,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 7700480. Throughput: 0: 945.1. Samples: 1924384. Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0)
[2023-02-22 17:39:56,688][15372] Avg episode reward: [(0, '32.127')]
[2023-02-22 17:39:57,039][33578] Updated weights for policy 0, policy_version 3762 (0.0011)
[2023-02-22 17:40:01,686][15372] Fps is (10 sec: 3685.9, 60 sec: 3686.3, 300 sec: 3748.9). Total num frames: 7716864. Throughput: 0: 893.8. Samples: 1929002. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:40:01,693][15372] Avg episode reward: [(0, '32.836')]
[2023-02-22 17:40:01,706][33564] Saving new best policy, reward=32.836!
[2023-02-22 17:40:04,647][33578] Updated weights for policy 0, policy_version 3772 (0.0015)
[2023-02-22 17:40:06,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 7729152. Throughput: 0: 890.2. Samples: 1933576. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:40:06,688][15372] Avg episode reward: [(0, '33.537')]
[2023-02-22 17:40:06,842][33564] Saving new best policy, reward=33.537!
[2023-02-22 17:40:09,589][33578] Updated weights for policy 0, policy_version 3782 (0.0015)
[2023-02-22 17:40:11,685][15372] Fps is (10 sec: 3686.6, 60 sec: 3686.3, 300 sec: 3721.1). Total num frames: 7753728. Throughput: 0: 917.9. Samples: 1936860. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:40:11,688][15372] Avg episode reward: [(0, '32.840')]
[2023-02-22 17:40:14,197][33578] Updated weights for policy 0, policy_version 3792 (0.0028)
[2023-02-22 17:40:16,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3735.0). Total num frames: 7774208. Throughput: 0: 934.0. Samples: 1943652. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:40:16,693][15372] Avg episode reward: [(0, '31.462')]
[2023-02-22 17:40:20,693][33578] Updated weights for policy 0, policy_version 3802 (0.0031)
[2023-02-22 17:40:21,691][15372] Fps is (10 sec: 3274.9, 60 sec: 3617.8, 300 sec: 3721.0). Total num frames: 7786496. Throughput: 0: 883.5. Samples: 1947818. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:40:21,699][15372] Avg episode reward: [(0, '32.304')]
[2023-02-22 17:40:26,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 7802880. Throughput: 0: 885.0. Samples: 1949944. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:40:26,687][15372] Avg episode reward: [(0, '31.526')]
[2023-02-22 17:40:26,926][33578] Updated weights for policy 0, policy_version 3812 (0.0027)
[2023-02-22 17:40:31,655][33578] Updated weights for policy 0, policy_version 3822 (0.0012)
[2023-02-22 17:40:31,684][15372] Fps is (10 sec: 4098.7, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 7827456. Throughput: 0: 933.6. Samples: 1956354. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:40:31,688][15372] Avg episode reward: [(0, '30.238')]
[2023-02-22 17:40:36,688][15372] Fps is (10 sec: 4094.5, 60 sec: 3617.9, 300 sec: 3707.2). Total num frames: 7843840. Throughput: 0: 915.5. Samples: 1962154. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:40:36,696][15372] Avg episode reward: [(0, '29.810')]
[2023-02-22 17:40:37,177][33578] Updated weights for policy 0, policy_version 3832 (0.0016)
[2023-02-22 17:40:41,691][15372] Fps is (10 sec: 3274.7, 60 sec: 3617.7, 300 sec: 3721.0). Total num frames: 7860224. Throughput: 0: 884.4. Samples: 1964188. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:40:41,698][15372] Avg episode reward: [(0, '29.415')]
[2023-02-22 17:40:44,446][33578] Updated weights for policy 0, policy_version 3842 (0.0021)
[2023-02-22 17:40:46,684][15372] Fps is (10 sec: 3278.0, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 7876608. Throughput: 0: 882.1. Samples: 1968694. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0)
[2023-02-22 17:40:46,686][15372] Avg episode reward: [(0, '29.255')]
[2023-02-22 17:40:49,164][33578] Updated weights for policy 0, policy_version 3852 (0.0014)
[2023-02-22 17:40:51,684][15372] Fps is (10 sec: 3688.8, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 7897088. Throughput: 0: 931.0. Samples: 1975472. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0)
[2023-02-22 17:40:51,694][15372] Avg episode reward: [(0, '29.168')]
[2023-02-22 17:40:53,626][33578] Updated weights for policy 0, policy_version 3862 (0.0014)
[2023-02-22 17:40:56,691][15372] Fps is (10 sec: 4093.2, 60 sec: 3617.7, 300 sec: 3748.8). Total num frames: 7917568. Throughput: 0: 934.4. Samples: 1978912. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:40:56,699][15372] Avg episode reward: [(0, '28.368')]
[2023-02-22 17:41:00,301][33578] Updated weights for policy 0, policy_version 3872 (0.0016)
[2023-02-22 17:41:01,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3748.9). Total num frames: 7929856. Throughput: 0: 877.3. Samples: 1983132. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:41:01,693][15372] Avg episode reward: [(0, '28.794')]
[2023-02-22 17:41:06,284][33578] Updated weights for policy 0, policy_version 3882 (0.0022)
[2023-02-22 17:41:06,684][15372] Fps is (10 sec: 3279.1, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 7950336. Throughput: 0: 901.2. Samples: 1988368. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:41:06,687][15372] Avg episode reward: [(0, '28.594')]
[2023-02-22 17:41:10,840][33578] Updated weights for policy 0, policy_version 3892 (0.0014)
[2023-02-22 17:41:11,685][15372] Fps is (10 sec: 4505.5, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 7974912. Throughput: 0: 930.1. Samples: 1991800. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:41:11,688][15372] Avg episode reward: [(0, '28.655')]
[2023-02-22 17:41:15,669][33578] Updated weights for policy 0, policy_version 3902 (0.0020)
[2023-02-22 17:41:16,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 7991296. Throughput: 0: 934.9. Samples: 1998426. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:41:16,688][15372] Avg episode reward: [(0, '29.342')]
[2023-02-22 17:41:21,688][15372] Fps is (10 sec: 3275.8, 60 sec: 3686.6, 300 sec: 3748.8). Total num frames: 8007680. Throughput: 0: 900.9. Samples: 2002696. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:41:21,691][15372] Avg episode reward: [(0, '29.145')]
[2023-02-22 17:41:22,908][33578] Updated weights for policy 0, policy_version 3912 (0.0026)
[2023-02-22 17:41:26,689][15372] Fps is (10 sec: 2865.9, 60 sec: 3617.9, 300 sec: 3721.1). Total num frames: 8019968. Throughput: 0: 894.4. Samples: 2004432. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:41:26,692][15372] Avg episode reward: [(0, '28.954')]
[2023-02-22 17:41:30,213][33578] Updated weights for policy 0, policy_version 3922 (0.0043)
[2023-02-22 17:41:31,684][15372] Fps is (10 sec: 2868.2, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 8036352. Throughput: 0: 889.6. Samples: 2008724. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0)
[2023-02-22 17:41:31,691][15372] Avg episode reward: [(0, '29.869')]
[2023-02-22 17:41:35,471][33578] Updated weights for policy 0, policy_version 3932 (0.0013)
[2023-02-22 17:41:36,685][15372] Fps is (10 sec: 3278.2, 60 sec: 3481.8, 300 sec: 3693.4). Total num frames: 8052736. Throughput: 0: 867.2. Samples: 2014498. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0)
[2023-02-22 17:41:36,686][15372] Avg episode reward: [(0, '31.917')]
[2023-02-22 17:41:41,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3482.0, 300 sec: 3693.4). Total num frames: 8069120. Throughput: 0: 840.2. Samples: 2016714. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:41:41,689][15372] Avg episode reward: [(0, '31.814')]
[2023-02-22 17:41:41,702][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003940_8069120.pth...
[2023-02-22 17:41:41,862][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003514_7196672.pth
[2023-02-22 17:41:42,554][33578] Updated weights for policy 0, policy_version 3942 (0.0036)
[2023-02-22 17:41:46,687][15372] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3679.4). Total num frames: 8085504. Throughput: 0: 851.2. Samples: 2021436. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:41:46,699][15372] Avg episode reward: [(0, '31.491')]
[2023-02-22 17:41:47,687][33578] Updated weights for policy 0, policy_version 3952 (0.0025)
[2023-02-22 17:41:51,684][15372] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 8110080. Throughput: 0: 892.8. Samples: 2028544. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:41:51,689][15372] Avg episode reward: [(0, '29.681')]
[2023-02-22 17:41:51,949][33578] Updated weights for policy 0, policy_version 3962 (0.0022)
[2023-02-22 17:41:56,684][15372] Fps is (10 sec: 4506.7, 60 sec: 3550.3, 300 sec: 3693.3). Total num frames: 8130560. Throughput: 0: 896.0. Samples: 2032120. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:41:56,689][15372] Avg episode reward: [(0, '29.040')]
[2023-02-22 17:41:57,428][33578] Updated weights for policy 0, policy_version 3972 (0.0011)
[2023-02-22 17:42:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 8146944. Throughput: 0: 847.4. Samples: 2036558. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:42:01,694][15372] Avg episode reward: [(0, '29.567')]
[2023-02-22 17:42:04,128][33578] Updated weights for policy 0, policy_version 3982 (0.0024)
[2023-02-22 17:42:06,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3679.5). Total num frames: 8163328. Throughput: 0: 874.5. Samples: 2042046. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:42:06,687][15372] Avg episode reward: [(0, '28.745')]
[2023-02-22 17:42:08,576][33578] Updated weights for policy 0, policy_version 3992 (0.0018)
[2023-02-22 17:42:11,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 8187904. Throughput: 0: 913.5. Samples: 2045534. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:42:11,694][15372] Avg episode reward: [(0, '27.217')]
[2023-02-22 17:42:12,929][33578] Updated weights for policy 0, policy_version 4002 (0.0020)
[2023-02-22 17:42:16,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3693.4). Total num frames: 8208384. Throughput: 0: 967.4. Samples: 2052256. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:42:16,693][15372] Avg episode reward: [(0, '26.854')]
[2023-02-22 17:42:18,715][33578] Updated weights for policy 0, policy_version 4012 (0.0011)
[2023-02-22 17:42:21,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3693.3). Total num frames: 8224768. Throughput: 0: 938.1. Samples: 2056714. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:42:21,692][15372] Avg episode reward: [(0, '26.761')]
[2023-02-22 17:42:24,990][33578] Updated weights for policy 0, policy_version 4022 (0.0030)
[2023-02-22 17:42:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3755.0, 300 sec: 3693.4). Total num frames: 8245248. Throughput: 0: 943.0. Samples: 2059150. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:42:26,686][15372] Avg episode reward: [(0, '27.341')]
[2023-02-22 17:42:29,540][33578] Updated weights for policy 0, policy_version 4032 (0.0025)
[2023-02-22 17:42:31,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 8265728. Throughput: 0: 997.2. Samples: 2066306. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:42:31,687][15372] Avg episode reward: [(0, '27.371')]
[2023-02-22 17:42:33,784][33578] Updated weights for policy 0, policy_version 4042 (0.0015)
[2023-02-22 17:42:36,686][15372] Fps is (10 sec: 4095.4, 60 sec: 3891.1, 300 sec: 3707.3). Total num frames: 8286208. Throughput: 0: 975.0. Samples: 2072422. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:42:36,691][15372] Avg episode reward: [(0, '27.713')]
[2023-02-22 17:42:40,244][33578] Updated weights for policy 0, policy_version 4052 (0.0033)
[2023-02-22 17:42:41,686][15372] Fps is (10 sec: 3276.3, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 8298496. Throughput: 0: 944.7. Samples: 2074632. Policy #0 lag: (min: 1.0, avg: 2.5, max: 4.0)
[2023-02-22 17:42:41,693][15372] Avg episode reward: [(0, '26.542')]
[2023-02-22 17:42:46,175][33578] Updated weights for policy 0, policy_version 4062 (0.0032)
[2023-02-22 17:42:46,684][15372] Fps is (10 sec: 3277.3, 60 sec: 3891.4, 300 sec: 3693.3). Total num frames: 8318976. Throughput: 0: 955.1. Samples: 2079538. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:42:46,693][15372] Avg episode reward: [(0, '27.937')]
[2023-02-22 17:42:50,508][33578] Updated weights for policy 0, policy_version 4072 (0.0012)
[2023-02-22 17:42:51,684][15372] Fps is (10 sec: 4506.3, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 8343552. Throughput: 0: 988.9. Samples: 2086546. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:42:51,691][15372] Avg episode reward: [(0, '28.784')]
[2023-02-22 17:42:55,280][33578] Updated weights for policy 0, policy_version 4082 (0.0019)
[2023-02-22 17:42:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 8359936. Throughput: 0: 987.4. Samples: 2089966. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:42:56,690][15372] Avg episode reward: [(0, '29.219')]
[2023-02-22 17:43:01,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 8376320. Throughput: 0: 934.3. Samples: 2094302. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:43:01,695][15372] Avg episode reward: [(0, '27.609')]
[2023-02-22 17:43:02,368][33578] Updated weights for policy 0, policy_version 4092 (0.0034)
[2023-02-22 17:43:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 8396800. Throughput: 0: 959.6. Samples: 2099898. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:43:06,687][15372] Avg episode reward: [(0, '28.849')]
[2023-02-22 17:43:07,209][33578] Updated weights for policy 0, policy_version 4102 (0.0027)
[2023-02-22 17:43:11,485][33578] Updated weights for policy 0, policy_version 4112 (0.0012)
[2023-02-22 17:43:11,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 8421376. Throughput: 0: 986.2. Samples: 2103528. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:43:11,687][15372] Avg episode reward: [(0, '30.866')]
[2023-02-22 17:43:16,686][15372] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3707.2). Total num frames: 8437760. Throughput: 0: 966.8. Samples: 2109812. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:43:16,693][15372] Avg episode reward: [(0, '30.191')]
[2023-02-22 17:43:17,186][33578] Updated weights for policy 0, policy_version 4122 (0.0015)
[2023-02-22 17:43:21,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 8454144. Throughput: 0: 929.6. Samples: 2114254. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:43:21,689][15372] Avg episode reward: [(0, '29.803')]
[2023-02-22 17:43:23,837][33578] Updated weights for policy 0, policy_version 4132 (0.0052)
[2023-02-22 17:43:26,684][15372] Fps is (10 sec: 3687.0, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 8474624. Throughput: 0: 941.4. Samples: 2116992. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:43:26,696][15372] Avg episode reward: [(0, '29.029')]
[2023-02-22 17:43:28,131][33578] Updated weights for policy 0, policy_version 4142 (0.0025)
[2023-02-22 17:43:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 8499200. Throughput: 0: 992.4. Samples: 2124198. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:43:31,692][15372] Avg episode reward: [(0, '30.166')]
[2023-02-22 17:43:32,362][33578] Updated weights for policy 0, policy_version 4152 (0.0014)
[2023-02-22 17:43:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3721.1). Total num frames: 8515584. Throughput: 0: 964.0. Samples: 2129928. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:43:36,686][15372] Avg episode reward: [(0, '29.101')]
[2023-02-22 17:43:38,654][33578] Updated weights for policy 0, policy_version 4162 (0.0012)
[2023-02-22 17:43:41,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.3, 300 sec: 3721.1). Total num frames: 8531968. Throughput: 0: 935.7. Samples: 2132074. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:43:41,690][15372] Avg episode reward: [(0, '28.555')]
[2023-02-22 17:43:41,708][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004166_8531968.pth...
[2023-02-22 17:43:41,828][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003732_7643136.pth
[2023-02-22 17:43:44,628][33578] Updated weights for policy 0, policy_version 4172 (0.0028)
[2023-02-22 17:43:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 8552448. Throughput: 0: 961.7. Samples: 2137578. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:43:46,687][15372] Avg episode reward: [(0, '30.050')]
[2023-02-22 17:43:49,181][33578] Updated weights for policy 0, policy_version 4182 (0.0016)
[2023-02-22 17:43:51,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 8572928. Throughput: 0: 986.7. Samples: 2144298. Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0)
[2023-02-22 17:43:51,688][15372] Avg episode reward: [(0, '30.061')]
[2023-02-22 17:43:53,968][33578] Updated weights for policy 0, policy_version 4192 (0.0013)
[2023-02-22 17:43:56,690][15372] Fps is (10 sec: 3684.4, 60 sec: 3822.6, 300 sec: 3707.2). Total num frames: 8589312. Throughput: 0: 971.1. Samples: 2147232. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:43:56,698][15372] Avg episode reward: [(0, '29.425')]
[2023-02-22 17:44:00,768][33578] Updated weights for policy 0, policy_version 4202 (0.0015)
[2023-02-22 17:44:01,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3707.2). Total num frames: 8605696. Throughput: 0: 931.4. Samples: 2151724. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:44:01,690][15372] Avg episode reward: [(0, '30.046')]
[2023-02-22 17:44:06,047][33578] Updated weights for policy 0, policy_version 4212 (0.0036)
[2023-02-22 17:44:06,684][15372] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 8626176. Throughput: 0: 962.1. Samples: 2157550. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:44:06,687][15372] Avg episode reward: [(0, '30.337')]
[2023-02-22 17:44:10,629][33578] Updated weights for policy 0, policy_version 4222 (0.0017)
[2023-02-22 17:44:11,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 8650752. Throughput: 0: 976.8. Samples: 2160950. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:44:11,690][15372] Avg episode reward: [(0, '30.828')]
[2023-02-22 17:44:16,645][33578] Updated weights for policy 0, policy_version 4232 (0.0011)
[2023-02-22 17:44:16,688][15372] Fps is (10 sec: 4094.6, 60 sec: 3822.8, 300 sec: 3721.1). Total num frames: 8667136. Throughput: 0: 939.4. Samples: 2166476. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:44:16,690][15372] Avg episode reward: [(0, '29.795')]
[2023-02-22 17:44:21,685][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 8679424. Throughput: 0: 906.4. Samples: 2170714.
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:44:21,690][15372] Avg episode reward: [(0, '29.872')] [2023-02-22 17:44:23,491][33578] Updated weights for policy 0, policy_version 4242 (0.0017) [2023-02-22 17:44:26,684][15372] Fps is (10 sec: 3278.0, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 8699904. Throughput: 0: 922.2. Samples: 2173572. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:44:26,696][15372] Avg episode reward: [(0, '30.376')] [2023-02-22 17:44:28,069][33578] Updated weights for policy 0, policy_version 4252 (0.0017) [2023-02-22 17:44:31,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 8720384. Throughput: 0: 950.1. Samples: 2180332. Policy #0 lag: (min: 1.0, avg: 2.5, max: 5.0) [2023-02-22 17:44:31,694][15372] Avg episode reward: [(0, '30.121')] [2023-02-22 17:44:32,810][33578] Updated weights for policy 0, policy_version 4262 (0.0022) [2023-02-22 17:44:36,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 8736768. Throughput: 0: 917.3. Samples: 2185576. Policy #0 lag: (min: 1.0, avg: 2.5, max: 5.0) [2023-02-22 17:44:36,692][15372] Avg episode reward: [(0, '29.758')] [2023-02-22 17:44:39,825][33578] Updated weights for policy 0, policy_version 4272 (0.0029) [2023-02-22 17:44:41,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 8753152. Throughput: 0: 897.8. Samples: 2187626. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0) [2023-02-22 17:44:41,690][15372] Avg episode reward: [(0, '29.488')] [2023-02-22 17:44:45,788][33578] Updated weights for policy 0, policy_version 4282 (0.0025) [2023-02-22 17:44:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 8769536. Throughput: 0: 910.8. Samples: 2192712. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:44:46,691][15372] Avg episode reward: [(0, '29.168')] [2023-02-22 17:44:50,232][33578] Updated weights for policy 0, policy_version 4292 (0.0015) [2023-02-22 17:44:51,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 8794112. Throughput: 0: 931.2. Samples: 2199452. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:44:51,690][15372] Avg episode reward: [(0, '30.293')] [2023-02-22 17:44:55,927][33578] Updated weights for policy 0, policy_version 4302 (0.0019) [2023-02-22 17:44:56,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.7, 300 sec: 3707.2). Total num frames: 8810496. Throughput: 0: 915.6. Samples: 2202152. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:44:56,688][15372] Avg episode reward: [(0, '30.077')] [2023-02-22 17:45:01,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 8826880. Throughput: 0: 889.4. Samples: 2206498. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:45:01,692][15372] Avg episode reward: [(0, '30.042')] [2023-02-22 17:45:02,874][33578] Updated weights for policy 0, policy_version 4312 (0.0014) [2023-02-22 17:45:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 8847360. Throughput: 0: 932.7. Samples: 2212686. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0) [2023-02-22 17:45:06,689][15372] Avg episode reward: [(0, '31.174')] [2023-02-22 17:45:07,149][33564] Early stopping after 2 epochs (2 sgd steps), loss delta 0.0000005 [2023-02-22 17:45:07,156][33578] Updated weights for policy 0, policy_version 4322 (0.0013) [2023-02-22 17:45:11,500][33578] Updated weights for policy 0, policy_version 4332 (0.0016) [2023-02-22 17:45:11,684][15372] Fps is (10 sec: 4505.9, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 8871936. Throughput: 0: 949.0. Samples: 2216276. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0) [2023-02-22 17:45:11,691][15372] Avg episode reward: [(0, '31.039')] [2023-02-22 17:45:16,686][15372] Fps is (10 sec: 4095.2, 60 sec: 3686.5, 300 sec: 3735.1). Total num frames: 8888320. Throughput: 0: 921.4. Samples: 2221798. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:45:16,689][15372] Avg episode reward: [(0, '30.586')] [2023-02-22 17:45:17,939][33578] Updated weights for policy 0, policy_version 4342 (0.0031) [2023-02-22 17:45:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 8900608. Throughput: 0: 903.3. Samples: 2226224. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0) [2023-02-22 17:45:21,687][15372] Avg episode reward: [(0, '31.281')] [2023-02-22 17:45:24,141][33578] Updated weights for policy 0, policy_version 4352 (0.0019) [2023-02-22 17:45:26,684][15372] Fps is (10 sec: 3277.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 8921088. Throughput: 0: 921.8. Samples: 2229108. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 17:45:26,692][15372] Avg episode reward: [(0, '30.197')] [2023-02-22 17:45:28,614][33578] Updated weights for policy 0, policy_version 4362 (0.0017) [2023-02-22 17:45:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 8945664. Throughput: 0: 962.5. Samples: 2236024. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-22 17:45:31,687][15372] Avg episode reward: [(0, '29.516')] [2023-02-22 17:45:33,648][33578] Updated weights for policy 0, policy_version 4372 (0.0014) [2023-02-22 17:45:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3735.1). Total num frames: 8962048. Throughput: 0: 924.7. Samples: 2241064. 
Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:45:36,688][15372] Avg episode reward: [(0, '29.544')] [2023-02-22 17:45:40,866][33578] Updated weights for policy 0, policy_version 4382 (0.0027) [2023-02-22 17:45:41,685][15372] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 8974336. Throughput: 0: 912.2. Samples: 2243200. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:45:41,690][15372] Avg episode reward: [(0, '28.280')] [2023-02-22 17:45:41,701][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004382_8974336.pth... [2023-02-22 17:45:41,901][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003940_8069120.pth [2023-02-22 17:45:45,809][33578] Updated weights for policy 0, policy_version 4392 (0.0024) [2023-02-22 17:45:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 8998912. Throughput: 0: 942.0. Samples: 2248888. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:45:46,687][15372] Avg episode reward: [(0, '28.666')] [2023-02-22 17:45:50,644][33578] Updated weights for policy 0, policy_version 4402 (0.0012) [2023-02-22 17:45:51,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3735.1). Total num frames: 9019392. Throughput: 0: 950.4. Samples: 2255452. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:45:51,687][15372] Avg episode reward: [(0, '28.637')] [2023-02-22 17:45:56,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 9031680. Throughput: 0: 921.6. Samples: 2257750. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:45:56,689][15372] Avg episode reward: [(0, '28.470')] [2023-02-22 17:45:57,041][33578] Updated weights for policy 0, policy_version 4412 (0.0025) [2023-02-22 17:46:01,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 9048064. Throughput: 0: 892.1. Samples: 2261942. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:46:01,687][15372] Avg episode reward: [(0, '28.516')] [2023-02-22 17:46:03,403][33578] Updated weights for policy 0, policy_version 4422 (0.0035) [2023-02-22 17:46:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 9068544. Throughput: 0: 933.1. Samples: 2268214. Policy #0 lag: (min: 1.0, avg: 2.1, max: 4.0) [2023-02-22 17:46:06,687][15372] Avg episode reward: [(0, '27.820')] [2023-02-22 17:46:07,799][33578] Updated weights for policy 0, policy_version 4432 (0.0013) [2023-02-22 17:46:11,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 9093120. Throughput: 0: 942.5. Samples: 2271520. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:46:11,687][15372] Avg episode reward: [(0, '27.007')] [2023-02-22 17:46:12,640][33578] Updated weights for policy 0, policy_version 4442 (0.0013) [2023-02-22 17:46:16,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3721.2). Total num frames: 9105408. Throughput: 0: 907.4. Samples: 2276858. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:46:16,700][15372] Avg episode reward: [(0, '26.202')] [2023-02-22 17:46:19,646][33578] Updated weights for policy 0, policy_version 4452 (0.0014) [2023-02-22 17:46:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3735.1). Total num frames: 9121792. Throughput: 0: 893.6. Samples: 2281274. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:46:21,687][15372] Avg episode reward: [(0, '26.684')] [2023-02-22 17:46:24,791][33578] Updated weights for policy 0, policy_version 4462 (0.0027) [2023-02-22 17:46:26,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 9146368. Throughput: 0: 922.9. Samples: 2284730. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:46:26,693][15372] Avg episode reward: [(0, '27.435')] [2023-02-22 17:46:29,289][33578] Updated weights for policy 0, policy_version 4472 (0.0013) [2023-02-22 17:46:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 9166848. Throughput: 0: 947.9. Samples: 2291542. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:46:31,688][15372] Avg episode reward: [(0, '27.240')] [2023-02-22 17:46:35,008][33578] Updated weights for policy 0, policy_version 4482 (0.0011) [2023-02-22 17:46:36,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 9183232. Throughput: 0: 907.1. Samples: 2296270. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-22 17:46:36,687][15372] Avg episode reward: [(0, '27.966')] [2023-02-22 17:46:41,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 9195520. Throughput: 0: 904.8. Samples: 2298464. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:46:41,686][15372] Avg episode reward: [(0, '28.708')] [2023-02-22 17:46:41,889][33578] Updated weights for policy 0, policy_version 4492 (0.0026) [2023-02-22 17:46:46,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 9216000. Throughput: 0: 929.1. Samples: 2303752. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:46:46,689][15372] Avg episode reward: [(0, '29.769')] [2023-02-22 17:46:48,085][33578] Updated weights for policy 0, policy_version 4502 (0.0029) [2023-02-22 17:46:51,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 9228288. Throughput: 0: 882.4. Samples: 2307924. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:46:51,691][15372] Avg episode reward: [(0, '30.182')] [2023-02-22 17:46:56,124][33578] Updated weights for policy 0, policy_version 4512 (0.0041) [2023-02-22 17:46:56,684][15372] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 9240576. Throughput: 0: 845.8. Samples: 2309582. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 17:46:56,690][15372] Avg episode reward: [(0, '30.797')] [2023-02-22 17:47:01,685][15372] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 9256960. Throughput: 0: 821.0. Samples: 2313802. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:47:01,692][15372] Avg episode reward: [(0, '31.440')] [2023-02-22 17:47:02,688][33578] Updated weights for policy 0, policy_version 4522 (0.0024) [2023-02-22 17:47:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 9277440. Throughput: 0: 861.3. Samples: 2320034. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:47:06,689][15372] Avg episode reward: [(0, '31.608')] [2023-02-22 17:47:07,320][33578] Updated weights for policy 0, policy_version 4532 (0.0018) [2023-02-22 17:47:11,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 9297920. Throughput: 0: 860.4. Samples: 2323448. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:47:11,696][15372] Avg episode reward: [(0, '30.669')] [2023-02-22 17:47:11,748][33578] Updated weights for policy 0, policy_version 4542 (0.0011) [2023-02-22 17:47:16,686][15372] Fps is (10 sec: 3685.8, 60 sec: 3481.5, 300 sec: 3693.3). Total num frames: 9314304. Throughput: 0: 831.9. Samples: 2328980. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:47:16,689][15372] Avg episode reward: [(0, '30.681')] [2023-02-22 17:47:18,558][33578] Updated weights for policy 0, policy_version 4552 (0.0018) [2023-02-22 17:47:21,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 9330688. Throughput: 0: 822.0. Samples: 2333260. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:47:21,687][15372] Avg episode reward: [(0, '29.979')] [2023-02-22 17:47:24,385][33578] Updated weights for policy 0, policy_version 4562 (0.0015) [2023-02-22 17:47:26,684][15372] Fps is (10 sec: 3687.0, 60 sec: 3413.3, 300 sec: 3679.5). Total num frames: 9351168. Throughput: 0: 844.0. Samples: 2336444. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:47:26,693][15372] Avg episode reward: [(0, '31.210')] [2023-02-22 17:47:28,746][33578] Updated weights for policy 0, policy_version 4572 (0.0024) [2023-02-22 17:47:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3693.4). Total num frames: 9375744. Throughput: 0: 884.4. Samples: 2343550. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:47:31,687][15372] Avg episode reward: [(0, '30.602')] [2023-02-22 17:47:33,641][33578] Updated weights for policy 0, policy_version 4582 (0.0011) [2023-02-22 17:47:36,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 9392128. Throughput: 0: 904.7. Samples: 2348636. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-22 17:47:36,687][15372] Avg episode reward: [(0, '31.882')] [2023-02-22 17:47:40,554][33578] Updated weights for policy 0, policy_version 4592 (0.0018) [2023-02-22 17:47:41,685][15372] Fps is (10 sec: 2866.9, 60 sec: 3481.5, 300 sec: 3679.4). Total num frames: 9404416. Throughput: 0: 917.4. Samples: 2350868. 
Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0) [2023-02-22 17:47:41,692][15372] Avg episode reward: [(0, '32.266')] [2023-02-22 17:47:41,711][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_9404416.pth... [2023-02-22 17:47:41,848][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004166_8531968.pth [2023-02-22 17:47:45,392][33578] Updated weights for policy 0, policy_version 4602 (0.0019) [2023-02-22 17:47:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 9428992. Throughput: 0: 956.1. Samples: 2356828. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2023-02-22 17:47:46,693][15372] Avg episode reward: [(0, '31.659')] [2023-02-22 17:47:49,787][33578] Updated weights for policy 0, policy_version 4612 (0.0011) [2023-02-22 17:47:51,684][15372] Fps is (10 sec: 4915.7, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 9453568. Throughput: 0: 977.6. Samples: 2364028. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:47:51,687][15372] Avg episode reward: [(0, '32.839')] [2023-02-22 17:47:55,338][33578] Updated weights for policy 0, policy_version 4622 (0.0011) [2023-02-22 17:47:56,685][15372] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 9465856. Throughput: 0: 953.8. Samples: 2366368. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:47:56,694][15372] Avg episode reward: [(0, '32.518')] [2023-02-22 17:48:01,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 9482240. Throughput: 0: 929.1. Samples: 2370786. 
Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:48:01,690][15372] Avg episode reward: [(0, '31.666')] [2023-02-22 17:48:02,095][33578] Updated weights for policy 0, policy_version 4632 (0.0026) [2023-02-22 17:48:06,521][33578] Updated weights for policy 0, policy_version 4642 (0.0021) [2023-02-22 17:48:06,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 9506816. Throughput: 0: 981.3. Samples: 2377420. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:48:06,691][15372] Avg episode reward: [(0, '32.773')] [2023-02-22 17:48:10,686][33578] Updated weights for policy 0, policy_version 4652 (0.0011) [2023-02-22 17:48:11,684][15372] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 9531392. Throughput: 0: 990.6. Samples: 2381020. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:48:11,690][15372] Avg episode reward: [(0, '31.835')] [2023-02-22 17:48:16,687][15372] Fps is (10 sec: 3685.5, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 9543680. Throughput: 0: 952.1. Samples: 2386396. Policy #0 lag: (min: 1.0, avg: 2.4, max: 4.0) [2023-02-22 17:48:16,689][15372] Avg episode reward: [(0, '31.061')] [2023-02-22 17:48:17,198][33578] Updated weights for policy 0, policy_version 4662 (0.0025) [2023-02-22 17:48:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 9560064. Throughput: 0: 938.7. Samples: 2390878. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0) [2023-02-22 17:48:21,687][15372] Avg episode reward: [(0, '32.738')] [2023-02-22 17:48:23,043][33578] Updated weights for policy 0, policy_version 4672 (0.0040) [2023-02-22 17:48:26,684][15372] Fps is (10 sec: 4097.0, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 9584640. Throughput: 0: 970.2. Samples: 2394524. 
Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0) [2023-02-22 17:48:26,688][15372] Avg episode reward: [(0, '33.174')] [2023-02-22 17:48:27,202][33578] Updated weights for policy 0, policy_version 4682 (0.0016) [2023-02-22 17:48:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 9605120. Throughput: 0: 994.4. Samples: 2401578. Policy #0 lag: (min: 1.0, avg: 2.3, max: 4.0) [2023-02-22 17:48:31,687][15372] Avg episode reward: [(0, '33.535')] [2023-02-22 17:48:31,871][33578] Updated weights for policy 0, policy_version 4692 (0.0014) [2023-02-22 17:48:36,688][15372] Fps is (10 sec: 3685.1, 60 sec: 3822.7, 300 sec: 3693.3). Total num frames: 9621504. Throughput: 0: 940.7. Samples: 2406362. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0) [2023-02-22 17:48:36,694][15372] Avg episode reward: [(0, '34.185')] [2023-02-22 17:48:36,696][33564] Saving new best policy, reward=34.185! [2023-02-22 17:48:38,755][33578] Updated weights for policy 0, policy_version 4702 (0.0017) [2023-02-22 17:48:41,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.3, 300 sec: 3679.5). Total num frames: 9637888. Throughput: 0: 935.5. Samples: 2408466. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0) [2023-02-22 17:48:41,693][15372] Avg episode reward: [(0, '32.450')] [2023-02-22 17:48:43,960][33578] Updated weights for policy 0, policy_version 4712 (0.0018) [2023-02-22 17:48:46,684][15372] Fps is (10 sec: 4097.4, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 9662464. Throughput: 0: 979.6. Samples: 2414870. Policy #0 lag: (min: 1.0, avg: 2.4, max: 3.0) [2023-02-22 17:48:46,687][15372] Avg episode reward: [(0, '31.960')] [2023-02-22 17:48:48,439][33578] Updated weights for policy 0, policy_version 4722 (0.0016) [2023-02-22 17:48:51,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 9682944. Throughput: 0: 984.0. Samples: 2421702. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:48:51,691][15372] Avg episode reward: [(0, '32.081')] [2023-02-22 17:48:53,918][33578] Updated weights for policy 0, policy_version 4732 (0.0011) [2023-02-22 17:48:56,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 9699328. Throughput: 0: 953.0. Samples: 2423904. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:48:56,691][15372] Avg episode reward: [(0, '32.141')] [2023-02-22 17:49:00,625][33578] Updated weights for policy 0, policy_version 4742 (0.0043) [2023-02-22 17:49:01,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 9715712. Throughput: 0: 932.5. Samples: 2428356. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:49:01,687][15372] Avg episode reward: [(0, '32.305')] [2023-02-22 17:49:05,136][33578] Updated weights for policy 0, policy_version 4752 (0.0017) [2023-02-22 17:49:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 9736192. Throughput: 0: 988.8. Samples: 2435374. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0) [2023-02-22 17:49:06,687][15372] Avg episode reward: [(0, '33.144')] [2023-02-22 17:49:09,325][33578] Updated weights for policy 0, policy_version 4762 (0.0013) [2023-02-22 17:49:11,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 9760768. Throughput: 0: 987.7. Samples: 2438972. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:49:11,687][15372] Avg episode reward: [(0, '31.359')] [2023-02-22 17:49:15,192][33578] Updated weights for policy 0, policy_version 4772 (0.0015) [2023-02-22 17:49:16,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3721.1). Total num frames: 9777152. Throughput: 0: 943.2. Samples: 2444022. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-22 17:49:16,687][15372] Avg episode reward: [(0, '31.206')] [2023-02-22 17:49:21,482][33578] Updated weights for policy 0, policy_version 4782 (0.0017) [2023-02-22 17:49:21,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 9793536. Throughput: 0: 945.4. Samples: 2448902. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0) [2023-02-22 17:49:21,689][15372] Avg episode reward: [(0, '31.567')] [2023-02-22 17:49:26,039][33578] Updated weights for policy 0, policy_version 4792 (0.0028) [2023-02-22 17:49:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 9814016. Throughput: 0: 973.5. Samples: 2452272. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0) [2023-02-22 17:49:26,689][15372] Avg episode reward: [(0, '31.378')] [2023-02-22 17:49:30,441][33578] Updated weights for policy 0, policy_version 4802 (0.0012) [2023-02-22 17:49:31,684][15372] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 9838592. Throughput: 0: 987.8. Samples: 2459322. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-22 17:49:31,687][15372] Avg episode reward: [(0, '30.673')] [2023-02-22 17:49:36,690][15372] Fps is (10 sec: 3684.4, 60 sec: 3822.8, 300 sec: 3721.0). Total num frames: 9850880. Throughput: 0: 939.0. Samples: 2463960. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0) [2023-02-22 17:49:36,692][15372] Avg episode reward: [(0, '31.020')] [2023-02-22 17:49:36,966][33578] Updated weights for policy 0, policy_version 4812 (0.0024) [2023-02-22 17:49:41,685][15372] Fps is (10 sec: 2867.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 9867264. Throughput: 0: 938.6. Samples: 2466140. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0) [2023-02-22 17:49:41,688][15372] Avg episode reward: [(0, '30.845')] [2023-02-22 17:49:41,703][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004818_9867264.pth... 
[2023-02-22 17:49:41,857][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004382_8974336.pth [2023-02-22 17:49:42,899][33578] Updated weights for policy 0, policy_version 4822 (0.0017) [2023-02-22 17:49:46,684][15372] Fps is (10 sec: 4098.2, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 9891840. Throughput: 0: 974.1. Samples: 2472192. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:49:46,687][15372] Avg episode reward: [(0, '31.731')] [2023-02-22 17:49:47,492][33578] Updated weights for policy 0, policy_version 4832 (0.0019) [2023-02-22 17:49:51,684][15372] Fps is (10 sec: 4506.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 9912320. Throughput: 0: 963.2. Samples: 2478720. Policy #0 lag: (min: 1.0, avg: 1.8, max: 3.0) [2023-02-22 17:49:51,687][15372] Avg episode reward: [(0, '31.692')] [2023-02-22 17:49:52,716][33578] Updated weights for policy 0, policy_version 4842 (0.0023) [2023-02-22 17:49:56,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 9924608. Throughput: 0: 929.7. Samples: 2480808. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0) [2023-02-22 17:49:56,687][15372] Avg episode reward: [(0, '31.891')] [2023-02-22 17:50:00,137][33578] Updated weights for policy 0, policy_version 4852 (0.0025) [2023-02-22 17:50:01,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 9940992. Throughput: 0: 912.8. Samples: 2485098. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0) [2023-02-22 17:50:01,689][15372] Avg episode reward: [(0, '32.351')] [2023-02-22 17:50:04,736][33578] Updated weights for policy 0, policy_version 4862 (0.0020) [2023-02-22 17:50:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 9965568. Throughput: 0: 956.0. Samples: 2491924. 
Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:50:06,691][15372] Avg episode reward: [(0, '32.574')]
[2023-02-22 17:50:09,176][33578] Updated weights for policy 0, policy_version 4872 (0.0021)
[2023-02-22 17:50:11,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 9986048. Throughput: 0: 953.8. Samples: 2495194. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:50:11,689][15372] Avg episode reward: [(0, '32.198')]
[2023-02-22 17:50:16,063][33578] Updated weights for policy 0, policy_version 4882 (0.0017)
[2023-02-22 17:50:16,687][15372] Fps is (10 sec: 3276.0, 60 sec: 3686.2, 300 sec: 3721.1). Total num frames: 9998336. Throughput: 0: 899.6. Samples: 2499806. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-22 17:50:16,690][15372] Avg episode reward: [(0, '30.484')]
[2023-02-22 17:50:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10014720. Throughput: 0: 905.7. Samples: 2504710. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:50:21,687][15372] Avg episode reward: [(0, '30.299')]
[2023-02-22 17:50:22,007][33578] Updated weights for policy 0, policy_version 4892 (0.0028)
[2023-02-22 17:50:26,469][33578] Updated weights for policy 0, policy_version 4902 (0.0014)
[2023-02-22 17:50:26,684][15372] Fps is (10 sec: 4097.1, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 10039296. Throughput: 0: 935.0. Samples: 2508212. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:50:26,692][15372] Avg episode reward: [(0, '29.512')]
[2023-02-22 17:50:31,217][33578] Updated weights for policy 0, policy_version 4912 (0.0013)
[2023-02-22 17:50:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 10059776. Throughput: 0: 953.6. Samples: 2515104. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:50:31,687][15372] Avg episode reward: [(0, '30.068')]
[2023-02-22 17:50:36,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.7, 300 sec: 3721.1). Total num frames: 10072064. Throughput: 0: 901.9. Samples: 2519306. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:50:36,687][15372] Avg episode reward: [(0, '29.480')]
[2023-02-22 17:50:38,358][33578] Updated weights for policy 0, policy_version 4922 (0.0031)
[2023-02-22 17:50:41,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3693.3). Total num frames: 10088448. Throughput: 0: 905.0. Samples: 2521532. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:50:41,693][15372] Avg episode reward: [(0, '29.732')]
[2023-02-22 17:50:43,754][33578] Updated weights for policy 0, policy_version 4932 (0.0014)
[2023-02-22 17:50:46,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10113024. Throughput: 0: 949.1. Samples: 2527806. Policy #0 lag: (min: 1.0, avg: 2.4, max: 5.0)
[2023-02-22 17:50:46,687][15372] Avg episode reward: [(0, '29.900')]
[2023-02-22 17:50:48,259][33578] Updated weights for policy 0, policy_version 4942 (0.0013)
[2023-02-22 17:50:51,687][15372] Fps is (10 sec: 4504.5, 60 sec: 3686.2, 300 sec: 3735.0). Total num frames: 10133504. Throughput: 0: 937.2. Samples: 2534102. Policy #0 lag: (min: 1.0, avg: 2.3, max: 3.0)
[2023-02-22 17:50:51,690][15372] Avg episode reward: [(0, '29.736')]
[2023-02-22 17:50:54,075][33578] Updated weights for policy 0, policy_version 4952 (0.0036)
[2023-02-22 17:50:56,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 10145792. Throughput: 0: 912.2. Samples: 2536244. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:50:56,700][15372] Avg episode reward: [(0, '30.598')]
[2023-02-22 17:51:00,836][33578] Updated weights for policy 0, policy_version 4962 (0.0041)
[2023-02-22 17:51:01,684][15372] Fps is (10 sec: 2867.9, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10162176. Throughput: 0: 907.3. Samples: 2540630. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:51:01,686][15372] Avg episode reward: [(0, '31.212')]
[2023-02-22 17:51:05,429][33578] Updated weights for policy 0, policy_version 4972 (0.0011)
[2023-02-22 17:51:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10186752. Throughput: 0: 948.2. Samples: 2547380. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:51:06,687][15372] Avg episode reward: [(0, '30.485')]
[2023-02-22 17:51:10,093][33578] Updated weights for policy 0, policy_version 4982 (0.0011)
[2023-02-22 17:51:11,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 10207232. Throughput: 0: 945.0. Samples: 2550738. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:51:11,692][15372] Avg episode reward: [(0, '29.236')]
[2023-02-22 17:51:16,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.6, 300 sec: 3721.1). Total num frames: 10219520. Throughput: 0: 893.6. Samples: 2555318. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:51:16,694][15372] Avg episode reward: [(0, '29.344')]
[2023-02-22 17:51:17,059][33578] Updated weights for policy 0, policy_version 4992 (0.0020)
[2023-02-22 17:51:21,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 10235904. Throughput: 0: 909.9. Samples: 2560250. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:51:21,690][15372] Avg episode reward: [(0, '29.286')]
[2023-02-22 17:51:22,677][33578] Updated weights for policy 0, policy_version 5002 (0.0012)
[2023-02-22 17:51:26,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10260480. Throughput: 0: 936.8. Samples: 2563688. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:51:26,687][15372] Avg episode reward: [(0, '31.338')]
[2023-02-22 17:51:27,173][33578] Updated weights for policy 0, policy_version 5012 (0.0011)
[2023-02-22 17:51:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 10280960. Throughput: 0: 943.7. Samples: 2570270. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:51:31,687][15372] Avg episode reward: [(0, '31.457')]
[2023-02-22 17:51:32,586][33578] Updated weights for policy 0, policy_version 5022 (0.0016)
[2023-02-22 17:51:36,693][15372] Fps is (10 sec: 3274.1, 60 sec: 3685.9, 300 sec: 3721.0). Total num frames: 10293248. Throughput: 0: 897.5. Samples: 2574496. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:51:36,701][15372] Avg episode reward: [(0, '32.488')]
[2023-02-22 17:51:40,027][33578] Updated weights for policy 0, policy_version 5032 (0.0020)
[2023-02-22 17:51:41,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 10309632. Throughput: 0: 897.7. Samples: 2576640. Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0)
[2023-02-22 17:51:41,691][15372] Avg episode reward: [(0, '32.524')]
[2023-02-22 17:51:41,761][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005036_10313728.pth...
[2023-02-22 17:51:41,871][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_9404416.pth
[2023-02-22 17:51:44,716][33578] Updated weights for policy 0, policy_version 5042 (0.0015)
[2023-02-22 17:51:46,684][15372] Fps is (10 sec: 4099.5, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 10334208. Throughput: 0: 942.1. Samples: 2583026. Policy #0 lag: (min: 1.0, avg: 2.1, max: 3.0)
[2023-02-22 17:51:46,692][15372] Avg episode reward: [(0, '34.075')]
[2023-02-22 17:51:49,322][33578] Updated weights for policy 0, policy_version 5052 (0.0024)
[2023-02-22 17:51:51,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3762.8). Total num frames: 10350592. Throughput: 0: 927.1. Samples: 2589098. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:51:51,695][15372] Avg episode reward: [(0, '33.436')]
[2023-02-22 17:51:56,487][33578] Updated weights for policy 0, policy_version 5062 (0.0029)
[2023-02-22 17:51:56,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 10366976. Throughput: 0: 896.9. Samples: 2591100. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:51:56,694][15372] Avg episode reward: [(0, '32.821')]
[2023-02-22 17:52:01,685][15372] Fps is (10 sec: 2866.9, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 10379264. Throughput: 0: 882.8. Samples: 2595044. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:52:01,691][15372] Avg episode reward: [(0, '32.410')]
[2023-02-22 17:52:03,951][33578] Updated weights for policy 0, policy_version 5072 (0.0016)
[2023-02-22 17:52:06,684][15372] Fps is (10 sec: 2457.7, 60 sec: 3413.3, 300 sec: 3707.2). Total num frames: 10391552. Throughput: 0: 863.6. Samples: 2599110. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:52:06,695][15372] Avg episode reward: [(0, '32.156')]
[2023-02-22 17:52:11,239][33578] Updated weights for policy 0, policy_version 5082 (0.0026)
[2023-02-22 17:52:11,688][15372] Fps is (10 sec: 2866.3, 60 sec: 3344.8, 300 sec: 3707.2). Total num frames: 10407936. Throughput: 0: 834.9. Samples: 2601262. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:52:11,693][15372] Avg episode reward: [(0, '31.786')]
[2023-02-22 17:52:16,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3693.3). Total num frames: 10420224. Throughput: 0: 784.7. Samples: 2605580. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:52:16,690][15372] Avg episode reward: [(0, '31.152')]
[2023-02-22 17:52:18,474][33578] Updated weights for policy 0, policy_version 5092 (0.0020)
[2023-02-22 17:52:21,684][15372] Fps is (10 sec: 3278.1, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 10440704. Throughput: 0: 804.8. Samples: 2610704. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:52:21,687][15372] Avg episode reward: [(0, '29.614')]
[2023-02-22 17:52:23,491][33578] Updated weights for policy 0, policy_version 5102 (0.0018)
[2023-02-22 17:52:26,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3679.5). Total num frames: 10461184. Throughput: 0: 828.1. Samples: 2613904. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0)
[2023-02-22 17:52:26,690][15372] Avg episode reward: [(0, '30.062')]
[2023-02-22 17:52:27,889][33578] Updated weights for policy 0, policy_version 5112 (0.0015)
[2023-02-22 17:52:31,685][15372] Fps is (10 sec: 4095.8, 60 sec: 3345.0, 300 sec: 3693.3). Total num frames: 10481664. Throughput: 0: 831.4. Samples: 2620438. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:52:31,692][15372] Avg episode reward: [(0, '30.475')]
[2023-02-22 17:52:33,929][33578] Updated weights for policy 0, policy_version 5122 (0.0018)
[2023-02-22 17:52:36,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3345.5, 300 sec: 3693.4). Total num frames: 10493952. Throughput: 0: 791.2. Samples: 2624700. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:52:36,687][15372] Avg episode reward: [(0, '30.277')]
[2023-02-22 17:52:40,468][33578] Updated weights for policy 0, policy_version 5132 (0.0022)
[2023-02-22 17:52:41,684][15372] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3679.5). Total num frames: 10514432. Throughput: 0: 796.8. Samples: 2626956. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
[2023-02-22 17:52:41,687][15372] Avg episode reward: [(0, '31.291')]
[2023-02-22 17:52:45,155][33578] Updated weights for policy 0, policy_version 5142 (0.0022)
[2023-02-22 17:52:46,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3665.6). Total num frames: 10534912. Throughput: 0: 858.5. Samples: 2633676. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:52:46,694][15372] Avg episode reward: [(0, '30.153')]
[2023-02-22 17:52:49,890][33578] Updated weights for policy 0, policy_version 5152 (0.0017)
[2023-02-22 17:52:51,686][15372] Fps is (10 sec: 4095.3, 60 sec: 3413.2, 300 sec: 3693.3). Total num frames: 10555392. Throughput: 0: 899.1. Samples: 2639572. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:52:51,692][15372] Avg episode reward: [(0, '31.057')]
[2023-02-22 17:52:56,686][15372] Fps is (10 sec: 3276.4, 60 sec: 3345.0, 300 sec: 3679.4). Total num frames: 10567680. Throughput: 0: 899.2. Samples: 2641722. Policy #0 lag: (min: 0.0, avg: 1.3, max: 4.0)
[2023-02-22 17:52:56,689][15372] Avg episode reward: [(0, '29.882')]
[2023-02-22 17:52:57,070][33578] Updated weights for policy 0, policy_version 5162 (0.0027)
[2023-02-22 17:53:01,684][15372] Fps is (10 sec: 3277.4, 60 sec: 3481.7, 300 sec: 3665.6). Total num frames: 10588160. Throughput: 0: 913.5. Samples: 2646686. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:53:01,695][15372] Avg episode reward: [(0, '30.171')]
[2023-02-22 17:53:02,144][33578] Updated weights for policy 0, policy_version 5172 (0.0018)
[2023-02-22 17:53:06,590][33578] Updated weights for policy 0, policy_version 5182 (0.0011)
[2023-02-22 17:53:06,684][15372] Fps is (10 sec: 4506.3, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 10612736. Throughput: 0: 952.4. Samples: 2653562. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:53:06,687][15372] Avg episode reward: [(0, '30.553')]
[2023-02-22 17:53:11,691][15372] Fps is (10 sec: 4093.3, 60 sec: 3686.2, 300 sec: 3679.4). Total num frames: 10629120. Throughput: 0: 955.0. Samples: 2656886. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:53:11,697][15372] Avg episode reward: [(0, '31.864')]
[2023-02-22 17:53:12,420][33578] Updated weights for policy 0, policy_version 5192 (0.0011)
[2023-02-22 17:53:16,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 10645504. Throughput: 0: 903.8. Samples: 2661110. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:53:16,688][15372] Avg episode reward: [(0, '31.236')]
[2023-02-22 17:53:19,343][33578] Updated weights for policy 0, policy_version 5202 (0.0032)
[2023-02-22 17:53:21,684][15372] Fps is (10 sec: 3278.9, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 10661888. Throughput: 0: 927.6. Samples: 2666442. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:53:21,692][15372] Avg episode reward: [(0, '31.651')]
[2023-02-22 17:53:23,885][33578] Updated weights for policy 0, policy_version 5212 (0.0017)
[2023-02-22 17:53:26,684][15372] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 10686464. Throughput: 0: 953.8. Samples: 2669876. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:53:26,694][15372] Avg episode reward: [(0, '30.206')]
[2023-02-22 17:53:28,357][33578] Updated weights for policy 0, policy_version 5222 (0.0019)
[2023-02-22 17:53:31,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 10702848. Throughput: 0: 942.3. Samples: 2676078. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:53:31,689][15372] Avg episode reward: [(0, '32.122')]
[2023-02-22 17:53:35,259][33578] Updated weights for policy 0, policy_version 5232 (0.0011)
[2023-02-22 17:53:36,685][15372] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 10715136. Throughput: 0: 902.8. Samples: 2680196. Policy #0 lag: (min: 1.0, avg: 2.2, max: 3.0)
[2023-02-22 17:53:36,694][15372] Avg episode reward: [(0, '33.114')]
[2023-02-22 17:53:41,144][33578] Updated weights for policy 0, policy_version 5242 (0.0022)
[2023-02-22 17:53:41,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 10735616. Throughput: 0: 909.4. Samples: 2682642. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:53:41,692][15372] Avg episode reward: [(0, '33.172')]
[2023-02-22 17:53:41,703][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005242_10735616.pth...
[2023-02-22 17:53:41,823][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004818_9867264.pth
[2023-02-22 17:53:45,748][33578] Updated weights for policy 0, policy_version 5252 (0.0014)
[2023-02-22 17:53:46,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 10760192. Throughput: 0: 949.9. Samples: 2689430. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:53:46,687][15372] Avg episode reward: [(0, '31.364')]
[2023-02-22 17:53:51,085][33578] Updated weights for policy 0, policy_version 5262 (0.0024)
[2023-02-22 17:53:51,689][15372] Fps is (10 sec: 4094.2, 60 sec: 3686.2, 300 sec: 3651.6). Total num frames: 10776576. Throughput: 0: 922.7. Samples: 2695086. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:53:51,692][15372] Avg episode reward: [(0, '31.409')]
[2023-02-22 17:53:56,687][15372] Fps is (10 sec: 2866.5, 60 sec: 3686.3, 300 sec: 3637.8). Total num frames: 10788864. Throughput: 0: 896.8. Samples: 2697240. Policy #0 lag: (min: 1.0, avg: 2.2, max: 4.0)
[2023-02-22 17:53:56,692][15372] Avg episode reward: [(0, '32.133')]
[2023-02-22 17:53:58,283][33578] Updated weights for policy 0, policy_version 5272 (0.0021)
[2023-02-22 17:54:01,684][15372] Fps is (10 sec: 3278.3, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 10809344. Throughput: 0: 917.7. Samples: 2702406. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:54:01,690][15372] Avg episode reward: [(0, '33.047')]
[2023-02-22 17:54:02,864][33578] Updated weights for policy 0, policy_version 5282 (0.0012)
[2023-02-22 17:54:06,684][15372] Fps is (10 sec: 4506.7, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 10833920. Throughput: 0: 950.8. Samples: 2709226. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:54:06,692][15372] Avg episode reward: [(0, '33.201')]
[2023-02-22 17:54:07,317][33578] Updated weights for policy 0, policy_version 5292 (0.0011)
[2023-02-22 17:54:11,685][15372] Fps is (10 sec: 4095.9, 60 sec: 3686.8, 300 sec: 3637.8). Total num frames: 10850304. Throughput: 0: 939.1. Samples: 2712134. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:54:11,689][15372] Avg episode reward: [(0, '32.450')]
[2023-02-22 17:54:13,736][33578] Updated weights for policy 0, policy_version 5302 (0.0020)
[2023-02-22 17:54:16,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 10866688. Throughput: 0: 895.8. Samples: 2716388. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:54:16,692][15372] Avg episode reward: [(0, '31.656')]
[2023-02-22 17:54:19,743][33578] Updated weights for policy 0, policy_version 5312 (0.0017)
[2023-02-22 17:54:21,684][15372] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 10887168. Throughput: 0: 934.7. Samples: 2722256. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:54:21,687][15372] Avg episode reward: [(0, '31.967')]
[2023-02-22 17:54:24,273][33578] Updated weights for policy 0, policy_version 5322 (0.0012)
[2023-02-22 17:54:26,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 10907648. Throughput: 0: 954.9. Samples: 2725614. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:54:26,687][15372] Avg episode reward: [(0, '31.073')]
[2023-02-22 17:54:29,030][33578] Updated weights for policy 0, policy_version 5332 (0.0019)
[2023-02-22 17:54:31,686][15372] Fps is (10 sec: 3685.9, 60 sec: 3686.3, 300 sec: 3637.9). Total num frames: 10924032. Throughput: 0: 934.4. Samples: 2731478. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:54:31,696][15372] Avg episode reward: [(0, '30.665')]
[2023-02-22 17:54:36,392][33578] Updated weights for policy 0, policy_version 5342 (0.0018)
[2023-02-22 17:54:36,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 10940416. Throughput: 0: 902.8. Samples: 2735710. Policy #0 lag: (min: 1.0, avg: 1.9, max: 4.0)
[2023-02-22 17:54:36,687][15372] Avg episode reward: [(0, '28.470')]
[2023-02-22 17:54:41,528][33578] Updated weights for policy 0, policy_version 5352 (0.0016)
[2023-02-22 17:54:41,684][15372] Fps is (10 sec: 3686.9, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10960896. Throughput: 0: 917.4. Samples: 2738520. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:54:41,687][15372] Avg episode reward: [(0, '28.343')]
[2023-02-22 17:54:46,005][33578] Updated weights for policy 0, policy_version 5362 (0.0014)
[2023-02-22 17:54:46,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 10981376. Throughput: 0: 954.0. Samples: 2745336. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:54:46,687][15372] Avg episode reward: [(0, '28.565')]
[2023-02-22 17:54:51,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.7, 300 sec: 3637.8). Total num frames: 10997760. Throughput: 0: 918.5. Samples: 2750560. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:54:51,687][15372] Avg episode reward: [(0, '29.458')]
[2023-02-22 17:54:52,077][33578] Updated weights for policy 0, policy_version 5372 (0.0011)
[2023-02-22 17:54:56,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3637.8). Total num frames: 11014144. Throughput: 0: 901.3. Samples: 2752694. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:54:56,692][15372] Avg episode reward: [(0, '30.204')]
[2023-02-22 17:54:58,758][33578] Updated weights for policy 0, policy_version 5382 (0.0039)
[2023-02-22 17:55:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 11034624. Throughput: 0: 926.7. Samples: 2758088. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
[2023-02-22 17:55:01,693][15372] Avg episode reward: [(0, '31.563')]
[2023-02-22 17:55:03,278][33578] Updated weights for policy 0, policy_version 5392 (0.0018)
[2023-02-22 17:55:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11055104. Throughput: 0: 950.3. Samples: 2765018. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:55:06,691][15372] Avg episode reward: [(0, '31.061')]
[2023-02-22 17:55:07,940][33578] Updated weights for policy 0, policy_version 5402 (0.0025)
[2023-02-22 17:55:11,688][15372] Fps is (10 sec: 3685.1, 60 sec: 3686.2, 300 sec: 3637.8). Total num frames: 11071488. Throughput: 0: 933.0. Samples: 2767602. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
[2023-02-22 17:55:11,694][15372] Avg episode reward: [(0, '32.552')]
[2023-02-22 17:55:15,043][33578] Updated weights for policy 0, policy_version 5412 (0.0023)
[2023-02-22 17:55:16,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11087872. Throughput: 0: 898.3. Samples: 2771900. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:55:16,690][15372] Avg episode reward: [(0, '32.212')]
[2023-02-22 17:55:20,238][33578] Updated weights for policy 0, policy_version 5422 (0.0019)
[2023-02-22 17:55:21,684][15372] Fps is (10 sec: 3687.7, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11108352. Throughput: 0: 942.0. Samples: 2778100. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:55:21,687][15372] Avg episode reward: [(0, '31.459')]
[2023-02-22 17:55:24,850][33578] Updated weights for policy 0, policy_version 5432 (0.0026)
[2023-02-22 17:55:26,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 11132928. Throughput: 0: 954.6. Samples: 2781478. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:55:26,690][15372] Avg episode reward: [(0, '30.512')]
[2023-02-22 17:55:30,472][33578] Updated weights for policy 0, policy_version 5442 (0.0019)
[2023-02-22 17:55:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 11145216. Throughput: 0: 922.5. Samples: 2786848. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:55:31,687][15372] Avg episode reward: [(0, '29.749')]
[2023-02-22 17:55:36,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11161600. Throughput: 0: 900.5. Samples: 2791082. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:55:36,693][15372] Avg episode reward: [(0, '30.475')]
[2023-02-22 17:55:37,475][33578] Updated weights for policy 0, policy_version 5452 (0.0022)
[2023-02-22 17:55:41,685][15372] Fps is (10 sec: 3686.2, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11182080. Throughput: 0: 923.6. Samples: 2794258. Policy #0 lag: (min: 1.0, avg: 2.0, max: 3.0)
[2023-02-22 17:55:41,688][15372] Avg episode reward: [(0, '31.103')]
[2023-02-22 17:55:41,700][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005460_11182080.pth...
[2023-02-22 17:55:41,821][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005036_10313728.pth
[2023-02-22 17:55:42,058][33578] Updated weights for policy 0, policy_version 5462 (0.0017)
[2023-02-22 17:55:46,516][33578] Updated weights for policy 0, policy_version 5472 (0.0013)
[2023-02-22 17:55:46,684][15372] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 11206656. Throughput: 0: 954.6. Samples: 2801044. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:55:46,687][15372] Avg episode reward: [(0, '30.866')]
[2023-02-22 17:55:51,684][15372] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11218944. Throughput: 0: 909.2. Samples: 2805932. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:55:51,688][15372] Avg episode reward: [(0, '32.410')]
[2023-02-22 17:55:53,717][33578] Updated weights for policy 0, policy_version 5482 (0.0025)
[2023-02-22 17:55:56,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11235328. Throughput: 0: 897.3. Samples: 2807978. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:55:56,687][15372] Avg episode reward: [(0, '32.861')]
[2023-02-22 17:55:59,287][33578] Updated weights for policy 0, policy_version 5492 (0.0016)
[2023-02-22 17:56:01,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11255808. Throughput: 0: 932.5. Samples: 2813862. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:01,690][15372] Avg episode reward: [(0, '31.240')]
[2023-02-22 17:56:03,695][33578] Updated weights for policy 0, policy_version 5502 (0.0020)
[2023-02-22 17:56:06,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 11280384. Throughput: 0: 948.0. Samples: 2820762. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:06,687][15372] Avg episode reward: [(0, '32.408')]
[2023-02-22 17:56:09,284][33578] Updated weights for policy 0, policy_version 5512 (0.0013)
[2023-02-22 17:56:11,694][15372] Fps is (10 sec: 3683.0, 60 sec: 3686.0, 300 sec: 3637.7). Total num frames: 11292672. Throughput: 0: 920.8. Samples: 2822922. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:56:11,700][15372] Avg episode reward: [(0, '30.993')]
[2023-02-22 17:56:16,473][33578] Updated weights for policy 0, policy_version 5522 (0.0020)
[2023-02-22 17:56:16,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11309056. Throughput: 0: 896.2. Samples: 2827178. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:56:16,689][15372] Avg episode reward: [(0, '30.651')]
[2023-02-22 17:56:20,971][33578] Updated weights for policy 0, policy_version 5532 (0.0018)
[2023-02-22 17:56:21,684][15372] Fps is (10 sec: 3689.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11329536. Throughput: 0: 944.1. Samples: 2833568. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-22 17:56:21,694][15372] Avg episode reward: [(0, '28.997')]
[2023-02-22 17:56:25,443][33578] Updated weights for policy 0, policy_version 5542 (0.0019)
[2023-02-22 17:56:26,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11354112. Throughput: 0: 949.3. Samples: 2836976. Policy #0 lag: (min: 0.0, avg: 1.0, max: 4.0)
[2023-02-22 17:56:26,689][15372] Avg episode reward: [(0, '27.926')]
[2023-02-22 17:56:31,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.9). Total num frames: 11366400. Throughput: 0: 913.9. Samples: 2842168. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:31,692][15372] Avg episode reward: [(0, '29.149')]
[2023-02-22 17:56:31,936][33578] Updated weights for policy 0, policy_version 5552 (0.0060)
[2023-02-22 17:56:36,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11382784. Throughput: 0: 897.9. Samples: 2846338. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:56:36,687][15372] Avg episode reward: [(0, '30.265')]
[2023-02-22 17:56:38,212][33578] Updated weights for policy 0, policy_version 5562 (0.0011)
[2023-02-22 17:56:41,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 11403264. Throughput: 0: 927.3. Samples: 2849706. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:41,695][15372] Avg episode reward: [(0, '29.764')]
[2023-02-22 17:56:42,814][33578] Updated weights for policy 0, policy_version 5572 (0.0023)
[2023-02-22 17:56:46,689][15372] Fps is (10 sec: 4503.5, 60 sec: 3686.1, 300 sec: 3651.6). Total num frames: 11427840. Throughput: 0: 948.4. Samples: 2856546. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:46,692][15372] Avg episode reward: [(0, '30.353')]
[2023-02-22 17:56:47,741][33578] Updated weights for policy 0, policy_version 5582 (0.0014)
[2023-02-22 17:56:51,685][15372] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11440128. Throughput: 0: 895.8. Samples: 2861074. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:56:51,691][15372] Avg episode reward: [(0, '30.387')]
[2023-02-22 17:56:55,101][33578] Updated weights for policy 0, policy_version 5592 (0.0022)
[2023-02-22 17:56:56,684][15372] Fps is (10 sec: 2868.5, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 11456512. Throughput: 0: 895.3. Samples: 2863202. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:56:56,686][15372] Avg episode reward: [(0, '31.995')]
[2023-02-22 17:56:59,999][33578] Updated weights for policy 0, policy_version 5602 (0.0022)
[2023-02-22 17:57:01,684][15372] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 11476992. Throughput: 0: 938.3. Samples: 2869402. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:57:01,693][15372] Avg episode reward: [(0, '33.239')]
[2023-02-22 17:57:04,381][33578] Updated weights for policy 0, policy_version 5612 (0.0031)
[2023-02-22 17:57:06,690][15372] Fps is (10 sec: 4093.9, 60 sec: 3617.8, 300 sec: 3693.3). Total num frames: 11497472. Throughput: 0: 936.4. Samples: 2875710. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:57:06,692][15372] Avg episode reward: [(0, '32.094')]
[2023-02-22 17:57:11,689][15372] Fps is (10 sec: 3275.3, 60 sec: 3618.4, 300 sec: 3693.3). Total num frames: 11509760. Throughput: 0: 898.3. Samples: 2877404. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:57:11,693][15372] Avg episode reward: [(0, '31.408')]
[2023-02-22 17:57:12,492][33578] Updated weights for policy 0, policy_version 5622 (0.0011)
[2023-02-22 17:57:16,684][15372] Fps is (10 sec: 2458.9, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 11522048. Throughput: 0: 857.5. Samples: 2880756. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0)
[2023-02-22 17:57:16,691][15372] Avg episode reward: [(0, '30.709')]
[2023-02-22 17:57:20,873][33578] Updated weights for policy 0, policy_version 5632 (0.0023)
[2023-02-22 17:57:21,684][15372] Fps is (10 sec: 2458.7, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 11534336. Throughput: 0: 857.2. Samples: 2884910. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:57:21,687][15372] Avg episode reward: [(0, '32.432')]
[2023-02-22 17:57:25,289][33578] Updated weights for policy 0, policy_version 5642 (0.0018)
[2023-02-22 17:57:26,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3651.7). Total num frames: 11558912. Throughput: 0: 859.2. Samples: 2888370. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:57:26,687][15372] Avg episode reward: [(0, '32.260')]
[2023-02-22 17:57:29,729][33578] Updated weights for policy 0, policy_version 5652 (0.0011)
[2023-02-22 17:57:31,688][15372] Fps is (10 sec: 4503.8, 60 sec: 3549.6, 300 sec: 3679.4). Total num frames: 11579392. Throughput: 0: 860.9. Samples: 2895286. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:57:31,695][15372] Avg episode reward: [(0, '31.294')]
[2023-02-22 17:57:36,348][33578] Updated weights for policy 0, policy_version 5662 (0.0021)
[2023-02-22 17:57:36,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 11595776. Throughput: 0: 854.8. Samples: 2899540. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:57:36,688][15372] Avg episode reward: [(0, '30.858')]
[2023-02-22 17:57:41,684][15372] Fps is (10 sec: 3278.1, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 11612160. Throughput: 0: 854.0. Samples: 2901634. Policy #0 lag: (min: 1.0, avg: 2.0, max: 4.0)
[2023-02-22 17:57:41,687][15372] Avg episode reward: [(0, '30.608')]
[2023-02-22 17:57:41,699][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005670_11612160.pth...
[2023-02-22 17:57:41,821][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005242_10735616.pth
[2023-02-22 17:57:42,527][33578] Updated weights for policy 0, policy_version 5672 (0.0018)
[2023-02-22 17:57:46,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3413.6, 300 sec: 3651.7). Total num frames: 11632640. Throughput: 0: 856.4. Samples: 2907942. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:57:46,686][15372] Avg episode reward: [(0, '31.675')]
[2023-02-22 17:57:47,092][33578] Updated weights for policy 0, policy_version 5682 (0.0014)
[2023-02-22 17:57:51,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 11653120. Throughput: 0: 862.7. Samples: 2914526. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:57:51,690][15372] Avg episode reward: [(0, '30.204')]
[2023-02-22 17:57:52,026][33578] Updated weights for policy 0, policy_version 5692 (0.0014)
[2023-02-22 17:57:56,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 11669504. Throughput: 0: 872.4. Samples: 2916658. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:57:56,694][15372] Avg episode reward: [(0, '29.221')]
[2023-02-22 17:57:59,279][33578] Updated weights for policy 0, policy_version 5702 (0.0011)
[2023-02-22 17:58:01,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 11685888. Throughput: 0: 893.2. Samples: 2920948. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:58:01,693][15372] Avg episode reward: [(0, '30.043')]
[2023-02-22 17:58:04,154][33578] Updated weights for policy 0, policy_version 5712 (0.0023)
[2023-02-22 17:58:06,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3481.9, 300 sec: 3651.8). Total num frames: 11706368. Throughput: 0: 954.0. Samples: 2927838. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:58:06,687][15372] Avg episode reward: [(0, '30.681')]
[2023-02-22 17:58:08,578][33578] Updated weights for policy 0, policy_version 5722 (0.0017)
[2023-02-22 17:58:11,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3618.4, 300 sec: 3665.6). Total num frames: 11726848. Throughput: 0: 953.6. Samples: 2931284. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:58:11,687][15372] Avg episode reward: [(0, '30.777')]
[2023-02-22 17:58:14,791][33578] Updated weights for policy 0, policy_version 5732 (0.0013)
[2023-02-22 17:58:16,684][15372] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 11743232. Throughput: 0: 902.6. Samples: 2935898. Policy #0 lag: (min: 1.0, avg: 2.0, max: 5.0)
[2023-02-22 17:58:16,692][15372] Avg episode reward: [(0, '30.345')]
[2023-02-22 17:58:21,033][33578] Updated weights for policy 0, policy_version 5742 (0.0018)
[2023-02-22 17:58:21,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 11759616. Throughput: 0: 918.8. Samples: 2940886. Policy #0 lag: (min: 1.0, avg: 1.9, max: 5.0)
[2023-02-22 17:58:21,687][15372] Avg episode reward: [(0, '31.041')]
[2023-02-22 17:58:25,584][33578] Updated weights for policy 0, policy_version 5752 (0.0017)
[2023-02-22 17:58:26,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 11784192. Throughput: 0: 948.0. Samples: 2944296. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:58:26,693][15372] Avg episode reward: [(0, '30.409')]
[2023-02-22 17:58:30,143][33578] Updated weights for policy 0, policy_version 5762 (0.0028)
[2023-02-22 17:58:31,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3693.3). Total num frames: 11804672. Throughput: 0: 958.0. Samples: 2951050. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:58:31,688][15372] Avg episode reward: [(0, '30.353')]
[2023-02-22 17:58:36,686][15372] Fps is (10 sec: 3276.1, 60 sec: 3686.3, 300 sec: 3665.5). Total num frames: 11816960. Throughput: 0: 905.8. Samples: 2955290. Policy #0 lag: (min: 1.0, avg: 2.1, max: 5.0)
[2023-02-22 17:58:36,693][15372] Avg episode reward: [(0, '30.998')]
[2023-02-22 17:58:37,384][33578] Updated weights for policy 0, policy_version 5772 (0.0024)
[2023-02-22 17:58:41,684][15372] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 11833344. Throughput: 0: 905.1. Samples: 2957388. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0)
[2023-02-22 17:58:41,692][15372] Avg episode reward: [(0, '30.351')]
[2023-02-22 17:58:42,755][33578] Updated weights for policy 0, policy_version 5782 (0.0014)
[2023-02-22 17:58:46,684][15372] Fps is (10 sec: 4096.9, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 11857920. Throughput: 0: 957.4. Samples: 2964030. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:58:46,686][15372] Avg episode reward: [(0, '29.331')]
[2023-02-22 17:58:47,259][33578] Updated weights for policy 0, policy_version 5792 (0.0011)
[2023-02-22 17:58:51,684][15372] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3693.4). Total num frames: 11878400. Throughput: 0: 940.1. Samples: 2970142. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:58:51,690][15372] Avg episode reward: [(0, '29.973')]
[2023-02-22 17:58:52,945][33578] Updated weights for policy 0, policy_version 5802 (0.0012)
[2023-02-22 17:58:56,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 11890688. Throughput: 0: 910.2. Samples: 2972242. Policy #0 lag: (min: 1.0, avg: 1.8, max: 5.0)
[2023-02-22 17:58:56,689][15372] Avg episode reward: [(0, '29.586')]
[2023-02-22 17:59:00,041][33578] Updated weights for policy 0, policy_version 5812 (0.0012)
[2023-02-22 17:59:01,688][15372] Fps is (10 sec: 2866.3, 60 sec: 3686.2, 300 sec: 3637.8). Total num frames: 11907072. Throughput: 0: 911.2. Samples: 2976904. Policy #0 lag: (min: 0.0, avg: 1.1, max: 4.0)
[2023-02-22 17:59:01,693][15372] Avg episode reward: [(0, '30.169')]
[2023-02-22 17:59:04,513][33578] Updated weights for policy 0, policy_version 5822 (0.0013)
[2023-02-22 17:59:06,684][15372] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 11931648. Throughput: 0: 954.1. Samples: 2983820. Policy #0 lag: (min: 1.0, avg: 2.2, max: 5.0)
[2023-02-22 17:59:06,689][15372] Avg episode reward: [(0, '31.170')]
[2023-02-22 17:59:08,841][33578] Updated weights for policy 0, policy_version 5832 (0.0011)
[2023-02-22 17:59:11,684][15372] Fps is (10 sec: 4507.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 11952128. Throughput: 0: 953.6. Samples: 2987210. Policy #0 lag: (min: 1.0, avg: 1.9, max: 3.0)
[2023-02-22 17:59:11,693][15372] Avg episode reward: [(0, '30.783')]
[2023-02-22 17:59:15,659][33578] Updated weights for policy 0, policy_version 5842 (0.0022)
[2023-02-22 17:59:16,685][15372] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 11964416. Throughput: 0: 898.7. Samples: 2991494. Policy #0 lag: (min: 0.0, avg: 0.9, max: 4.0)
[2023-02-22 17:59:16,687][15372] Avg episode reward: [(0, '30.571')]
[2023-02-22 17:59:21,629][33578] Updated weights for policy 0, policy_version 5852 (0.0021)
[2023-02-22 17:59:21,684][15372] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 11984896. Throughput: 0: 921.2. Samples: 2996740. Policy #0 lag: (min: 1.0, avg: 2.3, max: 5.0)
[2023-02-22 17:59:21,687][15372] Avg episode reward: [(0, '32.563')]
[2023-02-22 17:59:25,953][33578] Updated weights for policy 0, policy_version 5862 (0.0022)
[2023-02-22 17:59:25,951][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005862_12005376.pth...
[2023-02-22 17:59:25,957][33564] Stopping Batcher_0...
[2023-02-22 17:59:25,965][33564] Loop batcher_evt_loop terminating...
[2023-02-22 17:59:25,957][15372] Component Batcher_0 stopped!
[2023-02-22 17:59:26,003][33580] Stopping RolloutWorker_w2...
[2023-02-22 17:59:26,003][15372] Component RolloutWorker_w2 stopped!
[2023-02-22 17:59:26,010][33580] Loop rollout_proc2_evt_loop terminating...
[2023-02-22 17:59:26,022][15372] Component RolloutWorker_w0 stopped!
[2023-02-22 17:59:26,022][33579] Stopping RolloutWorker_w0...
[2023-02-22 17:59:26,031][33585] Stopping RolloutWorker_w6...
[2023-02-22 17:59:26,029][33579] Loop rollout_proc0_evt_loop terminating...
[2023-02-22 17:59:26,031][15372] Component RolloutWorker_w6 stopped!
[2023-02-22 17:59:26,032][33585] Loop rollout_proc6_evt_loop terminating...
[2023-02-22 17:59:26,042][33584] Stopping RolloutWorker_w4...
[2023-02-22 17:59:26,043][33584] Loop rollout_proc4_evt_loop terminating...
[2023-02-22 17:59:26,042][15372] Component RolloutWorker_w4 stopped!
[2023-02-22 17:59:26,035][33578] Weights refcount: 2 0
[2023-02-22 17:59:26,073][15372] Component InferenceWorker_p0-w0 stopped!
[2023-02-22 17:59:26,080][33578] Stopping InferenceWorker_p0-w0...
[2023-02-22 17:59:26,081][33578] Loop inference_proc0-0_evt_loop terminating...
[2023-02-22 17:59:26,087][15372] Component RolloutWorker_w3 stopped!
[2023-02-22 17:59:26,092][33581] Stopping RolloutWorker_w3...
[2023-02-22 17:59:26,093][33581] Loop rollout_proc3_evt_loop terminating...
[2023-02-22 17:59:26,103][15372] Component RolloutWorker_w7 stopped!
[2023-02-22 17:59:26,110][33586] Stopping RolloutWorker_w7...
[2023-02-22 17:59:26,112][33583] Stopping RolloutWorker_w5...
[2023-02-22 17:59:26,113][33583] Loop rollout_proc5_evt_loop terminating...
[2023-02-22 17:59:26,111][15372] Component RolloutWorker_w5 stopped!
[2023-02-22 17:59:26,128][33582] Stopping RolloutWorker_w1...
[2023-02-22 17:59:26,129][33582] Loop rollout_proc1_evt_loop terminating...
[2023-02-22 17:59:26,128][15372] Component RolloutWorker_w1 stopped!
[2023-02-22 17:59:26,138][33586] Loop rollout_proc7_evt_loop terminating...
[2023-02-22 17:59:26,146][33564] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005460_11182080.pth
[2023-02-22 17:59:26,162][33564] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005862_12005376.pth...
[2023-02-22 17:59:26,328][15372] Component LearnerWorker_p0 stopped!
[2023-02-22 17:59:26,330][15372] Waiting for process learner_proc0 to stop...
[2023-02-22 17:59:26,334][33564] Stopping LearnerWorker_p0...
[2023-02-22 17:59:26,340][33564] Loop learner_proc0_evt_loop terminating...
[2023-02-22 17:59:28,108][15372] Waiting for process inference_proc0-0 to join...
[2023-02-22 17:59:28,565][15372] Waiting for process rollout_proc0 to join...
[2023-02-22 17:59:29,000][15372] Waiting for process rollout_proc1 to join...
[2023-02-22 17:59:29,001][15372] Waiting for process rollout_proc2 to join...
[2023-02-22 17:59:29,005][15372] Waiting for process rollout_proc3 to join...
[2023-02-22 17:59:29,009][15372] Waiting for process rollout_proc4 to join...
[2023-02-22 17:59:29,011][15372] Waiting for process rollout_proc5 to join...
[2023-02-22 17:59:29,013][15372] Waiting for process rollout_proc6 to join...
[2023-02-22 17:59:29,015][15372] Waiting for process rollout_proc7 to join...
[2023-02-22 17:59:29,017][15372] Batcher 0 profile tree view:
batching: 76.1614, releasing_batches: 0.0772
[2023-02-22 17:59:29,019][15372] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 1532.7945
update_model: 29.9913
  weight_update: 0.0022
one_step: 0.0215
  handle_policy_step: 1521.6423
    deserialize: 45.6907, stack: 9.2191, obs_to_device_normalize: 352.0744, forward: 708.1331, send_messages: 78.9841
    prepare_outputs: 251.3866
      to_cpu: 157.8451
[2023-02-22 17:59:29,021][15372] Learner 0 profile tree view:
misc: 0.0163, prepare_batch: 37.1625
train: 339.0023
  epoch_init: 0.0610, minibatch_init: 0.0452, losses_postprocess: 3.1372, kl_divergence: 3.2232, after_optimizer: 123.2586
  calculate_losses: 128.4199
    losses_init: 0.0290, forward_head: 9.2740, bptt_initial: 83.4207, tail: 5.8307, advantages_returns: 1.6743, losses: 13.6099
    bptt: 12.6453
      bptt_forward_core: 12.1317
  update: 77.9893
    clip: 8.3192
[2023-02-22 17:59:29,029][15372] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.8135, enqueue_policy_requests: 407.7317, env_step: 2436.1384, overhead: 58.7857, complete_rollouts: 21.4931
save_policy_outputs: 60.5609
  split_output_tensors: 29.3957
[2023-02-22 17:59:29,032][15372] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.9879, enqueue_policy_requests: 402.7956, env_step: 2451.4261, overhead: 59.0774, complete_rollouts: 19.7808
save_policy_outputs: 60.7501
  split_output_tensors: 29.8261
[2023-02-22 17:59:29,035][15372] Loop Runner_EvtLoop terminating...
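A minimal sketch of how to read the RolloutWorker_w0 profile entries above: summing what appear to be the top-level timers (treating indented entries such as split_output_tensors as subsets of their parents, which is an assumption about the tree layout) shows that environment stepping dominates the rollout worker's time.

```python
# Sketch: estimate the share of RolloutWorker_w0 wall time spent in env_step,
# using the timer values reported in the profile tree of this log.
# Assumption: these six entries are disjoint top-level timers; nested entries
# (e.g. split_output_tensors under save_policy_outputs) are already included
# in their parents and so are not added again.
timers = {
    "wait_for_trajectories": 0.8135,
    "enqueue_policy_requests": 407.7317,
    "env_step": 2436.1384,
    "overhead": 58.7857,
    "complete_rollouts": 21.4931,
    "save_policy_outputs": 60.5609,
}

total = sum(timers.values())          # ~2985.5 seconds
share = timers["env_step"] / total    # fraction spent stepping the env

print(f"env_step: {share:.1%} of {total:.1f}s")  # roughly 81.6%
```

Under that assumption, roughly four fifths of the rollout workers' time goes to stepping the Doom environments themselves, which is consistent with the CPU-bound rollout workers seen earlier in the log.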
[2023-02-22 17:59:29,038][15372] Runner profile tree view:
main_loop: 3225.0710
[2023-02-22 17:59:29,039][15372] Collected {0: 12005376}, FPS: 3722.5
[2023-02-22 17:59:29,158][15372] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-22 17:59:29,160][15372] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 17:59:29,163][15372] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 17:59:29,166][15372] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 17:59:29,168][15372] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 17:59:29,171][15372] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 17:59:29,173][15372] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 17:59:29,174][15372] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 17:59:29,175][15372] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-22 17:59:29,176][15372] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-22 17:59:29,178][15372] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 17:59:29,179][15372] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 17:59:29,181][15372] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 17:59:29,182][15372] Adding new argument 'enjoy_script'=None that is not in the saved config file!
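The runner's final summary is internally consistent: a quick sanity check (a minimal sketch using only the frame count and main_loop wall time reported in this log) reproduces the logged FPS figure.

```python
# Sanity-check the reported throughput: the runner logs
# main_loop: 3225.0710 (seconds) and Collected {0: 12005376} (frames),
# and reports FPS: 3722.5. Overall FPS is simply frames / wall time.
total_frames = 12_005_376
main_loop_seconds = 3225.0710

fps = total_frames / main_loop_seconds
print(round(fps, 1))  # 3722.5, matching the logged FPS
```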
[2023-02-22 17:59:29,184][15372] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-22 17:59:29,209][15372] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:59:29,212][15372] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 17:59:29,216][15372] RunningMeanStd input shape: (1,)
[2023-02-22 17:59:29,232][15372] ConvEncoder: input_channels=3
[2023-02-22 17:59:29,995][15372] Conv encoder output size: 512
[2023-02-22 17:59:29,998][15372] Policy head output size: 512
[2023-02-22 17:59:33,056][15372] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005862_12005376.pth...
[2023-02-22 17:59:34,341][15372] Num frames 100...
[2023-02-22 17:59:34,450][15372] Num frames 200...
[2023-02-22 17:59:34,562][15372] Num frames 300...
[2023-02-22 17:59:34,671][15372] Num frames 400...
[2023-02-22 17:59:34,792][15372] Num frames 500...
[2023-02-22 17:59:34,907][15372] Num frames 600...
[2023-02-22 17:59:35,018][15372] Num frames 700...
[2023-02-22 17:59:35,134][15372] Num frames 800...
[2023-02-22 17:59:35,248][15372] Num frames 900...
[2023-02-22 17:59:35,362][15372] Num frames 1000...
[2023-02-22 17:59:35,473][15372] Num frames 1100...
[2023-02-22 17:59:35,584][15372] Num frames 1200...
[2023-02-22 17:59:35,695][15372] Num frames 1300...
[2023-02-22 17:59:35,807][15372] Num frames 1400...
[2023-02-22 17:59:35,924][15372] Num frames 1500...
[2023-02-22 17:59:36,020][15372] Avg episode rewards: #0: 33.360, true rewards: #0: 15.360
[2023-02-22 17:59:36,022][15372] Avg episode reward: 33.360, avg true_objective: 15.360
[2023-02-22 17:59:36,094][15372] Num frames 1600...
[2023-02-22 17:59:36,222][15372] Num frames 1700...
[2023-02-22 17:59:36,338][15372] Num frames 1800...
[2023-02-22 17:59:36,446][15372] Num frames 1900...
[2023-02-22 17:59:36,566][15372] Num frames 2000...
[2023-02-22 17:59:36,680][15372] Num frames 2100...
[2023-02-22 17:59:36,795][15372] Num frames 2200...
[2023-02-22 17:59:36,916][15372] Num frames 2300...
[2023-02-22 17:59:37,026][15372] Num frames 2400...
[2023-02-22 17:59:37,138][15372] Num frames 2500...
[2023-02-22 17:59:37,261][15372] Num frames 2600...
[2023-02-22 17:59:37,377][15372] Num frames 2700...
[2023-02-22 17:59:37,489][15372] Num frames 2800...
[2023-02-22 17:59:37,610][15372] Num frames 2900...
[2023-02-22 17:59:37,723][15372] Num frames 3000...
[2023-02-22 17:59:37,837][15372] Num frames 3100...
[2023-02-22 17:59:37,948][15372] Num frames 3200...
[2023-02-22 17:59:38,065][15372] Num frames 3300...
[2023-02-22 17:59:38,184][15372] Num frames 3400...
[2023-02-22 17:59:38,305][15372] Num frames 3500...
[2023-02-22 17:59:38,418][15372] Num frames 3600...
[2023-02-22 17:59:38,514][15372] Avg episode rewards: #0: 43.680, true rewards: #0: 18.180
[2023-02-22 17:59:38,517][15372] Avg episode reward: 43.680, avg true_objective: 18.180
[2023-02-22 17:59:38,590][15372] Num frames 3700...
[2023-02-22 17:59:38,701][15372] Num frames 3800...
[2023-02-22 17:59:38,811][15372] Num frames 3900...
[2023-02-22 17:59:38,927][15372] Num frames 4000...
[2023-02-22 17:59:39,040][15372] Num frames 4100...
[2023-02-22 17:59:39,163][15372] Num frames 4200...
[2023-02-22 17:59:39,280][15372] Avg episode rewards: #0: 33.480, true rewards: #0: 14.147
[2023-02-22 17:59:39,282][15372] Avg episode reward: 33.480, avg true_objective: 14.147
[2023-02-22 17:59:39,350][15372] Num frames 4300...
[2023-02-22 17:59:39,464][15372] Num frames 4400...
[2023-02-22 17:59:39,580][15372] Num frames 4500...
[2023-02-22 17:59:39,690][15372] Num frames 4600...
[2023-02-22 17:59:39,815][15372] Num frames 4700...
[2023-02-22 17:59:39,937][15372] Num frames 4800...
[2023-02-22 17:59:40,049][15372] Num frames 4900...
[2023-02-22 17:59:40,165][15372] Num frames 5000...
[2023-02-22 17:59:40,281][15372] Num frames 5100...
[2023-02-22 17:59:40,397][15372] Num frames 5200...
[2023-02-22 17:59:40,509][15372] Num frames 5300...
[2023-02-22 17:59:40,624][15372] Num frames 5400...
[2023-02-22 17:59:40,738][15372] Num frames 5500...
[2023-02-22 17:59:40,855][15372] Num frames 5600...
[2023-02-22 17:59:40,965][15372] Num frames 5700...
[2023-02-22 17:59:41,126][15372] Avg episode rewards: #0: 35.475, true rewards: #0: 14.475
[2023-02-22 17:59:41,128][15372] Avg episode reward: 35.475, avg true_objective: 14.475
[2023-02-22 17:59:41,144][15372] Num frames 5800...
[2023-02-22 17:59:41,261][15372] Num frames 5900...
[2023-02-22 17:59:41,372][15372] Num frames 6000...
[2023-02-22 17:59:41,484][15372] Num frames 6100...
[2023-02-22 17:59:41,605][15372] Num frames 6200...
[2023-02-22 17:59:41,734][15372] Num frames 6300...
[2023-02-22 17:59:41,847][15372] Num frames 6400...
[2023-02-22 17:59:41,963][15372] Num frames 6500...
[2023-02-22 17:59:42,078][15372] Num frames 6600...
[2023-02-22 17:59:42,196][15372] Num frames 6700...
[2023-02-22 17:59:42,314][15372] Num frames 6800...
[2023-02-22 17:59:42,428][15372] Num frames 6900...
[2023-02-22 17:59:42,542][15372] Num frames 7000...
[2023-02-22 17:59:42,660][15372] Num frames 7100...
[2023-02-22 17:59:42,788][15372] Num frames 7200...
[2023-02-22 17:59:42,959][15372] Num frames 7300...
[2023-02-22 17:59:43,167][15372] Avg episode rewards: #0: 35.584, true rewards: #0: 14.784
[2023-02-22 17:59:43,170][15372] Avg episode reward: 35.584, avg true_objective: 14.784
[2023-02-22 17:59:43,192][15372] Num frames 7400...
[2023-02-22 17:59:43,352][15372] Num frames 7500...
[2023-02-22 17:59:43,504][15372] Num frames 7600...
[2023-02-22 17:59:43,653][15372] Num frames 7700...
[2023-02-22 17:59:43,816][15372] Num frames 7800...
[2023-02-22 17:59:43,979][15372] Num frames 7900...
[2023-02-22 17:59:44,134][15372] Num frames 8000...
[2023-02-22 17:59:44,293][15372] Num frames 8100...
[2023-02-22 17:59:44,455][15372] Num frames 8200...
[2023-02-22 17:59:44,606][15372] Num frames 8300...
[2023-02-22 17:59:44,761][15372] Num frames 8400...
[2023-02-22 17:59:44,900][15372] Avg episode rewards: #0: 33.580, true rewards: #0: 14.080
[2023-02-22 17:59:44,903][15372] Avg episode reward: 33.580, avg true_objective: 14.080
[2023-02-22 17:59:44,993][15372] Num frames 8500...
[2023-02-22 17:59:45,160][15372] Num frames 8600...
[2023-02-22 17:59:45,327][15372] Num frames 8700...
[2023-02-22 17:59:45,493][15372] Num frames 8800...
[2023-02-22 17:59:45,655][15372] Num frames 8900...
[2023-02-22 17:59:45,819][15372] Num frames 9000...
[2023-02-22 17:59:45,987][15372] Num frames 9100...
[2023-02-22 17:59:46,086][15372] Avg episode rewards: #0: 30.750, true rewards: #0: 13.036
[2023-02-22 17:59:46,089][15372] Avg episode reward: 30.750, avg true_objective: 13.036
[2023-02-22 17:59:46,219][15372] Num frames 9200...
[2023-02-22 17:59:46,375][15372] Num frames 9300...
[2023-02-22 17:59:46,501][15372] Num frames 9400...
[2023-02-22 17:59:46,615][15372] Num frames 9500...
[2023-02-22 17:59:46,734][15372] Num frames 9600...
[2023-02-22 17:59:46,861][15372] Num frames 9700...
[2023-02-22 17:59:46,981][15372] Num frames 9800...
[2023-02-22 17:59:47,094][15372] Num frames 9900...
[2023-02-22 17:59:47,217][15372] Avg episode rewards: #0: 29.321, true rewards: #0: 12.446
[2023-02-22 17:59:47,219][15372] Avg episode reward: 29.321, avg true_objective: 12.446
[2023-02-22 17:59:47,276][15372] Num frames 10000...
[2023-02-22 17:59:47,401][15372] Num frames 10100...
[2023-02-22 17:59:47,523][15372] Num frames 10200...
[2023-02-22 17:59:47,636][15372] Num frames 10300...
[2023-02-22 17:59:47,749][15372] Num frames 10400...
[2023-02-22 17:59:47,870][15372] Num frames 10500...
[2023-02-22 17:59:47,986][15372] Num frames 10600...
[2023-02-22 17:59:48,103][15372] Num frames 10700...
[2023-02-22 17:59:48,221][15372] Num frames 10800...
[2023-02-22 17:59:48,342][15372] Num frames 10900...
[2023-02-22 17:59:48,469][15372] Num frames 11000...
[2023-02-22 17:59:48,581][15372] Num frames 11100...
[2023-02-22 17:59:48,658][15372] Avg episode rewards: #0: 28.687, true rewards: #0: 12.353
[2023-02-22 17:59:48,660][15372] Avg episode reward: 28.687, avg true_objective: 12.353
[2023-02-22 17:59:48,758][15372] Num frames 11200...
[2023-02-22 17:59:48,873][15372] Num frames 11300...
[2023-02-22 17:59:48,987][15372] Num frames 11400...
[2023-02-22 17:59:49,103][15372] Num frames 11500...
[2023-02-22 17:59:49,228][15372] Num frames 11600...
[2023-02-22 17:59:49,320][15372] Avg episode rewards: #0: 26.830, true rewards: #0: 11.630
[2023-02-22 17:59:49,322][15372] Avg episode reward: 26.830, avg true_objective: 11.630
[2023-02-22 18:00:59,440][15372] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-22 18:11:29,176][15372] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-22 18:11:29,178][15372] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 18:11:29,180][15372] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 18:11:29,182][15372] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 18:11:29,184][15372] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 18:11:29,185][15372] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 18:11:29,188][15372] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-22 18:11:29,190][15372] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 18:11:29,191][15372] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-22 18:11:29,194][15372] Adding new argument 'hf_repository'='nikogarro/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-22 18:11:29,196][15372] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 18:11:29,197][15372] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 18:11:29,198][15372] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 18:11:29,199][15372] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-22 18:11:29,200][15372] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-22 18:11:29,230][15372] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 18:11:29,233][15372] RunningMeanStd input shape: (1,)
[2023-02-22 18:11:29,258][15372] ConvEncoder: input_channels=3
[2023-02-22 18:11:29,329][15372] Conv encoder output size: 512
[2023-02-22 18:11:29,332][15372] Policy head output size: 512
[2023-02-22 18:11:29,361][15372] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005862_12005376.pth...
[2023-02-22 18:11:30,031][15372] Num frames 100...
[2023-02-22 18:11:30,189][15372] Num frames 200...
[2023-02-22 18:11:30,339][15372] Num frames 300...
[2023-02-22 18:11:30,493][15372] Num frames 400...
[2023-02-22 18:11:30,655][15372] Num frames 500...
[2023-02-22 18:11:30,821][15372] Num frames 600...
[2023-02-22 18:11:30,988][15372] Num frames 700...
[2023-02-22 18:11:31,155][15372] Num frames 800...
[2023-02-22 18:11:31,317][15372] Num frames 900...
[2023-02-22 18:11:31,485][15372] Num frames 1000...
[2023-02-22 18:11:31,649][15372] Num frames 1100...
[2023-02-22 18:11:31,817][15372] Num frames 1200...
[2023-02-22 18:11:31,985][15372] Num frames 1300...
[2023-02-22 18:11:32,144][15372] Num frames 1400...
[2023-02-22 18:11:32,281][15372] Num frames 1500...
[2023-02-22 18:11:32,395][15372] Num frames 1600...
[2023-02-22 18:11:32,509][15372] Num frames 1700...
[2023-02-22 18:11:32,631][15372] Num frames 1800...
[2023-02-22 18:11:32,751][15372] Num frames 1900...
[2023-02-22 18:11:32,871][15372] Num frames 2000...
[2023-02-22 18:11:33,005][15372] Num frames 2100...
[2023-02-22 18:11:33,060][15372] Avg episode rewards: #0: 59.999, true rewards: #0: 21.000
[2023-02-22 18:11:33,062][15372] Avg episode reward: 59.999, avg true_objective: 21.000
[2023-02-22 18:11:33,184][15372] Num frames 2200...
[2023-02-22 18:11:33,299][15372] Num frames 2300...
[2023-02-22 18:11:33,416][15372] Num frames 2400...
[2023-02-22 18:11:33,533][15372] Num frames 2500...
[2023-02-22 18:11:33,650][15372] Num frames 2600...
[2023-02-22 18:11:33,782][15372] Num frames 2700...
[2023-02-22 18:11:33,905][15372] Num frames 2800...
[2023-02-22 18:11:34,020][15372] Num frames 2900...
[2023-02-22 18:11:34,136][15372] Num frames 3000...
[2023-02-22 18:11:34,251][15372] Num frames 3100...
[2023-02-22 18:11:34,367][15372] Num frames 3200...
[2023-02-22 18:11:34,488][15372] Num frames 3300...
[2023-02-22 18:11:34,610][15372] Num frames 3400...
[2023-02-22 18:11:34,729][15372] Num frames 3500...
[2023-02-22 18:11:34,846][15372] Num frames 3600...
[2023-02-22 18:11:34,967][15372] Num frames 3700...
[2023-02-22 18:11:35,129][15372] Avg episode rewards: #0: 49.454, true rewards: #0: 18.955
[2023-02-22 18:11:35,132][15372] Avg episode reward: 49.454, avg true_objective: 18.955
[2023-02-22 18:11:35,148][15372] Num frames 3800...
[2023-02-22 18:11:35,264][15372] Num frames 3900...
[2023-02-22 18:11:35,379][15372] Num frames 4000...
[2023-02-22 18:11:35,496][15372] Num frames 4100...
[2023-02-22 18:11:35,616][15372] Num frames 4200...
[2023-02-22 18:11:35,732][15372] Num frames 4300...
[2023-02-22 18:11:35,856][15372] Num frames 4400...
[2023-02-22 18:11:35,976][15372] Num frames 4500...
[2023-02-22 18:11:36,095][15372] Num frames 4600...
[2023-02-22 18:11:36,213][15372] Num frames 4700...
[2023-02-22 18:11:36,326][15372] Num frames 4800...
[2023-02-22 18:11:36,404][15372] Avg episode rewards: #0: 40.396, true rewards: #0: 16.063
[2023-02-22 18:11:36,406][15372] Avg episode reward: 40.396, avg true_objective: 16.063
[2023-02-22 18:11:36,501][15372] Num frames 4900...
[2023-02-22 18:11:36,617][15372] Num frames 5000...
[2023-02-22 18:11:36,730][15372] Num frames 5100...
[2023-02-22 18:11:36,853][15372] Num frames 5200...
[2023-02-22 18:11:36,978][15372] Num frames 5300...
[2023-02-22 18:11:37,097][15372] Num frames 5400...
[2023-02-22 18:11:37,222][15372] Num frames 5500...
[2023-02-22 18:11:37,339][15372] Num frames 5600...
[2023-02-22 18:11:37,461][15372] Num frames 5700...
[2023-02-22 18:11:37,586][15372] Num frames 5800...
[2023-02-22 18:11:37,705][15372] Num frames 5900...
[2023-02-22 18:11:37,828][15372] Num frames 6000...
[2023-02-22 18:11:37,950][15372] Num frames 6100...
[2023-02-22 18:11:38,074][15372] Num frames 6200...
[2023-02-22 18:11:38,236][15372] Num frames 6300...
[2023-02-22 18:11:38,436][15372] Avg episode rewards: #0: 40.722, true rewards: #0: 15.972
[2023-02-22 18:11:38,438][15372] Avg episode reward: 40.722, avg true_objective: 15.972
[2023-02-22 18:11:38,458][15372] Num frames 6400...
[2023-02-22 18:11:38,616][15372] Num frames 6500...
[2023-02-22 18:11:38,771][15372] Num frames 6600...
[2023-02-22 18:11:38,936][15372] Num frames 6700...
[2023-02-22 18:11:39,092][15372] Num frames 6800...
[2023-02-22 18:11:39,254][15372] Num frames 6900...
[2023-02-22 18:11:39,408][15372] Num frames 7000...
[2023-02-22 18:11:39,559][15372] Num frames 7100...
[2023-02-22 18:11:39,713][15372] Num frames 7200...
[2023-02-22 18:11:39,877][15372] Num frames 7300...
[2023-02-22 18:11:40,036][15372] Num frames 7400...
[2023-02-22 18:11:40,200][15372] Num frames 7500...
[2023-02-22 18:11:40,358][15372] Num frames 7600...
[2023-02-22 18:11:40,521][15372] Num frames 7700...
[2023-02-22 18:11:40,754][15372] Avg episode rewards: #0: 38.794, true rewards: #0: 15.594
[2023-02-22 18:11:40,756][15372] Avg episode reward: 38.794, avg true_objective: 15.594
[2023-02-22 18:11:40,763][15372] Num frames 7800...
[2023-02-22 18:11:40,947][15372] Num frames 7900...
[2023-02-22 18:11:41,109][15372] Num frames 8000...
[2023-02-22 18:11:41,271][15372] Num frames 8100...
[2023-02-22 18:11:41,432][15372] Num frames 8200...
[2023-02-22 18:11:41,596][15372] Num frames 8300...
[2023-02-22 18:11:41,742][15372] Num frames 8400...
[2023-02-22 18:11:41,860][15372] Num frames 8500...
[2023-02-22 18:11:41,986][15372] Num frames 8600...
[2023-02-22 18:11:42,103][15372] Num frames 8700...
[2023-02-22 18:11:42,219][15372] Num frames 8800...
[2023-02-22 18:11:42,334][15372] Num frames 8900...
[2023-02-22 18:11:42,453][15372] Num frames 9000...
[2023-02-22 18:11:42,575][15372] Num frames 9100...
[2023-02-22 18:11:42,688][15372] Num frames 9200...
[2023-02-22 18:11:42,829][15372] Avg episode rewards: #0: 39.790, true rewards: #0: 15.457
[2023-02-22 18:11:42,831][15372] Avg episode reward: 39.790, avg true_objective: 15.457
[2023-02-22 18:11:42,866][15372] Num frames 9300...
[2023-02-22 18:11:42,986][15372] Num frames 9400...
[2023-02-22 18:11:43,103][15372] Num frames 9500...
[2023-02-22 18:11:43,217][15372] Num frames 9600...
[2023-02-22 18:11:43,329][15372] Num frames 9700...
[2023-02-22 18:11:43,444][15372] Num frames 9800...
[2023-02-22 18:11:43,559][15372] Num frames 9900...
[2023-02-22 18:11:43,690][15372] Num frames 10000...
[2023-02-22 18:11:43,808][15372] Num frames 10100...
[2023-02-22 18:11:43,925][15372] Num frames 10200...
[2023-02-22 18:11:44,047][15372] Num frames 10300...
[2023-02-22 18:11:44,166][15372] Num frames 10400...
[2023-02-22 18:11:44,284][15372] Num frames 10500...
[2023-02-22 18:11:44,397][15372] Num frames 10600...
[2023-02-22 18:11:44,513][15372] Num frames 10700...
[2023-02-22 18:11:44,625][15372] Avg episode rewards: #0: 38.494, true rewards: #0: 15.351
[2023-02-22 18:11:44,627][15372] Avg episode reward: 38.494, avg true_objective: 15.351
[2023-02-22 18:11:44,692][15372] Num frames 10800...
[2023-02-22 18:11:44,808][15372] Num frames 10900...
[2023-02-22 18:11:44,927][15372] Num frames 11000...
[2023-02-22 18:11:45,046][15372] Num frames 11100...
[2023-02-22 18:11:45,162][15372] Num frames 11200...
[2023-02-22 18:11:45,276][15372] Num frames 11300...
[2023-02-22 18:11:45,389][15372] Num frames 11400...
[2023-02-22 18:11:45,505][15372] Num frames 11500...
[2023-02-22 18:11:45,583][15372] Avg episode rewards: #0: 35.399, true rewards: #0: 14.399
[2023-02-22 18:11:45,586][15372] Avg episode reward: 35.399, avg true_objective: 14.399
[2023-02-22 18:11:45,682][15372] Num frames 11600...
[2023-02-22 18:11:45,801][15372] Num frames 11700...
[2023-02-22 18:11:45,921][15372] Num frames 11800...
[2023-02-22 18:11:46,044][15372] Num frames 11900...
[2023-02-22 18:11:46,167][15372] Num frames 12000...
[2023-02-22 18:11:46,284][15372] Num frames 12100...
[2023-02-22 18:11:46,397][15372] Num frames 12200...
[2023-02-22 18:11:46,511][15372] Num frames 12300...
[2023-02-22 18:11:46,626][15372] Num frames 12400...
[2023-02-22 18:11:46,743][15372] Num frames 12500...
[2023-02-22 18:11:46,857][15372] Num frames 12600...
[2023-02-22 18:11:46,970][15372] Num frames 12700...
[2023-02-22 18:11:47,090][15372] Num frames 12800...
[2023-02-22 18:11:47,205][15372] Num frames 12900...
[2023-02-22 18:11:47,324][15372] Num frames 13000...
[2023-02-22 18:11:47,438][15372] Num frames 13100...
[2023-02-22 18:11:47,553][15372] Num frames 13200...
[2023-02-22 18:11:47,669][15372] Num frames 13300...
[2023-02-22 18:11:47,782][15372] Num frames 13400...
[2023-02-22 18:11:47,898][15372] Num frames 13500...
[2023-02-22 18:11:48,020][15372] Num frames 13600...
[2023-02-22 18:11:48,105][15372] Avg episode rewards: #0: 37.910, true rewards: #0: 15.132
[2023-02-22 18:11:48,106][15372] Avg episode reward: 37.910, avg true_objective: 15.132
[2023-02-22 18:11:48,202][15372] Num frames 13700...
[2023-02-22 18:11:48,333][15372] Num frames 13800...
[2023-02-22 18:11:48,455][15372] Num frames 13900...
[2023-02-22 18:11:48,571][15372] Num frames 14000...
[2023-02-22 18:11:48,683][15372] Num frames 14100...
[2023-02-22 18:11:48,803][15372] Num frames 14200...
[2023-02-22 18:11:48,965][15372] Avg episode rewards: #0: 35.191, true rewards: #0: 14.291
[2023-02-22 18:11:48,967][15372] Avg episode reward: 35.191, avg true_objective: 14.291
[2023-02-22 18:13:12,318][15372] Replay video saved to /content/train_dir/default_experiment/replay.mp4!