[2023-02-23 10:01:14,889][07928] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 10:01:14,892][07928] Rollout worker 0 uses device cpu
[2023-02-23 10:01:14,894][07928] Rollout worker 1 uses device cpu
[2023-02-23 10:01:14,896][07928] Rollout worker 2 uses device cpu
[2023-02-23 10:01:14,897][07928] Rollout worker 3 uses device cpu
[2023-02-23 10:01:14,899][07928] Rollout worker 4 uses device cpu
[2023-02-23 10:01:14,900][07928] Rollout worker 5 uses device cpu
[2023-02-23 10:01:14,902][07928] Rollout worker 6 uses device cpu
[2023-02-23 10:01:14,904][07928] Rollout worker 7 uses device cpu
[2023-02-23 10:01:15,008][07928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 10:01:15,010][07928] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 10:01:15,041][07928] Starting all processes...
[2023-02-23 10:01:15,043][07928] Starting process learner_proc0
[2023-02-23 10:01:15,098][07928] Starting all processes...
[2023-02-23 10:01:15,106][07928] Starting process inference_proc0-0
[2023-02-23 10:01:15,107][07928] Starting process rollout_proc0
[2023-02-23 10:01:15,108][07928] Starting process rollout_proc1
[2023-02-23 10:01:15,110][07928] Starting process rollout_proc2
[2023-02-23 10:01:15,111][07928] Starting process rollout_proc3
[2023-02-23 10:01:15,114][07928] Starting process rollout_proc4
[2023-02-23 10:01:15,121][07928] Starting process rollout_proc5
[2023-02-23 10:01:15,124][07928] Starting process rollout_proc6
[2023-02-23 10:01:15,125][07928] Starting process rollout_proc7
[2023-02-23 10:01:17,133][12605] Worker 4 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,157][12588] Worker 0 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,338][12608] Worker 5 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,425][12572] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 10:01:17,426][12572] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 10:01:17,504][12606] Worker 7 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,518][12586] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 10:01:17,518][12586] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 10:01:17,522][12587] Worker 1 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,525][12572] Num visible devices: 1
[2023-02-23 10:01:17,531][12586] Num visible devices: 1
[2023-02-23 10:01:17,561][12572] Starting seed is not provided
[2023-02-23 10:01:17,562][12572] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 10:01:17,562][12572] Initializing actor-critic model on device cuda:0
[2023-02-23 10:01:17,562][12572] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 10:01:17,564][12572] RunningMeanStd input shape: (1,)
[2023-02-23 10:01:17,569][12589] Worker 2 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,579][12572] ConvEncoder: input_channels=3
[2023-02-23 10:01:17,594][12607] Worker 6 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,596][12590] Worker 3 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2023-02-23 10:01:17,842][12572] Conv encoder output size: 512
[2023-02-23 10:01:17,842][12572] Policy head output size: 512
[2023-02-23 10:01:17,891][12572] Created Actor Critic model with architecture:
[2023-02-23 10:01:17,891][12572] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
[2023-02-23 10:01:24,796][12572] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 10:01:24,797][12572] No checkpoints found
[2023-02-23 10:01:24,798][12572] Did not load from checkpoint, starting from scratch!
[2023-02-23 10:01:24,798][12572] Initialized policy 0 weights for model version 0
[2023-02-23 10:01:24,801][12572] LearnerWorker_p0 finished initialization!
[2023-02-23 10:01:24,801][12572] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 10:01:24,910][12586] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 10:01:24,911][12586] RunningMeanStd input shape: (1,)
[2023-02-23 10:01:24,926][12586] ConvEncoder: input_channels=3
[2023-02-23 10:01:25,036][12586] Conv encoder output size: 512
[2023-02-23 10:01:25,036][12586] Policy head output size: 512
[2023-02-23 10:01:25,316][07928] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 10:01:27,796][07928] Inference worker 0-0 is ready!
[2023-02-23 10:01:27,798][07928] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 10:01:27,818][12587] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,818][12590] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,823][12589] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,825][12608] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,825][12607] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,825][12588] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,825][12606] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,825][12605] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:01:27,886][12606] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-23 10:01:27,886][12605] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-23 10:01:27,887][12587] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-23 10:01:27,887][12589] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-23 10:01:27,887][12606] EvtLoop [rollout_proc7_evt_loop, process=rollout_proc7] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
slot_callable(*args)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
env_runner.init(self.timing)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
self._reset()
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
self._ensure_initialized()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
self.initialize()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
self._game_init()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-23 10:01:27,887][12605] EvtLoop [rollout_proc4_evt_loop, process=rollout_proc4] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
slot_callable(*args)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
env_runner.init(self.timing)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
self._reset()
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
self._ensure_initialized()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
self.initialize()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
self._game_init()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-23 10:01:27,889][12606] Unhandled exception in evt loop rollout_proc7_evt_loop
[2023-02-23 10:01:27,887][12587] EvtLoop [rollout_proc1_evt_loop, process=rollout_proc1] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
slot_callable(*args)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
env_runner.init(self.timing)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
self._reset()
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
self._ensure_initialized()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
self.initialize()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
self._game_init()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-23 10:01:27,889][12605] Unhandled exception in evt loop rollout_proc4_evt_loop
[2023-02-23 10:01:27,889][12587] Unhandled exception in evt loop rollout_proc1_evt_loop
[2023-02-23 10:01:27,888][12589] EvtLoop [rollout_proc2_evt_loop, process=rollout_proc2] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
slot_callable(*args)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
env_runner.init(self.timing)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
self._reset()
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
obs, info = self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
return self.env.reset(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
self._ensure_initialized()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
self.initialize()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
self._game_init()
File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-23 10:01:27,889][12589] Unhandled exception in evt loop rollout_proc2_evt_loop
[2023-02-23 10:01:28,189][12588] Decorrelating experience for 0 frames...
[2023-02-23 10:01:28,216][12607] Decorrelating experience for 0 frames...
[2023-02-23 10:01:28,228][12590] Decorrelating experience for 0 frames...
[2023-02-23 10:01:28,429][12588] Decorrelating experience for 32 frames...
[2023-02-23 10:01:28,469][12590] Decorrelating experience for 32 frames...
[2023-02-23 10:01:28,495][12608] Decorrelating experience for 0 frames...
[2023-02-23 10:01:28,722][12588] Decorrelating experience for 64 frames...
[2023-02-23 10:01:28,764][12590] Decorrelating experience for 64 frames...
[2023-02-23 10:01:28,765][12607] Decorrelating experience for 32 frames...
[2023-02-23 10:01:28,989][12608] Decorrelating experience for 32 frames...
[2023-02-23 10:01:29,010][12588] Decorrelating experience for 96 frames...
[2023-02-23 10:01:29,249][12590] Decorrelating experience for 96 frames...
[2023-02-23 10:01:29,276][12608] Decorrelating experience for 64 frames...
[2023-02-23 10:01:29,524][12607] Decorrelating experience for 64 frames...
[2023-02-23 10:01:29,554][12608] Decorrelating experience for 96 frames...
[2023-02-23 10:01:29,799][12607] Decorrelating experience for 96 frames...
[2023-02-23 10:01:30,316][07928] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 10:01:33,445][12572] Signal inference workers to stop experience collection...
[2023-02-23 10:01:33,450][12586] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 10:01:35,000][07928] Heartbeat connected on Batcher_0
[2023-02-23 10:01:35,008][07928] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 10:01:35,016][07928] Heartbeat connected on RolloutWorker_w0
[2023-02-23 10:01:35,027][07928] Heartbeat connected on RolloutWorker_w3
[2023-02-23 10:01:35,033][07928] Heartbeat connected on RolloutWorker_w5
[2023-02-23 10:01:35,037][07928] Heartbeat connected on RolloutWorker_w6
[2023-02-23 10:01:35,316][07928] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 227.4. Samples: 2274. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 10:01:35,318][07928] Avg episode reward: [(0, '3.162')]
[2023-02-23 10:01:36,532][12572] Signal inference workers to resume experience collection...
[2023-02-23 10:01:36,533][12586] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 10:01:37,513][07928] Heartbeat connected on LearnerWorker_p0
[2023-02-23 10:01:39,988][12586] Updated weights for policy 0, policy_version 10 (0.0371)
[2023-02-23 10:01:40,316][07928] Fps is (10 sec: 4096.0, 60 sec: 2730.7, 300 sec: 2730.7). Total num frames: 40960. Throughput: 0: 653.2. Samples: 9798. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 10:01:40,319][07928] Avg episode reward: [(0, '4.392')]
[2023-02-23 10:01:43,461][12586] Updated weights for policy 0, policy_version 20 (0.0009)
[2023-02-23 10:01:45,316][07928] Fps is (10 sec: 10240.0, 60 sec: 5120.0, 300 sec: 5120.0). Total num frames: 102400. Throughput: 0: 939.1. Samples: 18782. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:01:45,319][07928] Avg episode reward: [(0, '4.444')]
[2023-02-23 10:01:45,321][12572] Saving new best policy, reward=4.444!
[2023-02-23 10:01:46,885][12586] Updated weights for policy 0, policy_version 30 (0.0010)
[2023-02-23 10:01:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 6389.8, 300 sec: 6389.8). Total num frames: 159744. Throughput: 0: 1467.9. Samples: 36698. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:01:50,318][07928] Avg episode reward: [(0, '4.416')]
[2023-02-23 10:01:50,418][12586] Updated weights for policy 0, policy_version 40 (0.0010)
[2023-02-23 10:01:53,940][12586] Updated weights for policy 0, policy_version 50 (0.0010)
[2023-02-23 10:01:55,316][07928] Fps is (10 sec: 11878.5, 60 sec: 7372.8, 300 sec: 7372.8). Total num frames: 221184. Throughput: 0: 1805.7. Samples: 54170. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:01:55,318][07928] Avg episode reward: [(0, '4.526')]
[2023-02-23 10:01:55,320][12572] Saving new best policy, reward=4.526!
[2023-02-23 10:01:57,287][12586] Updated weights for policy 0, policy_version 60 (0.0010)
[2023-02-23 10:02:00,316][07928] Fps is (10 sec: 11878.4, 60 sec: 7957.9, 300 sec: 7957.9). Total num frames: 278528. Throughput: 0: 1806.2. Samples: 63218. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:02:00,319][07928] Avg episode reward: [(0, '4.694')]
[2023-02-23 10:02:00,326][12572] Saving new best policy, reward=4.694!
[2023-02-23 10:02:00,757][12586] Updated weights for policy 0, policy_version 70 (0.0009)
[2023-02-23 10:02:04,246][12586] Updated weights for policy 0, policy_version 80 (0.0009)
[2023-02-23 10:02:05,316][07928] Fps is (10 sec: 11878.3, 60 sec: 8499.2, 300 sec: 8499.2). Total num frames: 339968. Throughput: 0: 2020.4. Samples: 80818. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 10:02:05,319][07928] Avg episode reward: [(0, '4.675')]
[2023-02-23 10:02:07,736][12586] Updated weights for policy 0, policy_version 90 (0.0010)
[2023-02-23 10:02:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 8829.1, 300 sec: 8829.1). Total num frames: 397312. Throughput: 0: 2191.0. Samples: 98596. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:02:10,318][07928] Avg episode reward: [(0, '4.678')]
[2023-02-23 10:02:11,161][12586] Updated weights for policy 0, policy_version 100 (0.0010)
[2023-02-23 10:02:14,622][12586] Updated weights for policy 0, policy_version 110 (0.0009)
[2023-02-23 10:02:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 9175.0, 300 sec: 9175.0). Total num frames: 458752. Throughput: 0: 2391.4. Samples: 107612. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:15,319][07928] Avg episode reward: [(0, '4.539')]
[2023-02-23 10:02:18,086][12586] Updated weights for policy 0, policy_version 120 (0.0009)
[2023-02-23 10:02:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 9383.6, 300 sec: 9383.6). Total num frames: 516096. Throughput: 0: 2732.6. Samples: 125242. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:20,319][07928] Avg episode reward: [(0, '4.457')]
[2023-02-23 10:02:21,619][12586] Updated weights for policy 0, policy_version 130 (0.0011)
[2023-02-23 10:02:25,015][12586] Updated weights for policy 0, policy_version 140 (0.0010)
[2023-02-23 10:02:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 9557.3, 300 sec: 9557.3). Total num frames: 573440. Throughput: 0: 2961.1. Samples: 143046. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:25,318][07928] Avg episode reward: [(0, '4.964')]
[2023-02-23 10:02:25,320][12572] Saving new best policy, reward=4.964!
[2023-02-23 10:02:28,403][12586] Updated weights for policy 0, policy_version 150 (0.0009)
[2023-02-23 10:02:30,316][07928] Fps is (10 sec: 11878.4, 60 sec: 10581.3, 300 sec: 9767.4). Total num frames: 634880. Throughput: 0: 2960.2. Samples: 151990. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:30,318][07928] Avg episode reward: [(0, '5.044')]
[2023-02-23 10:02:30,326][12572] Saving new best policy, reward=5.044!
[2023-02-23 10:02:31,868][12586] Updated weights for policy 0, policy_version 160 (0.0010)
[2023-02-23 10:02:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11537.1, 300 sec: 9888.9). Total num frames: 692224. Throughput: 0: 2955.5. Samples: 169696. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:35,319][07928] Avg episode reward: [(0, '5.287')]
[2023-02-23 10:02:35,322][12572] Saving new best policy, reward=5.287!
[2023-02-23 10:02:35,422][12586] Updated weights for policy 0, policy_version 170 (0.0010)
[2023-02-23 10:02:38,725][12586] Updated weights for policy 0, policy_version 180 (0.0010)
[2023-02-23 10:02:40,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11878.4, 300 sec: 10048.8). Total num frames: 753664. Throughput: 0: 2970.3. Samples: 187836. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:02:40,318][07928] Avg episode reward: [(0, '4.893')]
[2023-02-23 10:02:42,075][12586] Updated weights for policy 0, policy_version 190 (0.0010)
[2023-02-23 10:02:45,316][07928] Fps is (10 sec: 12288.0, 60 sec: 11878.4, 300 sec: 10188.8). Total num frames: 815104. Throughput: 0: 2970.4. Samples: 196886. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:02:45,319][07928] Avg episode reward: [(0, '5.249')]
[2023-02-23 10:02:45,510][12586] Updated weights for policy 0, policy_version 200 (0.0011)
[2023-02-23 10:02:49,010][12586] Updated weights for policy 0, policy_version 210 (0.0011)
[2023-02-23 10:02:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11878.4, 300 sec: 10264.1). Total num frames: 872448. Throughput: 0: 2972.1. Samples: 214562. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:02:50,318][07928] Avg episode reward: [(0, '5.614')]
[2023-02-23 10:02:50,333][12572] Saving new best policy, reward=5.614!
[2023-02-23 10:02:52,397][12586] Updated weights for policy 0, policy_version 220 (0.0009)
[2023-02-23 10:02:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11878.4, 300 sec: 10376.5). Total num frames: 933888. Throughput: 0: 2977.8. Samples: 232598. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:02:55,319][07928] Avg episode reward: [(0, '5.575')]
[2023-02-23 10:02:55,886][12586] Updated weights for policy 0, policy_version 230 (0.0009)
[2023-02-23 10:02:59,279][12586] Updated weights for policy 0, policy_version 240 (0.0010)
[2023-02-23 10:03:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11878.4, 300 sec: 10434.0). Total num frames: 991232. Throughput: 0: 2978.0. Samples: 241624. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:00,319][07928] Avg episode reward: [(0, '6.060')]
[2023-02-23 10:03:00,327][12572] Saving new best policy, reward=6.060!
[2023-02-23 10:03:02,895][12586] Updated weights for policy 0, policy_version 250 (0.0011)
[2023-02-23 10:03:05,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11878.4, 300 sec: 10526.7). Total num frames: 1052672. Throughput: 0: 2974.2. Samples: 259082. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:05,318][07928] Avg episode reward: [(0, '6.367')]
[2023-02-23 10:03:05,321][12572] Saving new best policy, reward=6.367!
[2023-02-23 10:03:06,334][12586] Updated weights for policy 0, policy_version 260 (0.0009)
[2023-02-23 10:03:09,754][12586] Updated weights for policy 0, policy_version 270 (0.0009)
[2023-02-23 10:03:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 10571.6). Total num frames: 1110016. Throughput: 0: 2976.6. Samples: 276992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:03:10,319][07928] Avg episode reward: [(0, '5.979')]
[2023-02-23 10:03:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000271_1110016.pth...
[2023-02-23 10:03:13,173][12586] Updated weights for policy 0, policy_version 280 (0.0010)
[2023-02-23 10:03:15,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11810.1, 300 sec: 10612.4). Total num frames: 1167360. Throughput: 0: 2976.8. Samples: 285944. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:15,320][07928] Avg episode reward: [(0, '6.954')]
[2023-02-23 10:03:15,343][12572] Saving new best policy, reward=6.954!
[2023-02-23 10:03:16,759][12586] Updated weights for policy 0, policy_version 290 (0.0010)
[2023-02-23 10:03:20,195][12586] Updated weights for policy 0, policy_version 300 (0.0010)
[2023-02-23 10:03:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 10685.2). Total num frames: 1228800. Throughput: 0: 2969.2. Samples: 303310. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:20,319][07928] Avg episode reward: [(0, '7.500')]
[2023-02-23 10:03:20,326][12572] Saving new best policy, reward=7.500!
[2023-02-23 10:03:23,581][12586] Updated weights for policy 0, policy_version 310 (0.0009)
[2023-02-23 10:03:25,316][07928] Fps is (10 sec: 12288.1, 60 sec: 11946.7, 300 sec: 10752.0). Total num frames: 1290240. Throughput: 0: 2969.2. Samples: 321448. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:25,318][07928] Avg episode reward: [(0, '8.413')]
[2023-02-23 10:03:25,321][12572] Saving new best policy, reward=8.413!
[2023-02-23 10:03:26,976][12586] Updated weights for policy 0, policy_version 320 (0.0009)
[2023-02-23 10:03:30,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11878.4, 300 sec: 10780.7). Total num frames: 1347584. Throughput: 0: 2966.9. Samples: 330398. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:30,318][07928] Avg episode reward: [(0, '9.060')]
[2023-02-23 10:03:30,327][12572] Saving new best policy, reward=9.060!
[2023-02-23 10:03:30,505][12586] Updated weights for policy 0, policy_version 330 (0.0011)
[2023-02-23 10:03:33,984][12586] Updated weights for policy 0, policy_version 340 (0.0010)
[2023-02-23 10:03:35,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11878.4, 300 sec: 10807.1). Total num frames: 1404928. Throughput: 0: 2963.5. Samples: 347922. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:35,319][07928] Avg episode reward: [(0, '10.180')]
[2023-02-23 10:03:35,321][12572] Saving new best policy, reward=10.180!
[2023-02-23 10:03:37,401][12586] Updated weights for policy 0, policy_version 350 (0.0011)
[2023-02-23 10:03:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 10862.0). Total num frames: 1466368. Throughput: 0: 2955.1. Samples: 365576. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:40,319][07928] Avg episode reward: [(0, '8.840')]
[2023-02-23 10:03:40,974][12586] Updated weights for policy 0, policy_version 360 (0.0009)
[2023-02-23 10:03:44,668][12586] Updated weights for policy 0, policy_version 370 (0.0010)
[2023-02-23 10:03:45,316][07928] Fps is (10 sec: 11469.1, 60 sec: 11741.9, 300 sec: 10854.4). Total num frames: 1519616. Throughput: 0: 2942.1. Samples: 374016. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:45,318][07928] Avg episode reward: [(0, '7.377')]
[2023-02-23 10:03:48,226][12586] Updated weights for policy 0, policy_version 380 (0.0010)
[2023-02-23 10:03:50,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11741.9, 300 sec: 10875.6). Total num frames: 1576960. Throughput: 0: 2934.8. Samples: 391148. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:50,319][07928] Avg episode reward: [(0, '8.717')]
[2023-02-23 10:03:51,725][12586] Updated weights for policy 0, policy_version 390 (0.0010)
[2023-02-23 10:03:55,161][12586] Updated weights for policy 0, policy_version 400 (0.0009)
[2023-02-23 10:03:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 10922.7). Total num frames: 1638400. Throughput: 0: 2931.5. Samples: 408908. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:03:55,319][07928] Avg episode reward: [(0, '10.533')]
[2023-02-23 10:03:55,322][12572] Saving new best policy, reward=10.533!
[2023-02-23 10:03:58,690][12586] Updated weights for policy 0, policy_version 410 (0.0011)
[2023-02-23 10:04:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 10940.3). Total num frames: 1695744. Throughput: 0: 2926.7. Samples: 417644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:04:00,319][07928] Avg episode reward: [(0, '12.205')]
[2023-02-23 10:04:00,327][12572] Saving new best policy, reward=12.205!
[2023-02-23 10:04:02,165][12586] Updated weights for policy 0, policy_version 420 (0.0010)
[2023-02-23 10:04:05,316][07928] Fps is (10 sec: 11878.2, 60 sec: 11741.8, 300 sec: 10982.4). Total num frames: 1757184. Throughput: 0: 2934.4. Samples: 435358. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:05,319][07928] Avg episode reward: [(0, '15.359')]
[2023-02-23 10:04:05,321][12572] Saving new best policy, reward=15.359!
[2023-02-23 10:04:05,589][12586] Updated weights for policy 0, policy_version 430 (0.0009)
[2023-02-23 10:04:08,980][12586] Updated weights for policy 0, policy_version 440 (0.0010)
[2023-02-23 10:04:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 10997.1). Total num frames: 1814528. Throughput: 0: 2931.1. Samples: 453346. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:04:10,318][07928] Avg episode reward: [(0, '15.015')]
[2023-02-23 10:04:12,555][12586] Updated weights for policy 0, policy_version 450 (0.0011)
[2023-02-23 10:04:15,316][07928] Fps is (10 sec: 11469.0, 60 sec: 11741.9, 300 sec: 11011.0). Total num frames: 1871872. Throughput: 0: 2919.7. Samples: 461786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:04:15,318][07928] Avg episode reward: [(0, '14.508')]
[2023-02-23 10:04:16,132][12586] Updated weights for policy 0, policy_version 460 (0.0010)
[2023-02-23 10:04:19,579][12586] Updated weights for policy 0, policy_version 470 (0.0009)
[2023-02-23 10:04:20,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11047.5). Total num frames: 1933312. Throughput: 0: 2918.5. Samples: 479256. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:20,318][07928] Avg episode reward: [(0, '14.599')]
[2023-02-23 10:04:23,050][12586] Updated weights for policy 0, policy_version 480 (0.0010)
[2023-02-23 10:04:25,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11059.2). Total num frames: 1990656. Throughput: 0: 2919.7. Samples: 496962. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:25,318][07928] Avg episode reward: [(0, '15.783')]
[2023-02-23 10:04:25,320][12572] Saving new best policy, reward=15.783!
[2023-02-23 10:04:26,564][12586] Updated weights for policy 0, policy_version 490 (0.0009)
[2023-02-23 10:04:30,113][12586] Updated weights for policy 0, policy_version 500 (0.0010)
[2023-02-23 10:04:30,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11673.6, 300 sec: 11070.3). Total num frames: 2048000. Throughput: 0: 2923.8. Samples: 505588. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:04:30,319][07928] Avg episode reward: [(0, '18.673')]
[2023-02-23 10:04:30,327][12572] Saving new best policy, reward=18.673!
[2023-02-23 10:04:33,503][12586] Updated weights for policy 0, policy_version 510 (0.0009)
[2023-02-23 10:04:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11102.3). Total num frames: 2109440. Throughput: 0: 2939.1. Samples: 523406. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:35,318][07928] Avg episode reward: [(0, '19.466')]
[2023-02-23 10:04:35,320][12572] Saving new best policy, reward=19.466!
[2023-02-23 10:04:36,939][12586] Updated weights for policy 0, policy_version 520 (0.0010)
[2023-02-23 10:04:40,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11673.6, 300 sec: 11111.7). Total num frames: 2166784. Throughput: 0: 2940.5. Samples: 541230. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:40,319][07928] Avg episode reward: [(0, '19.047')]
[2023-02-23 10:04:40,422][12586] Updated weights for policy 0, policy_version 530 (0.0010)
[2023-02-23 10:04:43,939][12586] Updated weights for policy 0, policy_version 540 (0.0010)
[2023-02-23 10:04:45,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11120.6). Total num frames: 2224128. Throughput: 0: 2937.0. Samples: 549808. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:04:45,318][07928] Avg episode reward: [(0, '19.725')]
[2023-02-23 10:04:45,320][12572] Saving new best policy, reward=19.725!
[2023-02-23 10:04:47,371][12586] Updated weights for policy 0, policy_version 550 (0.0010)
[2023-02-23 10:04:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.1, 300 sec: 11149.1). Total num frames: 2285568. Throughput: 0: 2942.5. Samples: 567772. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:04:50,318][07928] Avg episode reward: [(0, '19.666')]
[2023-02-23 10:04:50,817][12586] Updated weights for policy 0, policy_version 560 (0.0010)
[2023-02-23 10:04:54,339][12586] Updated weights for policy 0, policy_version 570 (0.0010)
[2023-02-23 10:04:55,316][07928] Fps is (10 sec: 11878.2, 60 sec: 11741.8, 300 sec: 11156.7). Total num frames: 2342912. Throughput: 0: 2931.6. Samples: 585266. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:04:55,319][07928] Avg episode reward: [(0, '19.913')]
[2023-02-23 10:04:55,321][12572] Saving new best policy, reward=19.913!
[2023-02-23 10:04:57,864][12586] Updated weights for policy 0, policy_version 580 (0.0010)
[2023-02-23 10:05:00,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11183.0). Total num frames: 2404352. Throughput: 0: 2938.0. Samples: 593998. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:05:00,319][07928] Avg episode reward: [(0, '20.096')]
[2023-02-23 10:05:00,327][12572] Saving new best policy, reward=20.096!
[2023-02-23 10:05:01,312][12586] Updated weights for policy 0, policy_version 590 (0.0010)
[2023-02-23 10:05:04,776][12586] Updated weights for policy 0, policy_version 600 (0.0010)
[2023-02-23 10:05:05,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11189.5). Total num frames: 2461696. Throughput: 0: 2945.6. Samples: 611806. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:05,318][07928] Avg episode reward: [(0, '20.172')]
[2023-02-23 10:05:05,322][12572] Saving new best policy, reward=20.172!
[2023-02-23 10:05:08,337][12586] Updated weights for policy 0, policy_version 610 (0.0011)
[2023-02-23 10:05:10,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11195.7). Total num frames: 2519040. Throughput: 0: 2938.3. Samples: 629184. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:10,319][07928] Avg episode reward: [(0, '19.201')]
[2023-02-23 10:05:10,328][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000615_2519040.pth...
[2023-02-23 10:05:11,846][12586] Updated weights for policy 0, policy_version 620 (0.0010)
[2023-02-23 10:05:15,248][12586] Updated weights for policy 0, policy_version 630 (0.0009)
[2023-02-23 10:05:15,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11219.5). Total num frames: 2580480. Throughput: 0: 2945.6. Samples: 638140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:05:15,319][07928] Avg episode reward: [(0, '17.916')]
[2023-02-23 10:05:18,598][12586] Updated weights for policy 0, policy_version 640 (0.0010)
[2023-02-23 10:05:20,316][07928] Fps is (10 sec: 12288.0, 60 sec: 11810.1, 300 sec: 11242.2). Total num frames: 2641920. Throughput: 0: 2953.6. Samples: 656318. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:20,318][07928] Avg episode reward: [(0, '20.862')]
[2023-02-23 10:05:20,328][12572] Saving new best policy, reward=20.862!
[2023-02-23 10:05:22,066][12586] Updated weights for policy 0, policy_version 650 (0.0010)
[2023-02-23 10:05:25,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11246.9). Total num frames: 2699264. Throughput: 0: 2943.2. Samples: 673672. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:25,319][07928] Avg episode reward: [(0, '20.226')]
[2023-02-23 10:05:25,643][12586] Updated weights for policy 0, policy_version 660 (0.0010)
[2023-02-23 10:05:29,060][12586] Updated weights for policy 0, policy_version 670 (0.0009)
[2023-02-23 10:05:30,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11810.1, 300 sec: 11251.5). Total num frames: 2756608. Throughput: 0: 2949.3. Samples: 682528. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:05:30,318][07928] Avg episode reward: [(0, '20.335')]
[2023-02-23 10:05:32,571][12586] Updated weights for policy 0, policy_version 680 (0.0010)
[2023-02-23 10:05:35,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11741.8, 300 sec: 11255.8). Total num frames: 2813952. Throughput: 0: 2944.2. Samples: 700260. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:35,319][07928] Avg episode reward: [(0, '20.945')]
[2023-02-23 10:05:35,322][12572] Saving new best policy, reward=20.945!
[2023-02-23 10:05:36,104][12586] Updated weights for policy 0, policy_version 690 (0.0009)
[2023-02-23 10:05:39,672][12586] Updated weights for policy 0, policy_version 700 (0.0011)
[2023-02-23 10:05:40,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11260.0). Total num frames: 2871296. Throughput: 0: 2938.4. Samples: 717494. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:40,319][07928] Avg episode reward: [(0, '21.987')]
[2023-02-23 10:05:40,346][12572] Saving new best policy, reward=21.987!
[2023-02-23 10:05:43,136][12586] Updated weights for policy 0, policy_version 710 (0.0010)
[2023-02-23 10:05:45,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11810.1, 300 sec: 11279.8). Total num frames: 2932736. Throughput: 0: 2942.5. Samples: 726412. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:05:45,318][07928] Avg episode reward: [(0, '24.042')]
[2023-02-23 10:05:45,320][12572] Saving new best policy, reward=24.042!
[2023-02-23 10:05:46,578][12586] Updated weights for policy 0, policy_version 720 (0.0010)
[2023-02-23 10:05:50,007][12586] Updated weights for policy 0, policy_version 730 (0.0009)
[2023-02-23 10:05:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.8, 300 sec: 11283.3). Total num frames: 2990080. Throughput: 0: 2944.2. Samples: 744296. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:05:50,319][07928] Avg episode reward: [(0, '19.464')]
[2023-02-23 10:05:53,557][12586] Updated weights for policy 0, policy_version 740 (0.0010)
[2023-02-23 10:05:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11301.9). Total num frames: 3051520. Throughput: 0: 2943.6. Samples: 761646. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:05:55,319][07928] Avg episode reward: [(0, '19.617')]
[2023-02-23 10:05:57,015][12586] Updated weights for policy 0, policy_version 750 (0.0009)
[2023-02-23 10:06:00,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.8, 300 sec: 11305.0). Total num frames: 3108864. Throughput: 0: 2942.1. Samples: 770534. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:06:00,319][07928] Avg episode reward: [(0, '22.151')]
[2023-02-23 10:06:00,508][12586] Updated weights for policy 0, policy_version 760 (0.0010)
[2023-02-23 10:06:03,961][12586] Updated weights for policy 0, policy_version 770 (0.0011)
[2023-02-23 10:06:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11307.9). Total num frames: 3166208. Throughput: 0: 2930.9. Samples: 788208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:06:05,318][07928] Avg episode reward: [(0, '20.739')]
[2023-02-23 10:06:07,594][12586] Updated weights for policy 0, policy_version 780 (0.0010)
[2023-02-23 10:06:10,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.8, 300 sec: 11310.7). Total num frames: 3223552. Throughput: 0: 2927.8. Samples: 805422. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:06:10,319][07928] Avg episode reward: [(0, '21.233')]
[2023-02-23 10:06:11,106][12586] Updated weights for policy 0, policy_version 790 (0.0019)
[2023-02-23 10:06:14,540][12586] Updated weights for policy 0, policy_version 800 (0.0010)
[2023-02-23 10:06:15,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11327.6). Total num frames: 3284992. Throughput: 0: 2929.5. Samples: 814354. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:06:15,319][07928] Avg episode reward: [(0, '22.765')]
[2023-02-23 10:06:17,943][12586] Updated weights for policy 0, policy_version 810 (0.0010)
[2023-02-23 10:06:20,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11673.6, 300 sec: 11330.0). Total num frames: 3342336. Throughput: 0: 2931.4. Samples: 832174. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:06:20,318][07928] Avg episode reward: [(0, '23.397')]
[2023-02-23 10:06:21,504][12586] Updated weights for policy 0, policy_version 820 (0.0010)
[2023-02-23 10:06:24,949][12586] Updated weights for policy 0, policy_version 830 (0.0009)
[2023-02-23 10:06:25,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11538.2). Total num frames: 3403776. Throughput: 0: 2941.4. Samples: 849858. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:06:25,319][07928] Avg episode reward: [(0, '25.012')]
[2023-02-23 10:06:25,321][12572] Saving new best policy, reward=25.012!
[2023-02-23 10:06:28,397][12586] Updated weights for policy 0, policy_version 840 (0.0010)
[2023-02-23 10:06:30,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 3461120. Throughput: 0: 2940.3. Samples: 858728. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:06:30,318][07928] Avg episode reward: [(0, '27.276')]
[2023-02-23 10:06:30,327][12572] Saving new best policy, reward=27.276!
[2023-02-23 10:06:31,843][12586] Updated weights for policy 0, policy_version 850 (0.0009)
[2023-02-23 10:06:35,317][07928] Fps is (10 sec: 11468.3, 60 sec: 11741.8, 300 sec: 11788.1). Total num frames: 3518464. Throughput: 0: 2933.6. Samples: 876310. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:06:35,318][07928] Avg episode reward: [(0, '27.989')]
[2023-02-23 10:06:35,320][12572] Saving new best policy, reward=27.989!
[2023-02-23 10:06:35,429][12586] Updated weights for policy 0, policy_version 860 (0.0011)
[2023-02-23 10:06:38,844][12586] Updated weights for policy 0, policy_version 870 (0.0009)
[2023-02-23 10:06:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11788.1). Total num frames: 3579904. Throughput: 0: 2940.6. Samples: 893972. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:06:40,319][07928] Avg episode reward: [(0, '24.301')]
[2023-02-23 10:06:42,297][12586] Updated weights for policy 0, policy_version 880 (0.0010)
[2023-02-23 10:06:45,316][07928] Fps is (10 sec: 11878.9, 60 sec: 11741.9, 300 sec: 11788.1). Total num frames: 3637248. Throughput: 0: 2940.5. Samples: 902858. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:06:45,319][07928] Avg episode reward: [(0, '24.364')]
[2023-02-23 10:06:45,870][12586] Updated weights for policy 0, policy_version 890 (0.0010)
[2023-02-23 10:06:49,432][12586] Updated weights for policy 0, policy_version 900 (0.0009)
[2023-02-23 10:06:50,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11774.3). Total num frames: 3694592. Throughput: 0: 2933.5. Samples: 920216. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:06:50,318][07928] Avg episode reward: [(0, '23.977')]
[2023-02-23 10:06:52,845][12586] Updated weights for policy 0, policy_version 910 (0.0009)
[2023-02-23 10:06:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11788.1). Total num frames: 3756032. Throughput: 0: 2949.1. Samples: 938132. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:06:55,319][07928] Avg episode reward: [(0, '23.055')]
[2023-02-23 10:06:56,270][12586] Updated weights for policy 0, policy_version 920 (0.0009)
[2023-02-23 10:06:59,663][12586] Updated weights for policy 0, policy_version 930 (0.0009)
[2023-02-23 10:07:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11774.3). Total num frames: 3813376. Throughput: 0: 2949.1. Samples: 947062. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:07:00,320][07928] Avg episode reward: [(0, '22.458')]
[2023-02-23 10:07:03,263][12586] Updated weights for policy 0, policy_version 940 (0.0011)
[2023-02-23 10:07:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11774.3). Total num frames: 3870720. Throughput: 0: 2939.6. Samples: 964458. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:05,318][07928] Avg episode reward: [(0, '22.598')]
[2023-02-23 10:07:06,732][12586] Updated weights for policy 0, policy_version 950 (0.0011)
[2023-02-23 10:07:10,158][12586] Updated weights for policy 0, policy_version 960 (0.0010)
[2023-02-23 10:07:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.2, 300 sec: 11774.3). Total num frames: 3932160. Throughput: 0: 2944.3. Samples: 982350. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:07:10,319][07928] Avg episode reward: [(0, '25.482')]
[2023-02-23 10:07:10,328][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000960_3932160.pth...
[2023-02-23 10:07:10,390][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000271_1110016.pth
[2023-02-23 10:07:13,589][12586] Updated weights for policy 0, policy_version 970 (0.0010)
[2023-02-23 10:07:15,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11774.3). Total num frames: 3989504. Throughput: 0: 2942.8. Samples: 991152. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:15,318][07928] Avg episode reward: [(0, '26.571')]
[2023-02-23 10:07:17,197][12586] Updated weights for policy 0, policy_version 980 (0.0010)
[2023-02-23 10:07:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11788.1). Total num frames: 4050944. Throughput: 0: 2940.2. Samples: 1008618. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:07:20,318][07928] Avg episode reward: [(0, '26.161')]
[2023-02-23 10:07:20,657][12586] Updated weights for policy 0, policy_version 990 (0.0009)
[2023-02-23 10:07:24,028][12586] Updated weights for policy 0, policy_version 1000 (0.0009)
[2023-02-23 10:07:25,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.8, 300 sec: 11774.3). Total num frames: 4108288. Throughput: 0: 2944.1. Samples: 1026456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:07:25,319][07928] Avg episode reward: [(0, '25.728')]
[2023-02-23 10:07:27,516][12586] Updated weights for policy 0, policy_version 1010 (0.0008)
[2023-02-23 10:07:30,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11774.3). Total num frames: 4165632. Throughput: 0: 2943.6. Samples: 1035322. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:07:30,319][07928] Avg episode reward: [(0, '26.442')]
[2023-02-23 10:07:31,081][12586] Updated weights for policy 0, policy_version 1020 (0.0011)
[2023-02-23 10:07:34,584][12586] Updated weights for policy 0, policy_version 1030 (0.0010)
[2023-02-23 10:07:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.2, 300 sec: 11774.3). Total num frames: 4227072. Throughput: 0: 2944.8. Samples: 1052734. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:07:35,318][07928] Avg episode reward: [(0, '25.407')]
[2023-02-23 10:07:38,056][12586] Updated weights for policy 0, policy_version 1040 (0.0011)
[2023-02-23 10:07:40,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 4284416. Throughput: 0: 2941.1. Samples: 1070482. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:40,319][07928] Avg episode reward: [(0, '26.590')]
[2023-02-23 10:07:41,526][12586] Updated weights for policy 0, policy_version 1050 (0.0010)
[2023-02-23 10:07:45,079][12586] Updated weights for policy 0, policy_version 1060 (0.0010)
[2023-02-23 10:07:45,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 4341760. Throughput: 0: 2939.6. Samples: 1079344. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:45,318][07928] Avg episode reward: [(0, '27.716')]
[2023-02-23 10:07:48,524][12586] Updated weights for policy 0, policy_version 1070 (0.0010)
[2023-02-23 10:07:50,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11760.4). Total num frames: 4403200. Throughput: 0: 2943.1. Samples: 1096896. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:50,319][07928] Avg episode reward: [(0, '26.096')]
[2023-02-23 10:07:51,934][12586] Updated weights for policy 0, policy_version 1080 (0.0009)
[2023-02-23 10:07:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 4460544. Throughput: 0: 2944.0. Samples: 1114832. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:07:55,319][07928] Avg episode reward: [(0, '26.028')]
[2023-02-23 10:07:55,356][12586] Updated weights for policy 0, policy_version 1090 (0.0009)
[2023-02-23 10:07:58,938][12586] Updated weights for policy 0, policy_version 1100 (0.0010)
[2023-02-23 10:08:00,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4517888. Throughput: 0: 2941.7. Samples: 1123528. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:00,319][07928] Avg episode reward: [(0, '28.459')]
[2023-02-23 10:08:00,327][12572] Saving new best policy, reward=28.459!
[2023-02-23 10:08:02,490][12586] Updated weights for policy 0, policy_version 1110 (0.0011)
[2023-02-23 10:08:05,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.1, 300 sec: 11760.4). Total num frames: 4579328. Throughput: 0: 2940.1. Samples: 1140924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:08:05,318][07928] Avg episode reward: [(0, '29.227')]
[2023-02-23 10:08:05,321][12572] Saving new best policy, reward=29.227!
[2023-02-23 10:08:05,893][12586] Updated weights for policy 0, policy_version 1120 (0.0010)
[2023-02-23 10:08:09,322][12586] Updated weights for policy 0, policy_version 1130 (0.0009)
[2023-02-23 10:08:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 4636672. Throughput: 0: 2938.9. Samples: 1158708. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:10,319][07928] Avg episode reward: [(0, '25.131')]
[2023-02-23 10:08:12,873][12586] Updated weights for policy 0, policy_version 1140 (0.0010)
[2023-02-23 10:08:15,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4694016. Throughput: 0: 2933.7. Samples: 1167338. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:15,319][07928] Avg episode reward: [(0, '24.979')]
[2023-02-23 10:08:16,372][12586] Updated weights for policy 0, policy_version 1150 (0.0010)
[2023-02-23 10:08:19,813][12586] Updated weights for policy 0, policy_version 1160 (0.0009)
[2023-02-23 10:08:20,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4755456. Throughput: 0: 2942.8. Samples: 1185158. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:08:20,318][07928] Avg episode reward: [(0, '24.569')]
[2023-02-23 10:08:23,211][12586] Updated weights for policy 0, policy_version 1170 (0.0010)
[2023-02-23 10:08:25,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4812800. Throughput: 0: 2947.1. Samples: 1203102. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:08:25,319][07928] Avg episode reward: [(0, '23.322')]
[2023-02-23 10:08:26,811][12586] Updated weights for policy 0, policy_version 1180 (0.0009)
[2023-02-23 10:08:30,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4870144. Throughput: 0: 2936.0. Samples: 1211462. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:08:30,319][07928] Avg episode reward: [(0, '23.754')]
[2023-02-23 10:08:30,340][12586] Updated weights for policy 0, policy_version 1190 (0.0009)
[2023-02-23 10:08:33,807][12586] Updated weights for policy 0, policy_version 1200 (0.0010)
[2023-02-23 10:08:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 4931584. Throughput: 0: 2939.4. Samples: 1229168. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:08:35,318][07928] Avg episode reward: [(0, '27.204')]
[2023-02-23 10:08:37,270][12586] Updated weights for policy 0, policy_version 1210 (0.0010)
[2023-02-23 10:08:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 4988928. Throughput: 0: 2929.0. Samples: 1246636. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:08:40,318][07928] Avg episode reward: [(0, '28.629')]
[2023-02-23 10:08:40,861][12586] Updated weights for policy 0, policy_version 1220 (0.0010)
[2023-02-23 10:08:44,373][12586] Updated weights for policy 0, policy_version 1230 (0.0010)
[2023-02-23 10:08:45,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11760.4). Total num frames: 5046272. Throughput: 0: 2922.8. Samples: 1255052. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:45,319][07928] Avg episode reward: [(0, '25.272')]
[2023-02-23 10:08:47,903][12586] Updated weights for policy 0, policy_version 1240 (0.0010)
[2023-02-23 10:08:50,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11673.6, 300 sec: 11746.5). Total num frames: 5103616. Throughput: 0: 2922.3. Samples: 1272430. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:50,320][07928] Avg episode reward: [(0, '24.644')]
[2023-02-23 10:08:51,569][12586] Updated weights for policy 0, policy_version 1250 (0.0010)
[2023-02-23 10:08:55,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11605.3, 300 sec: 11732.6). Total num frames: 5156864. Throughput: 0: 2894.8. Samples: 1288974. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:08:55,319][07928] Avg episode reward: [(0, '25.320')]
[2023-02-23 10:08:55,345][12586] Updated weights for policy 0, policy_version 1260 (0.0011)
[2023-02-23 10:08:59,052][12586] Updated weights for policy 0, policy_version 1270 (0.0010)
[2023-02-23 10:09:00,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 5214208. Throughput: 0: 2885.8. Samples: 1297198. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:00,319][07928] Avg episode reward: [(0, '23.456')]
[2023-02-23 10:09:02,657][12586] Updated weights for policy 0, policy_version 1280 (0.0009)
[2023-02-23 10:09:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11718.7). Total num frames: 5271552. Throughput: 0: 2873.4. Samples: 1314460. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:05,319][07928] Avg episode reward: [(0, '23.322')]
[2023-02-23 10:09:06,136][12586] Updated weights for policy 0, policy_version 1290 (0.0009)
[2023-02-23 10:09:09,778][12586] Updated weights for policy 0, policy_version 1300 (0.0010)
[2023-02-23 10:09:10,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11718.7). Total num frames: 5328896. Throughput: 0: 2854.5. Samples: 1331554. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:10,319][07928] Avg episode reward: [(0, '26.863')]
[2023-02-23 10:09:10,329][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001301_5328896.pth...
[2023-02-23 10:09:10,393][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000615_2519040.pth
[2023-02-23 10:09:13,315][12586] Updated weights for policy 0, policy_version 1310 (0.0010)
[2023-02-23 10:09:15,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11704.8). Total num frames: 5386240. Throughput: 0: 2862.4. Samples: 1340272. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:15,319][07928] Avg episode reward: [(0, '28.845')]
[2023-02-23 10:09:16,808][12586] Updated weights for policy 0, policy_version 1320 (0.0010)
[2023-02-23 10:09:20,247][12586] Updated weights for policy 0, policy_version 1330 (0.0010)
[2023-02-23 10:09:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11537.1, 300 sec: 11718.7). Total num frames: 5447680. Throughput: 0: 2861.2. Samples: 1357920. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:20,319][07928] Avg episode reward: [(0, '28.959')]
[2023-02-23 10:09:23,772][12586] Updated weights for policy 0, policy_version 1340 (0.0010)
[2023-02-23 10:09:25,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11537.0, 300 sec: 11718.7). Total num frames: 5505024. Throughput: 0: 2862.4. Samples: 1375442. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:25,319][07928] Avg episode reward: [(0, '28.191')]
[2023-02-23 10:09:27,230][12586] Updated weights for policy 0, policy_version 1350 (0.0010)
[2023-02-23 10:09:30,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11704.8). Total num frames: 5562368. Throughput: 0: 2873.4. Samples: 1384356. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:30,318][07928] Avg episode reward: [(0, '25.934')]
[2023-02-23 10:09:30,727][12586] Updated weights for policy 0, policy_version 1360 (0.0010)
[2023-02-23 10:09:34,145][12586] Updated weights for policy 0, policy_version 1370 (0.0010)
[2023-02-23 10:09:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11537.1, 300 sec: 11718.7). Total num frames: 5623808. Throughput: 0: 2882.5. Samples: 1402144. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:35,318][07928] Avg episode reward: [(0, '25.730')]
[2023-02-23 10:09:37,796][12586] Updated weights for policy 0, policy_version 1380 (0.0010)
[2023-02-23 10:09:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11537.1, 300 sec: 11718.7). Total num frames: 5681152. Throughput: 0: 2892.9. Samples: 1419156. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:40,319][07928] Avg episode reward: [(0, '26.536')]
[2023-02-23 10:09:41,352][12586] Updated weights for policy 0, policy_version 1390 (0.0010)
[2023-02-23 10:09:44,816][12586] Updated weights for policy 0, policy_version 1400 (0.0009)
[2023-02-23 10:09:45,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11537.1, 300 sec: 11704.8). Total num frames: 5738496. Throughput: 0: 2907.1. Samples: 1428016. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:45,318][07928] Avg episode reward: [(0, '26.147')]
[2023-02-23 10:09:48,218][12586] Updated weights for policy 0, policy_version 1410 (0.0010)
[2023-02-23 10:09:50,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11537.1, 300 sec: 11704.8). Total num frames: 5795840. Throughput: 0: 2923.6. Samples: 1446022. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:50,318][07928] Avg episode reward: [(0, '26.277')]
[2023-02-23 10:09:51,759][12586] Updated weights for policy 0, policy_version 1420 (0.0009)
[2023-02-23 10:09:55,179][12586] Updated weights for policy 0, policy_version 1430 (0.0009)
[2023-02-23 10:09:55,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 5857280. Throughput: 0: 2932.6. Samples: 1463522. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:09:55,319][07928] Avg episode reward: [(0, '27.436')]
[2023-02-23 10:09:58,617][12586] Updated weights for policy 0, policy_version 1440 (0.0010)
[2023-02-23 10:10:00,316][07928] Fps is (10 sec: 12288.1, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 5918720. Throughput: 0: 2936.9. Samples: 1472432. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:00,318][07928] Avg episode reward: [(0, '27.312')]
[2023-02-23 10:10:02,027][12586] Updated weights for policy 0, policy_version 1450 (0.0010)
[2023-02-23 10:10:05,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 5976064. Throughput: 0: 2940.4. Samples: 1490238. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:05,319][07928] Avg episode reward: [(0, '25.902')]
[2023-02-23 10:10:05,605][12586] Updated weights for policy 0, policy_version 1460 (0.0010)
[2023-02-23 10:10:09,152][12586] Updated weights for policy 0, policy_version 1470 (0.0010)
[2023-02-23 10:10:10,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6033408. Throughput: 0: 2937.2. Samples: 1507614. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:10:10,319][07928] Avg episode reward: [(0, '24.792')]
[2023-02-23 10:10:12,588][12586] Updated weights for policy 0, policy_version 1480 (0.0011)
[2023-02-23 10:10:15,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 6090752. Throughput: 0: 2934.4. Samples: 1516406. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:10:15,319][07928] Avg episode reward: [(0, '25.819')]
[2023-02-23 10:10:16,129][12586] Updated weights for policy 0, policy_version 1490 (0.0010)
[2023-02-23 10:10:19,690][12586] Updated weights for policy 0, policy_version 1500 (0.0011)
[2023-02-23 10:10:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 6148096. Throughput: 0: 2927.5. Samples: 1533882. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:10:20,319][07928] Avg episode reward: [(0, '27.924')]
[2023-02-23 10:10:23,215][12586] Updated weights for policy 0, policy_version 1510 (0.0011)
[2023-02-23 10:10:25,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6209536. Throughput: 0: 2935.0. Samples: 1551232. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:10:25,319][07928] Avg episode reward: [(0, '28.339')]
[2023-02-23 10:10:26,632][12586] Updated weights for policy 0, policy_version 1520 (0.0010)
[2023-02-23 10:10:30,073][12586] Updated weights for policy 0, policy_version 1530 (0.0010)
[2023-02-23 10:10:30,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 6266880. Throughput: 0: 2937.6. Samples: 1560208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:30,319][07928] Avg episode reward: [(0, '25.792')]
[2023-02-23 10:10:33,577][12586] Updated weights for policy 0, policy_version 1540 (0.0010)
[2023-02-23 10:10:35,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 6324224. Throughput: 0: 2927.6. Samples: 1577762. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:35,319][07928] Avg episode reward: [(0, '24.709')]
[2023-02-23 10:10:37,101][12586] Updated weights for policy 0, policy_version 1550 (0.0011)
[2023-02-23 10:10:40,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6385664. Throughput: 0: 2936.7. Samples: 1595672. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:10:40,319][07928] Avg episode reward: [(0, '26.531')]
[2023-02-23 10:10:40,519][12586] Updated weights for policy 0, policy_version 1560 (0.0010)
[2023-02-23 10:10:43,949][12586] Updated weights for policy 0, policy_version 1570 (0.0009)
[2023-02-23 10:10:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6443008. Throughput: 0: 2936.7. Samples: 1604584. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:10:45,319][07928] Avg episode reward: [(0, '29.442')]
[2023-02-23 10:10:45,326][12572] Saving new best policy, reward=29.442!
[2023-02-23 10:10:47,485][12586] Updated weights for policy 0, policy_version 1580 (0.0010)
[2023-02-23 10:10:50,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 6500352. Throughput: 0: 2930.5. Samples: 1622112. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:50,318][07928] Avg episode reward: [(0, '31.268')]
[2023-02-23 10:10:50,338][12572] Saving new best policy, reward=31.268!
[2023-02-23 10:10:51,035][12586] Updated weights for policy 0, policy_version 1590 (0.0010)
[2023-02-23 10:10:54,440][12586] Updated weights for policy 0, policy_version 1600 (0.0010)
[2023-02-23 10:10:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6561792. Throughput: 0: 2934.8. Samples: 1639682. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:10:55,319][07928] Avg episode reward: [(0, '27.090')]
[2023-02-23 10:10:57,928][12586] Updated weights for policy 0, policy_version 1610 (0.0010)
[2023-02-23 10:11:00,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 6619136. Throughput: 0: 2934.9. Samples: 1648476. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:00,318][07928] Avg episode reward: [(0, '26.959')]
[2023-02-23 10:11:01,480][12586] Updated weights for policy 0, policy_version 1620 (0.0010)
[2023-02-23 10:11:05,006][12586] Updated weights for policy 0, policy_version 1630 (0.0010)
[2023-02-23 10:11:05,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 6676480. Throughput: 0: 2931.0. Samples: 1665776. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:05,319][07928] Avg episode reward: [(0, '25.257')]
[2023-02-23 10:11:08,466][12586] Updated weights for policy 0, policy_version 1640 (0.0010)
[2023-02-23 10:11:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6737920. Throughput: 0: 2939.2. Samples: 1683498. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:10,318][07928] Avg episode reward: [(0, '24.753')]
[2023-02-23 10:11:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001645_6737920.pth...
[2023-02-23 10:11:10,328][07928] Components not started: RolloutWorker_w1, RolloutWorker_w2, RolloutWorker_w4, RolloutWorker_w7, wait_time=600.0 seconds
[2023-02-23 10:11:10,381][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000960_3932160.pth
[2023-02-23 10:11:11,945][12586] Updated weights for policy 0, policy_version 1650 (0.0010)
[2023-02-23 10:11:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6795264. Throughput: 0: 2937.5. Samples: 1692394. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:15,318][07928] Avg episode reward: [(0, '27.484')]
[2023-02-23 10:11:15,508][12586] Updated weights for policy 0, policy_version 1660 (0.0010)
[2023-02-23 10:11:19,063][12586] Updated weights for policy 0, policy_version 1670 (0.0010)
[2023-02-23 10:11:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 6852608. Throughput: 0: 2930.5. Samples: 1709634. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:20,319][07928] Avg episode reward: [(0, '26.860')]
[2023-02-23 10:11:22,489][12586] Updated weights for policy 0, policy_version 1680 (0.0010)
[2023-02-23 10:11:25,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 6914048. Throughput: 0: 2928.6. Samples: 1727458. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:25,319][07928] Avg episode reward: [(0, '23.902')]
[2023-02-23 10:11:25,952][12586] Updated weights for policy 0, policy_version 1690 (0.0009)
[2023-02-23 10:11:29,441][12586] Updated weights for policy 0, policy_version 1700 (0.0009)
[2023-02-23 10:11:30,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11704.9). Total num frames: 6971392. Throughput: 0: 2926.5. Samples: 1736278. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 10:11:30,319][07928] Avg episode reward: [(0, '24.960')]
[2023-02-23 10:11:33,071][12586] Updated weights for policy 0, policy_version 1710 (0.0010)
[2023-02-23 10:11:35,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7028736. Throughput: 0: 2917.7. Samples: 1753406. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:11:35,319][07928] Avg episode reward: [(0, '26.163')]
[2023-02-23 10:11:36,579][12586] Updated weights for policy 0, policy_version 1720 (0.0010)
[2023-02-23 10:11:39,974][12586] Updated weights for policy 0, policy_version 1730 (0.0010)
[2023-02-23 10:11:40,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 7086080. Throughput: 0: 2923.9. Samples: 1771256. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:11:40,319][07928] Avg episode reward: [(0, '24.602')]
[2023-02-23 10:11:43,523][12586] Updated weights for policy 0, policy_version 1740 (0.0009)
[2023-02-23 10:11:45,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 7143424. Throughput: 0: 2925.6. Samples: 1780126. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:11:45,319][07928] Avg episode reward: [(0, '26.056')]
[2023-02-23 10:11:47,083][12586] Updated weights for policy 0, policy_version 1750 (0.0010)
[2023-02-23 10:11:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7204864. Throughput: 0: 2930.2. Samples: 1797636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:11:50,318][07928] Avg episode reward: [(0, '27.538')]
[2023-02-23 10:11:50,431][12586] Updated weights for policy 0, policy_version 1760 (0.0009)
[2023-02-23 10:11:53,869][12586] Updated weights for policy 0, policy_version 1770 (0.0009)
[2023-02-23 10:11:55,316][07928] Fps is (10 sec: 12288.1, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 7266304. Throughput: 0: 2937.5. Samples: 1815684. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:11:55,319][07928] Avg episode reward: [(0, '25.179')]
[2023-02-23 10:11:57,418][12586] Updated weights for policy 0, policy_version 1780 (0.0010)
[2023-02-23 10:12:00,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 7319552. Throughput: 0: 2929.3. Samples: 1824212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:12:00,320][07928] Avg episode reward: [(0, '27.563')]
[2023-02-23 10:12:00,999][12586] Updated weights for policy 0, policy_version 1790 (0.0010)
[2023-02-23 10:12:04,429][12586] Updated weights for policy 0, policy_version 1800 (0.0010)
[2023-02-23 10:12:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7380992. Throughput: 0: 2935.5. Samples: 1841732. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:05,318][07928] Avg episode reward: [(0, '28.453')]
[2023-02-23 10:12:07,861][12586] Updated weights for policy 0, policy_version 1810 (0.0010)
[2023-02-23 10:12:10,316][07928] Fps is (10 sec: 12288.0, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 7442432. Throughput: 0: 2934.9. Samples: 1859528. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:10,319][07928] Avg episode reward: [(0, '26.893')]
[2023-02-23 10:12:11,322][12586] Updated weights for policy 0, policy_version 1820 (0.0010)
[2023-02-23 10:12:14,942][12586] Updated weights for policy 0, policy_version 1830 (0.0010)
[2023-02-23 10:12:15,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7499776. Throughput: 0: 2929.3. Samples: 1868096. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:12:15,318][07928] Avg episode reward: [(0, '27.101')]
[2023-02-23 10:12:18,383][12586] Updated weights for policy 0, policy_version 1840 (0.0011)
[2023-02-23 10:12:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7557120. Throughput: 0: 2938.9. Samples: 1885658. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:12:20,319][07928] Avg episode reward: [(0, '28.484')]
[2023-02-23 10:12:21,861][12586] Updated weights for policy 0, policy_version 1850 (0.0010)
[2023-02-23 10:12:25,315][12586] Updated weights for policy 0, policy_version 1860 (0.0010)
[2023-02-23 10:12:25,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 7618560. Throughput: 0: 2936.4. Samples: 1903396. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:25,318][07928] Avg episode reward: [(0, '26.328')]
[2023-02-23 10:12:28,920][12586] Updated weights for policy 0, policy_version 1870 (0.0011)
[2023-02-23 10:12:30,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11673.6, 300 sec: 11677.1). Total num frames: 7671808. Throughput: 0: 2929.5. Samples: 1911952. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:12:30,318][07928] Avg episode reward: [(0, '26.245')]
[2023-02-23 10:12:32,439][12586] Updated weights for policy 0, policy_version 1880 (0.0009)
[2023-02-23 10:12:35,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7733248. Throughput: 0: 2930.7. Samples: 1929516. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:35,319][07928] Avg episode reward: [(0, '29.782')]
[2023-02-23 10:12:35,859][12586] Updated weights for policy 0, policy_version 1890 (0.0009)
[2023-02-23 10:12:39,307][12586] Updated weights for policy 0, policy_version 1900 (0.0010)
[2023-02-23 10:12:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 7790592. Throughput: 0: 2923.9. Samples: 1947260. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:12:40,319][07928] Avg episode reward: [(0, '29.671')]
[2023-02-23 10:12:42,901][12586] Updated weights for policy 0, policy_version 1910 (0.0010)
[2023-02-23 10:12:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.2, 300 sec: 11691.0). Total num frames: 7852032. Throughput: 0: 2924.4. Samples: 1955810. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:45,319][07928] Avg episode reward: [(0, '27.291')]
[2023-02-23 10:12:46,375][12586] Updated weights for policy 0, policy_version 1920 (0.0010)
[2023-02-23 10:12:49,834][12586] Updated weights for policy 0, policy_version 1930 (0.0010)
[2023-02-23 10:12:50,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11691.0). Total num frames: 7909376. Throughput: 0: 2930.5. Samples: 1973606. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:12:50,319][07928] Avg episode reward: [(0, '24.834')]
[2023-02-23 10:12:53,328][12586] Updated weights for policy 0, policy_version 1940 (0.0009)
[2023-02-23 10:12:55,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 7966720. Throughput: 0: 2923.7. Samples: 1991094. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:12:55,319][07928] Avg episode reward: [(0, '24.086')]
[2023-02-23 10:12:56,941][12586] Updated weights for policy 0, policy_version 1950 (0.0011)
[2023-02-23 10:13:00,316][07928] Fps is (10 sec: 11469.0, 60 sec: 11741.9, 300 sec: 11677.1). Total num frames: 8024064. Throughput: 0: 2927.7. Samples: 1999842. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:13:00,318][07928] Avg episode reward: [(0, '26.631')]
[2023-02-23 10:13:00,382][12586] Updated weights for policy 0, policy_version 1960 (0.0010)
[2023-02-23 10:13:03,753][12586] Updated weights for policy 0, policy_version 1970 (0.0009)
[2023-02-23 10:13:05,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 8085504. Throughput: 0: 2932.4. Samples: 2017616. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:13:05,319][07928] Avg episode reward: [(0, '28.920')]
[2023-02-23 10:13:07,174][12586] Updated weights for policy 0, policy_version 1980 (0.0010)
[2023-02-23 10:13:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 8142848. Throughput: 0: 2931.6. Samples: 2035316. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:10,318][07928] Avg episode reward: [(0, '29.284')]
[2023-02-23 10:13:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001988_8142848.pth...
[2023-02-23 10:13:10,399][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001301_5328896.pth
[2023-02-23 10:13:10,768][12586] Updated weights for policy 0, policy_version 1990 (0.0010)
[2023-02-23 10:13:14,210][12586] Updated weights for policy 0, policy_version 2000 (0.0011)
[2023-02-23 10:13:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 8204288. Throughput: 0: 2937.5. Samples: 2044140. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:15,319][07928] Avg episode reward: [(0, '27.621')]
[2023-02-23 10:13:17,632][12586] Updated weights for policy 0, policy_version 2010 (0.0010)
[2023-02-23 10:13:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 8261632. Throughput: 0: 2943.2. Samples: 2061962. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:13:20,319][07928] Avg episode reward: [(0, '24.027')]
[2023-02-23 10:13:21,102][12586] Updated weights for policy 0, policy_version 2020 (0.0011)
[2023-02-23 10:13:24,707][12586] Updated weights for policy 0, policy_version 2030 (0.0010)
[2023-02-23 10:13:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11691.0). Total num frames: 8318976. Throughput: 0: 2933.3. Samples: 2079260. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:25,318][07928] Avg episode reward: [(0, '23.015')]
[2023-02-23 10:13:28,129][12586] Updated weights for policy 0, policy_version 2040 (0.0009)
[2023-02-23 10:13:30,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.1, 300 sec: 11691.0). Total num frames: 8380416. Throughput: 0: 2941.6. Samples: 2088184. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:13:30,319][07928] Avg episode reward: [(0, '24.891')]
[2023-02-23 10:13:31,599][12586] Updated weights for policy 0, policy_version 2050 (0.0009)
[2023-02-23 10:13:35,089][12586] Updated weights for policy 0, policy_version 2060 (0.0009)
[2023-02-23 10:13:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 8437760. Throughput: 0: 2941.2. Samples: 2105960. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:35,318][07928] Avg episode reward: [(0, '26.526')]
[2023-02-23 10:13:38,669][12586] Updated weights for policy 0, policy_version 2070 (0.0010)
[2023-02-23 10:13:40,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11691.0). Total num frames: 8495104. Throughput: 0: 2937.3. Samples: 2123274. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:40,319][07928] Avg episode reward: [(0, '27.356')]
[2023-02-23 10:13:42,151][12586] Updated weights for policy 0, policy_version 2080 (0.0009)
[2023-02-23 10:13:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 8556544. Throughput: 0: 2941.7. Samples: 2132220. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:13:45,318][07928] Avg episode reward: [(0, '28.665')]
[2023-02-23 10:13:45,569][12586] Updated weights for policy 0, policy_version 2090 (0.0010)
[2023-02-23 10:13:49,117][12586] Updated weights for policy 0, policy_version 2100 (0.0009)
[2023-02-23 10:13:50,316][07928] Fps is (10 sec: 11878.2, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 8613888. Throughput: 0: 2938.4. Samples: 2149846. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:13:50,318][07928] Avg episode reward: [(0, '28.402')]
[2023-02-23 10:13:52,726][12586] Updated weights for policy 0, policy_version 2110 (0.0010)
[2023-02-23 10:13:55,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 8671232. Throughput: 0: 2927.8. Samples: 2167068. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:13:55,319][07928] Avg episode reward: [(0, '26.862')]
[2023-02-23 10:13:56,179][12586] Updated weights for policy 0, policy_version 2120 (0.0010)
[2023-02-23 10:13:59,755][12586] Updated weights for policy 0, policy_version 2130 (0.0010)
[2023-02-23 10:14:00,316][07928] Fps is (10 sec: 11469.0, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 8728576. Throughput: 0: 2926.8. Samples: 2175848. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:14:00,318][07928] Avg episode reward: [(0, '28.137')]
[2023-02-23 10:14:03,353][12586] Updated weights for policy 0, policy_version 2140 (0.0010)
[2023-02-23 10:14:05,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 8785920. Throughput: 0: 2908.4. Samples: 2192840. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:14:05,318][07928] Avg episode reward: [(0, '29.132')]
[2023-02-23 10:14:07,083][12586] Updated weights for policy 0, policy_version 2150 (0.0010)
[2023-02-23 10:14:10,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 8839168. Throughput: 0: 2895.5. Samples: 2209560. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:14:10,320][07928] Avg episode reward: [(0, '27.056')]
[2023-02-23 10:14:10,691][12586] Updated weights for policy 0, policy_version 2160 (0.0009)
[2023-02-23 10:14:14,225][12586] Updated weights for policy 0, policy_version 2170 (0.0010)
[2023-02-23 10:14:15,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 8900608. Throughput: 0: 2889.0. Samples: 2218190. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:14:15,319][07928] Avg episode reward: [(0, '27.865')]
[2023-02-23 10:14:17,689][12586] Updated weights for policy 0, policy_version 2180 (0.0010)
[2023-02-23 10:14:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 8957952. Throughput: 0: 2885.7. Samples: 2235818. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:14:20,318][07928] Avg episode reward: [(0, '25.145')]
[2023-02-23 10:14:21,281][12586] Updated weights for policy 0, policy_version 2190 (0.0010)
[2023-02-23 10:14:24,739][12586] Updated weights for policy 0, policy_version 2200 (0.0009)
[2023-02-23 10:14:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 9015296. Throughput: 0: 2890.0. Samples: 2253326. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:14:25,319][07928] Avg episode reward: [(0, '26.101')]
[2023-02-23 10:14:28,179][12586] Updated weights for policy 0, policy_version 2210 (0.0010)
[2023-02-23 10:14:30,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 9076736. Throughput: 0: 2889.1. Samples: 2262230. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:14:30,318][07928] Avg episode reward: [(0, '29.454')]
[2023-02-23 10:14:31,656][12586] Updated weights for policy 0, policy_version 2220 (0.0010)
[2023-02-23 10:14:35,231][12586] Updated weights for policy 0, policy_version 2230 (0.0010)
[2023-02-23 10:14:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 9134080. Throughput: 0: 2887.2. Samples: 2279768. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:14:35,319][07928] Avg episode reward: [(0, '28.350')]
[2023-02-23 10:14:38,671][12586] Updated weights for policy 0, policy_version 2240 (0.0010)
[2023-02-23 10:14:40,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 9191424. Throughput: 0: 2898.8. Samples: 2297512. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:14:40,319][07928] Avg episode reward: [(0, '29.783')]
[2023-02-23 10:14:42,130][12586] Updated weights for policy 0, policy_version 2250 (0.0010)
[2023-02-23 10:14:45,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 9252864. Throughput: 0: 2901.2. Samples: 2306404. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:14:45,319][07928] Avg episode reward: [(0, '28.516')]
[2023-02-23 10:14:45,582][12586] Updated weights for policy 0, policy_version 2260 (0.0009)
[2023-02-23 10:14:49,171][12586] Updated weights for policy 0, policy_version 2270 (0.0010)
[2023-02-23 10:14:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11605.4, 300 sec: 11704.8). Total num frames: 9310208. Throughput: 0: 2910.6. Samples: 2323816. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:14:50,318][07928] Avg episode reward: [(0, '27.668')]
[2023-02-23 10:14:52,623][12586] Updated weights for policy 0, policy_version 2280 (0.0010)
[2023-02-23 10:14:55,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11691.0). Total num frames: 9367552. Throughput: 0: 2932.6. Samples: 2341526. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:14:55,319][07928] Avg episode reward: [(0, '26.350')]
[2023-02-23 10:14:56,097][12586] Updated weights for policy 0, policy_version 2290 (0.0009)
[2023-02-23 10:14:59,501][12586] Updated weights for policy 0, policy_version 2300 (0.0010)
[2023-02-23 10:15:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 9428992. Throughput: 0: 2938.3. Samples: 2350412. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:00,319][07928] Avg episode reward: [(0, '24.962')]
[2023-02-23 10:15:03,093][12586] Updated weights for policy 0, policy_version 2310 (0.0010)
[2023-02-23 10:15:05,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 9486336. Throughput: 0: 2931.1. Samples: 2367716. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:05,318][07928] Avg episode reward: [(0, '26.976')]
[2023-02-23 10:15:06,575][12586] Updated weights for policy 0, policy_version 2320 (0.0010)
[2023-02-23 10:15:10,009][12586] Updated weights for policy 0, policy_version 2330 (0.0009)
[2023-02-23 10:15:10,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 9543680. Throughput: 0: 2939.4. Samples: 2385598. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:10,319][07928] Avg episode reward: [(0, '29.735')]
[2023-02-23 10:15:10,346][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002331_9547776.pth...
[2023-02-23 10:15:10,406][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001645_6737920.pth
[2023-02-23 10:15:13,436][12586] Updated weights for policy 0, policy_version 2340 (0.0010)
[2023-02-23 10:15:15,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 9605120. Throughput: 0: 2938.9. Samples: 2394482. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:15:15,318][07928] Avg episode reward: [(0, '26.928')]
[2023-02-23 10:15:17,073][12586] Updated weights for policy 0, policy_version 2350 (0.0010)
[2023-02-23 10:15:20,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 9662464. Throughput: 0: 2933.5. Samples: 2411776. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:20,318][07928] Avg episode reward: [(0, '26.675')]
[2023-02-23 10:15:20,512][12586] Updated weights for policy 0, policy_version 2360 (0.0010)
[2023-02-23 10:15:23,849][12586] Updated weights for policy 0, policy_version 2370 (0.0009)
[2023-02-23 10:15:25,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 9723904. Throughput: 0: 2942.5. Samples: 2429924. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:15:25,319][07928] Avg episode reward: [(0, '26.843')]
[2023-02-23 10:15:27,342][12586] Updated weights for policy 0, policy_version 2380 (0.0009)
[2023-02-23 10:15:30,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 9781248. Throughput: 0: 2943.6. Samples: 2438866. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:15:30,319][07928] Avg episode reward: [(0, '29.238')]
[2023-02-23 10:15:30,858][12586] Updated weights for policy 0, policy_version 2390 (0.0010)
[2023-02-23 10:15:34,373][12586] Updated weights for policy 0, policy_version 2400 (0.0009)
[2023-02-23 10:15:35,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 9838592. Throughput: 0: 2941.5. Samples: 2456186. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:35,319][07928] Avg episode reward: [(0, '27.777')]
[2023-02-23 10:15:37,827][12586] Updated weights for policy 0, policy_version 2410 (0.0010)
[2023-02-23 10:15:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 9900032. Throughput: 0: 2943.9. Samples: 2474002. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:40,319][07928] Avg episode reward: [(0, '25.937')]
[2023-02-23 10:15:41,291][12586] Updated weights for policy 0, policy_version 2420 (0.0010)
[2023-02-23 10:15:44,834][12586] Updated weights for policy 0, policy_version 2430 (0.0010)
[2023-02-23 10:15:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 9957376. Throughput: 0: 2943.8. Samples: 2482882. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:45,318][07928] Avg episode reward: [(0, '27.318')]
[2023-02-23 10:15:48,325][12586] Updated weights for policy 0, policy_version 2440 (0.0009)
[2023-02-23 10:15:50,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 10014720. Throughput: 0: 2944.5. Samples: 2500220. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:50,319][07928] Avg episode reward: [(0, '26.420')]
[2023-02-23 10:15:51,782][12586] Updated weights for policy 0, policy_version 2450 (0.0010)
[2023-02-23 10:15:55,191][12586] Updated weights for policy 0, policy_version 2460 (0.0009)
[2023-02-23 10:15:55,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 10076160. Throughput: 0: 2945.8. Samples: 2518160. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:15:55,318][07928] Avg episode reward: [(0, '25.901')]
[2023-02-23 10:15:58,758][12586] Updated weights for policy 0, policy_version 2470 (0.0011)
[2023-02-23 10:16:00,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10133504. Throughput: 0: 2942.1. Samples: 2526878. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:16:00,319][07928] Avg episode reward: [(0, '28.329')]
[2023-02-23 10:16:02,262][12586] Updated weights for policy 0, policy_version 2480 (0.0010)
[2023-02-23 10:16:05,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 10190848. Throughput: 0: 2945.6. Samples: 2544326. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:16:05,318][07928] Avg episode reward: [(0, '26.690')]
[2023-02-23 10:16:05,699][12586] Updated weights for policy 0, policy_version 2490 (0.0010)
[2023-02-23 10:16:09,133][12586] Updated weights for policy 0, policy_version 2500 (0.0010)
[2023-02-23 10:16:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 10252288. Throughput: 0: 2942.0. Samples: 2562316. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:16:10,318][07928] Avg episode reward: [(0, '25.406')]
[2023-02-23 10:16:12,659][12586] Updated weights for policy 0, policy_version 2510 (0.0010)
[2023-02-23 10:16:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10309632. Throughput: 0: 2935.8. Samples: 2570976. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:16:15,318][07928] Avg episode reward: [(0, '27.418')]
[2023-02-23 10:16:16,186][12586] Updated weights for policy 0, policy_version 2520 (0.0009)
[2023-02-23 10:16:19,676][12586] Updated weights for policy 0, policy_version 2530 (0.0010)
[2023-02-23 10:16:20,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 10366976. Throughput: 0: 2941.4. Samples: 2588550. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:16:20,319][07928] Avg episode reward: [(0, '27.411')]
[2023-02-23 10:16:23,095][12586] Updated weights for policy 0, policy_version 2540 (0.0010)
[2023-02-23 10:16:25,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10428416. Throughput: 0: 2942.4. Samples: 2606410. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:16:25,319][07928] Avg episode reward: [(0, '27.097')]
[2023-02-23 10:16:26,602][12586] Updated weights for policy 0, policy_version 2550 (0.0010)
[2023-02-23 10:16:30,085][12586] Updated weights for policy 0, policy_version 2560 (0.0010)
[2023-02-23 10:16:30,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10485760. Throughput: 0: 2935.7. Samples: 2614990. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:16:30,318][07928] Avg episode reward: [(0, '28.808')]
[2023-02-23 10:16:33,532][12586] Updated weights for policy 0, policy_version 2570 (0.0010)
[2023-02-23 10:16:35,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11732.6). Total num frames: 10547200. Throughput: 0: 2945.2. Samples: 2632756. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:16:35,319][07928] Avg episode reward: [(0, '29.456')]
[2023-02-23 10:16:36,945][12586] Updated weights for policy 0, policy_version 2580 (0.0010)
[2023-02-23 10:16:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 10604544. Throughput: 0: 2940.4. Samples: 2650480. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:16:40,318][07928] Avg episode reward: [(0, '30.134')]
[2023-02-23 10:16:40,500][12586] Updated weights for policy 0, policy_version 2590 (0.0010)
[2023-02-23 10:16:44,041][12586] Updated weights for policy 0, policy_version 2600 (0.0009)
[2023-02-23 10:16:45,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10661888. Throughput: 0: 2937.2. Samples: 2659052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:16:45,318][07928] Avg episode reward: [(0, '27.245')]
[2023-02-23 10:16:47,486][12586] Updated weights for policy 0, policy_version 2610 (0.0010)
[2023-02-23 10:16:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11810.2, 300 sec: 11718.7). Total num frames: 10723328. Throughput: 0: 2946.4. Samples: 2676912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:16:50,318][07928] Avg episode reward: [(0, '25.823')]
[2023-02-23 10:16:50,900][12586] Updated weights for policy 0, policy_version 2620 (0.0009)
[2023-02-23 10:16:54,450][12586] Updated weights for policy 0, policy_version 2630 (0.0011)
[2023-02-23 10:16:55,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 10780672. Throughput: 0: 2937.6. Samples: 2694508. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:16:55,318][07928] Avg episode reward: [(0, '25.668')]
[2023-02-23 10:16:57,962][12586] Updated weights for policy 0, policy_version 2640 (0.0010)
[2023-02-23 10:17:00,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10838016. Throughput: 0: 2938.5. Samples: 2703208. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:00,319][07928] Avg episode reward: [(0, '26.557')]
[2023-02-23 10:17:01,414][12586] Updated weights for policy 0, policy_version 2650 (0.0009)
[2023-02-23 10:17:04,866][12586] Updated weights for policy 0, policy_version 2660 (0.0009)
[2023-02-23 10:17:05,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 10899456. Throughput: 0: 2943.5. Samples: 2721006. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:05,319][07928] Avg episode reward: [(0, '28.197')]
[2023-02-23 10:17:08,353][12586] Updated weights for policy 0, policy_version 2670 (0.0010)
[2023-02-23 10:17:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 10956800. Throughput: 0: 2936.9. Samples: 2738572. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:10,319][07928] Avg episode reward: [(0, '29.509')]
[2023-02-23 10:17:10,328][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002675_10956800.pth...
[2023-02-23 10:17:10,387][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001988_8142848.pth
[2023-02-23 10:17:11,908][12586] Updated weights for policy 0, policy_version 2680 (0.0010)
[2023-02-23 10:17:15,318][07928] Fps is (10 sec: 11467.3, 60 sec: 11741.6, 300 sec: 11718.7). Total num frames: 11014144. Throughput: 0: 2940.3. Samples: 2747306. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:15,320][07928] Avg episode reward: [(0, '29.441')]
[2023-02-23 10:17:15,346][12586] Updated weights for policy 0, policy_version 2690 (0.0010)
[2023-02-23 10:17:18,850][12586] Updated weights for policy 0, policy_version 2700 (0.0010)
[2023-02-23 10:17:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11810.1, 300 sec: 11718.7). Total num frames: 11075584. Throughput: 0: 2940.9. Samples: 2765098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:20,319][07928] Avg episode reward: [(0, '27.494')]
[2023-02-23 10:17:22,394][12586] Updated weights for policy 0, policy_version 2710 (0.0010)
[2023-02-23 10:17:25,316][07928] Fps is (10 sec: 11880.1, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 11132928. Throughput: 0: 2931.4. Samples: 2782392. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:25,319][07928] Avg episode reward: [(0, '27.068')]
[2023-02-23 10:17:25,911][12586] Updated weights for policy 0, policy_version 2720 (0.0009)
[2023-02-23 10:17:29,314][12586] Updated weights for policy 0, policy_version 2730 (0.0009)
[2023-02-23 10:17:30,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11190272. Throughput: 0: 2938.5. Samples: 2791284. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:30,319][07928] Avg episode reward: [(0, '27.498')]
[2023-02-23 10:17:32,788][12586] Updated weights for policy 0, policy_version 2740 (0.0010)
[2023-02-23 10:17:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 11251712. Throughput: 0: 2938.2. Samples: 2809132. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:35,319][07928] Avg episode reward: [(0, '25.918')]
[2023-02-23 10:17:36,243][12586] Updated weights for policy 0, policy_version 2750 (0.0009)
[2023-02-23 10:17:39,819][12586] Updated weights for policy 0, policy_version 2760 (0.0010)
[2023-02-23 10:17:40,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11309056. Throughput: 0: 2933.4. Samples: 2826512. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:17:40,319][07928] Avg episode reward: [(0, '25.305')]
[2023-02-23 10:17:43,352][12586] Updated weights for policy 0, policy_version 2770 (0.0010)
[2023-02-23 10:17:45,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11366400. Throughput: 0: 2935.6. Samples: 2835308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:45,319][07928] Avg episode reward: [(0, '28.672')]
[2023-02-23 10:17:46,792][12586] Updated weights for policy 0, policy_version 2780 (0.0009)
[2023-02-23 10:17:50,272][12586] Updated weights for policy 0, policy_version 2790 (0.0010)
[2023-02-23 10:17:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.8, 300 sec: 11732.6). Total num frames: 11427840. Throughput: 0: 2935.8. Samples: 2853118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:50,318][07928] Avg episode reward: [(0, '29.112')]
[2023-02-23 10:17:53,874][12586] Updated weights for policy 0, policy_version 2800 (0.0010)
[2023-02-23 10:17:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 11485184. Throughput: 0: 2927.6. Samples: 2870316. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:17:55,318][07928] Avg episode reward: [(0, '25.457')]
[2023-02-23 10:17:57,305][12586] Updated weights for policy 0, policy_version 2810 (0.0009)
[2023-02-23 10:18:00,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11542528. Throughput: 0: 2932.8. Samples: 2879276. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:00,318][07928] Avg episode reward: [(0, '26.369')]
[2023-02-23 10:18:00,745][12586] Updated weights for policy 0, policy_version 2820 (0.0010)
[2023-02-23 10:18:04,273][12586] Updated weights for policy 0, policy_version 2830 (0.0010)
[2023-02-23 10:18:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 11599872. Throughput: 0: 2931.8. Samples: 2897028. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:05,318][07928] Avg episode reward: [(0, '28.658')]
[2023-02-23 10:18:07,877][12586] Updated weights for policy 0, policy_version 2840 (0.0010)
[2023-02-23 10:18:10,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11661312. Throughput: 0: 2934.5. Samples: 2914444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:18:10,318][07928] Avg episode reward: [(0, '33.540')]
[2023-02-23 10:18:10,326][12572] Saving new best policy, reward=33.540!
[2023-02-23 10:18:11,262][12586] Updated weights for policy 0, policy_version 2850 (0.0009)
[2023-02-23 10:18:14,655][12586] Updated weights for policy 0, policy_version 2860 (0.0009)
[2023-02-23 10:18:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11742.2, 300 sec: 11718.7). Total num frames: 11718656. Throughput: 0: 2934.7. Samples: 2923344. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:15,318][07928] Avg episode reward: [(0, '32.987')]
[2023-02-23 10:18:18,122][12586] Updated weights for policy 0, policy_version 2870 (0.0010)
[2023-02-23 10:18:20,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 11780096. Throughput: 0: 2931.8. Samples: 2941064. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:18:20,318][07928] Avg episode reward: [(0, '29.489')]
[2023-02-23 10:18:21,734][12586] Updated weights for policy 0, policy_version 2880 (0.0011)
[2023-02-23 10:18:25,203][12586] Updated weights for policy 0, policy_version 2890 (0.0009)
[2023-02-23 10:18:25,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11837440. Throughput: 0: 2931.3. Samples: 2958422. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:25,319][07928] Avg episode reward: [(0, '29.296')]
[2023-02-23 10:18:28,639][12586] Updated weights for policy 0, policy_version 2900 (0.0010)
[2023-02-23 10:18:30,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 11894784. Throughput: 0: 2935.3. Samples: 2967396. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:30,318][07928] Avg episode reward: [(0, '26.779')]
[2023-02-23 10:18:32,175][12586] Updated weights for policy 0, policy_version 2910 (0.0010)
[2023-02-23 10:18:35,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 11952128. Throughput: 0: 2925.3. Samples: 2984756. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:35,318][07928] Avg episode reward: [(0, '27.933')]
[2023-02-23 10:18:35,790][12586] Updated weights for policy 0, policy_version 2920 (0.0010)
[2023-02-23 10:18:39,193][12586] Updated weights for policy 0, policy_version 2930 (0.0010)
[2023-02-23 10:18:40,322][07928] Fps is (10 sec: 11871.6, 60 sec: 11740.8, 300 sec: 11718.5). Total num frames: 12013568. Throughput: 0: 2934.0. Samples: 3002364. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:18:40,325][07928] Avg episode reward: [(0, '29.674')]
[2023-02-23 10:18:42,671][12586] Updated weights for policy 0, policy_version 2940 (0.0009)
[2023-02-23 10:18:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 12070912. Throughput: 0: 2932.7. Samples: 3011248. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:45,319][07928] Avg episode reward: [(0, '30.354')]
[2023-02-23 10:18:46,173][12586] Updated weights for policy 0, policy_version 2950 (0.0010)
[2023-02-23 10:18:49,731][12586] Updated weights for policy 0, policy_version 2960 (0.0009)
[2023-02-23 10:18:50,316][07928] Fps is (10 sec: 11475.4, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 12128256. Throughput: 0: 2925.4. Samples: 3028670. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:18:50,319][07928] Avg episode reward: [(0, '29.000')]
[2023-02-23 10:18:53,188][12586] Updated weights for policy 0, policy_version 2970 (0.0010)
[2023-02-23 10:18:55,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 12189696. Throughput: 0: 2930.9. Samples: 3046336. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:18:55,319][07928] Avg episode reward: [(0, '27.217')]
[2023-02-23 10:18:56,654][12586] Updated weights for policy 0, policy_version 2980 (0.0009)
[2023-02-23 10:19:00,115][12586] Updated weights for policy 0, policy_version 2990 (0.0009)
[2023-02-23 10:19:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11732.6). Total num frames: 12247040. Throughput: 0: 2931.0. Samples: 3055240. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:19:00,319][07928] Avg episode reward: [(0, '29.113')]
[2023-02-23 10:19:03,730][12586] Updated weights for policy 0, policy_version 3000 (0.0011)
[2023-02-23 10:19:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11746.5). Total num frames: 12304384. Throughput: 0: 2923.7. Samples: 3072630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:19:05,319][07928] Avg episode reward: [(0, '30.398')]
[2023-02-23 10:19:07,191][12586] Updated weights for policy 0, policy_version 3010 (0.0010)
[2023-02-23 10:19:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.8, 300 sec: 11746.5). Total num frames: 12365824. Throughput: 0: 2928.5. Samples: 3090206. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:19:10,318][07928] Avg episode reward: [(0, '27.805')]
[2023-02-23 10:19:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003019_12365824.pth...
[2023-02-23 10:19:10,390][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002331_9547776.pth
[2023-02-23 10:19:10,659][12586] Updated weights for policy 0, policy_version 3020 (0.0011)
[2023-02-23 10:19:14,281][12586] Updated weights for policy 0, policy_version 3030 (0.0009)
[2023-02-23 10:19:15,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11732.6). Total num frames: 12419072. Throughput: 0: 2919.4. Samples: 3098768. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:19:15,319][07928] Avg episode reward: [(0, '27.242')]
[2023-02-23 10:19:18,021][12586] Updated weights for policy 0, policy_version 3040 (0.0010)
[2023-02-23 10:19:20,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11605.3, 300 sec: 11732.6). Total num frames: 12476416. Throughput: 0: 2902.3. Samples: 3115360. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:20,319][07928] Avg episode reward: [(0, '26.474')]
[2023-02-23 10:19:21,630][12586] Updated weights for policy 0, policy_version 3050 (0.0009)
[2023-02-23 10:19:25,148][12586] Updated weights for policy 0, policy_version 3060 (0.0010)
[2023-02-23 10:19:25,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 12533760. Throughput: 0: 2897.6. Samples: 3132740. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:25,319][07928] Avg episode reward: [(0, '26.344')]
[2023-02-23 10:19:28,572][12586] Updated weights for policy 0, policy_version 3070 (0.0010)
[2023-02-23 10:19:30,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 12591104. Throughput: 0: 2898.4. Samples: 3141676. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:30,320][07928] Avg episode reward: [(0, '29.466')]
[2023-02-23 10:19:32,156][12586] Updated weights for policy 0, policy_version 3080 (0.0011)
[2023-02-23 10:19:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11673.6, 300 sec: 11732.6). Total num frames: 12652544. Throughput: 0: 2896.0. Samples: 3158992. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:35,318][07928] Avg episode reward: [(0, '31.520')]
[2023-02-23 10:19:35,669][12586] Updated weights for policy 0, policy_version 3090 (0.0009)
[2023-02-23 10:19:39,082][12586] Updated weights for policy 0, policy_version 3100 (0.0010)
[2023-02-23 10:19:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11606.4, 300 sec: 11718.7). Total num frames: 12709888. Throughput: 0: 2899.7. Samples: 3176824. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:19:40,318][07928] Avg episode reward: [(0, '31.629')]
[2023-02-23 10:19:42,508][12586] Updated weights for policy 0, policy_version 3110 (0.0010)
[2023-02-23 10:19:45,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 12767232. Throughput: 0: 2899.8. Samples: 3185732. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:19:45,319][07928] Avg episode reward: [(0, '30.374')]
[2023-02-23 10:19:46,123][12586] Updated weights for policy 0, policy_version 3120 (0.0010)
[2023-02-23 10:19:49,602][12586] Updated weights for policy 0, policy_version 3130 (0.0010)
[2023-02-23 10:19:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11732.6). Total num frames: 12828672. Throughput: 0: 2897.8. Samples: 3203032. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:50,318][07928] Avg episode reward: [(0, '30.721')]
[2023-02-23 10:19:53,062][12586] Updated weights for policy 0, policy_version 3140 (0.0011)
[2023-02-23 10:19:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 12886016. Throughput: 0: 2905.0. Samples: 3220932. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:19:55,319][07928] Avg episode reward: [(0, '30.237')]
[2023-02-23 10:19:56,488][12586] Updated weights for policy 0, policy_version 3150 (0.0010)
[2023-02-23 10:20:00,095][12586] Updated weights for policy 0, policy_version 3160 (0.0011)
[2023-02-23 10:20:00,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 12943360. Throughput: 0: 2909.6. Samples: 3229702. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:00,319][07928] Avg episode reward: [(0, '29.583')]
[2023-02-23 10:20:03,522][12586] Updated weights for policy 0, policy_version 3170 (0.0010)
[2023-02-23 10:20:05,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11673.6, 300 sec: 11732.6). Total num frames: 13004800. Throughput: 0: 2929.9. Samples: 3247206. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:20:05,319][07928] Avg episode reward: [(0, '30.188')]
[2023-02-23 10:20:07,005][12586] Updated weights for policy 0, policy_version 3180 (0.0011)
[2023-02-23 10:20:10,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11605.3, 300 sec: 11718.7). Total num frames: 13062144. Throughput: 0: 2938.4. Samples: 3264966. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:10,319][07928] Avg episode reward: [(0, '28.855')]
[2023-02-23 10:20:10,458][12586] Updated weights for policy 0, policy_version 3190 (0.0009)
[2023-02-23 10:20:14,020][12586] Updated weights for policy 0, policy_version 3200 (0.0010)
[2023-02-23 10:20:15,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 13119488. Throughput: 0: 2932.7. Samples: 3273646. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:15,318][07928] Avg episode reward: [(0, '26.008')]
[2023-02-23 10:20:17,473][12586] Updated weights for policy 0, policy_version 3210 (0.0009)
[2023-02-23 10:20:20,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13180928. Throughput: 0: 2938.3. Samples: 3291214. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:20,318][07928] Avg episode reward: [(0, '27.272')]
[2023-02-23 10:20:21,001][12586] Updated weights for policy 0, policy_version 3220 (0.0010)
[2023-02-23 10:20:24,468][12586] Updated weights for policy 0, policy_version 3230 (0.0010)
[2023-02-23 10:20:25,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13238272. Throughput: 0: 2933.3. Samples: 3308822. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:20:25,319][07928] Avg episode reward: [(0, '30.163')]
[2023-02-23 10:20:28,011][12586] Updated weights for policy 0, policy_version 3240 (0.0011)
[2023-02-23 10:20:30,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13295616. Throughput: 0: 2927.3. Samples: 3317462. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:30,318][07928] Avg episode reward: [(0, '28.624')]
[2023-02-23 10:20:31,464][12586] Updated weights for policy 0, policy_version 3250 (0.0010)
[2023-02-23 10:20:34,920][12586] Updated weights for policy 0, policy_version 3260 (0.0009)
[2023-02-23 10:20:35,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13357056. Throughput: 0: 2937.9. Samples: 3335238. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:20:35,319][07928] Avg episode reward: [(0, '29.683')]
[2023-02-23 10:20:38,389][12586] Updated weights for policy 0, policy_version 3270 (0.0010)
[2023-02-23 10:20:40,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13414400. Throughput: 0: 2933.4. Samples: 3352936. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:20:40,319][07928] Avg episode reward: [(0, '30.999')]
[2023-02-23 10:20:41,940][12586] Updated weights for policy 0, policy_version 3280 (0.0011)
[2023-02-23 10:20:45,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13471744. Throughput: 0: 2929.8. Samples: 3361544. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:45,318][07928] Avg episode reward: [(0, '31.855')]
[2023-02-23 10:20:45,422][12586] Updated weights for policy 0, policy_version 3290 (0.0010)
[2023-02-23 10:20:48,891][12586] Updated weights for policy 0, policy_version 3300 (0.0010)
[2023-02-23 10:20:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13533184. Throughput: 0: 2934.5. Samples: 3379260. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:20:50,320][07928] Avg episode reward: [(0, '33.253')]
[2023-02-23 10:20:52,396][12586] Updated weights for policy 0, policy_version 3310 (0.0010)
[2023-02-23 10:20:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13590528. Throughput: 0: 2931.4. Samples: 3396880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:20:55,319][07928] Avg episode reward: [(0, '32.637')]
[2023-02-23 10:20:55,906][12586] Updated weights for policy 0, policy_version 3320 (0.0010)
[2023-02-23 10:20:59,384][12586] Updated weights for policy 0, policy_version 3330 (0.0011)
[2023-02-23 10:21:00,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13647872. Throughput: 0: 2929.4. Samples: 3405470. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:00,318][07928] Avg episode reward: [(0, '34.519')]
[2023-02-23 10:21:00,327][12572] Saving new best policy, reward=34.519!
[2023-02-23 10:21:02,875][12586] Updated weights for policy 0, policy_version 3340 (0.0010)
[2023-02-23 10:21:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 13705216. Throughput: 0: 2931.0. Samples: 3423108. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:05,319][07928] Avg episode reward: [(0, '33.124')]
[2023-02-23 10:21:06,387][12586] Updated weights for policy 0, policy_version 3350 (0.0010)
[2023-02-23 10:21:09,890][12586] Updated weights for policy 0, policy_version 3360 (0.0011)
[2023-02-23 10:21:10,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13766656. Throughput: 0: 2929.3. Samples: 3440642. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:10,318][07928] Avg episode reward: [(0, '30.978')]
[2023-02-23 10:21:10,328][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003361_13766656.pth...
[2023-02-23 10:21:10,328][07928] Components not started: RolloutWorker_w1, RolloutWorker_w2, RolloutWorker_w4, RolloutWorker_w7, wait_time=1200.0 seconds
[2023-02-23 10:21:10,393][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002675_10956800.pth
[2023-02-23 10:21:13,442][12586] Updated weights for policy 0, policy_version 3370 (0.0010)
[2023-02-23 10:21:15,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 13824000. Throughput: 0: 2930.4. Samples: 3449330. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:15,319][07928] Avg episode reward: [(0, '31.329')]
[2023-02-23 10:21:16,809][12586] Updated weights for policy 0, policy_version 3380 (0.0010)
[2023-02-23 10:21:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 13881344. Throughput: 0: 2932.0. Samples: 3467176. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 10:21:20,319][07928] Avg episode reward: [(0, '31.064')]
[2023-02-23 10:21:20,334][12586] Updated weights for policy 0, policy_version 3390 (0.0010)
[2023-02-23 10:21:23,893][12586] Updated weights for policy 0, policy_version 3400 (0.0009)
[2023-02-23 10:21:25,316][07928] Fps is (10 sec: 11468.6, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 13938688. Throughput: 0: 2922.2. Samples: 3484436. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:21:25,319][07928] Avg episode reward: [(0, '31.357')]
[2023-02-23 10:21:27,475][12586] Updated weights for policy 0, policy_version 3410 (0.0010)
[2023-02-23 10:21:30,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 14000128. Throughput: 0: 2926.4. Samples: 3493234. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:30,319][07928] Avg episode reward: [(0, '32.724')]
[2023-02-23 10:21:30,893][12586] Updated weights for policy 0, policy_version 3420 (0.0010)
[2023-02-23 10:21:34,278][12586] Updated weights for policy 0, policy_version 3430 (0.0010)
[2023-02-23 10:21:35,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 14057472. Throughput: 0: 2932.5. Samples: 3511224. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:35,319][07928] Avg episode reward: [(0, '29.417')]
[2023-02-23 10:21:37,855][12586] Updated weights for policy 0, policy_version 3440 (0.0009)
[2023-02-23 10:21:40,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 14114816. Throughput: 0: 2925.1. Samples: 3528510. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:40,319][07928] Avg episode reward: [(0, '30.837')]
[2023-02-23 10:21:41,364][12586] Updated weights for policy 0, policy_version 3450 (0.0009)
[2023-02-23 10:21:44,801][12586] Updated weights for policy 0, policy_version 3460 (0.0010)
[2023-02-23 10:21:45,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14176256. Throughput: 0: 2931.7. Samples: 3537396. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:21:45,319][07928] Avg episode reward: [(0, '34.276')]
[2023-02-23 10:21:48,227][12586] Updated weights for policy 0, policy_version 3470 (0.0009)
[2023-02-23 10:21:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 14233600. Throughput: 0: 2935.6. Samples: 3555210. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:21:50,318][07928] Avg episode reward: [(0, '35.306')]
[2023-02-23 10:21:50,341][12572] Saving new best policy, reward=35.306!
[2023-02-23 10:21:51,764][12586] Updated weights for policy 0, policy_version 3480 (0.0010)
[2023-02-23 10:21:55,271][12586] Updated weights for policy 0, policy_version 3490 (0.0010)
[2023-02-23 10:21:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 14295040. Throughput: 0: 2932.2. Samples: 3572592. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:21:55,318][07928] Avg episode reward: [(0, '32.189')]
[2023-02-23 10:21:58,694][12586] Updated weights for policy 0, policy_version 3500 (0.0009)
[2023-02-23 10:22:00,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14352384. Throughput: 0: 2938.8. Samples: 3581574. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:22:00,319][07928] Avg episode reward: [(0, '28.716')]
[2023-02-23 10:22:02,141][12586] Updated weights for policy 0, policy_version 3510 (0.0009)
[2023-02-23 10:22:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14409728. Throughput: 0: 2937.6. Samples: 3599368. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:22:05,319][07928] Avg episode reward: [(0, '28.258')]
[2023-02-23 10:22:05,692][12586] Updated weights for policy 0, policy_version 3520 (0.0010)
[2023-02-23 10:22:09,260][12586] Updated weights for policy 0, policy_version 3530 (0.0010)
[2023-02-23 10:22:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.8). Total num frames: 14471168. Throughput: 0: 2937.4. Samples: 3616620. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:22:10,318][07928] Avg episode reward: [(0, '29.778')]
[2023-02-23 10:22:12,666][12586] Updated weights for policy 0, policy_version 3540 (0.0010)
[2023-02-23 10:22:15,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 14528512. Throughput: 0: 2940.4. Samples: 3625554. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:22:15,319][07928] Avg episode reward: [(0, '29.483')]
[2023-02-23 10:22:16,113][12586] Updated weights for policy 0, policy_version 3550 (0.0010)
[2023-02-23 10:22:19,687][12586] Updated weights for policy 0, policy_version 3560 (0.0009)
[2023-02-23 10:22:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14585856. Throughput: 0: 2934.9. Samples: 3643296. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:22:20,319][07928] Avg episode reward: [(0, '30.962')]
[2023-02-23 10:22:23,228][12586] Updated weights for policy 0, policy_version 3570 (0.0010)
[2023-02-23 10:22:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14643200. Throughput: 0: 2936.8. Samples: 3660668. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:22:25,319][07928] Avg episode reward: [(0, '31.122')]
[2023-02-23 10:22:26,679][12586] Updated weights for policy 0, policy_version 3580 (0.0010)
[2023-02-23 10:22:30,176][12586] Updated weights for policy 0, policy_version 3590 (0.0010)
[2023-02-23 10:22:30,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14704640. Throughput: 0: 2934.1. Samples: 3669430. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:22:30,319][07928] Avg episode reward: [(0, '30.864')]
[2023-02-23 10:22:33,622][12586] Updated weights for policy 0, policy_version 3600 (0.0010)
[2023-02-23 10:22:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14761984. Throughput: 0: 2930.5. Samples: 3687082. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:22:35,319][07928] Avg episode reward: [(0, '30.638')]
[2023-02-23 10:22:37,259][12586] Updated weights for policy 0, policy_version 3610 (0.0010)
[2023-02-23 10:22:40,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14819328. Throughput: 0: 2932.3. Samples: 3704548. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:22:40,319][07928] Avg episode reward: [(0, '30.407')]
[2023-02-23 10:22:40,742][12586] Updated weights for policy 0, policy_version 3620 (0.0010)
[2023-02-23 10:22:44,149][12586] Updated weights for policy 0, policy_version 3630 (0.0009)
[2023-02-23 10:22:45,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 14880768. Throughput: 0: 2930.6. Samples: 3713450. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:22:45,318][07928] Avg episode reward: [(0, '31.235')]
[2023-02-23 10:22:47,630][12586] Updated weights for policy 0, policy_version 3640 (0.0010)
[2023-02-23 10:22:50,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 14938112. Throughput: 0: 2923.0. Samples: 3730902. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:22:50,318][07928] Avg episode reward: [(0, '30.184')]
[2023-02-23 10:22:51,189][12586] Updated weights for policy 0, policy_version 3650 (0.0009)
[2023-02-23 10:22:54,633][12586] Updated weights for policy 0, policy_version 3660 (0.0010)
[2023-02-23 10:22:55,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 14999552. Throughput: 0: 2933.1. Samples: 3748610. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:22:55,319][07928] Avg episode reward: [(0, '28.302')]
[2023-02-23 10:22:58,063][12586] Updated weights for policy 0, policy_version 3670 (0.0009)
[2023-02-23 10:23:00,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.8, 300 sec: 11718.7). Total num frames: 15056896. Throughput: 0: 2935.8. Samples: 3757664. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:00,319][07928] Avg episode reward: [(0, '28.515')]
[2023-02-23 10:23:01,514][12586] Updated weights for policy 0, policy_version 3680 (0.0010)
[2023-02-23 10:23:05,150][12586] Updated weights for policy 0, policy_version 3690 (0.0010)
[2023-02-23 10:23:05,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.8, 300 sec: 11704.8). Total num frames: 15114240. Throughput: 0: 2927.9. Samples: 3775050. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:23:05,319][07928] Avg episode reward: [(0, '29.399')]
[2023-02-23 10:23:08,618][12586] Updated weights for policy 0, policy_version 3700 (0.0010)
[2023-02-23 10:23:10,316][07928] Fps is (10 sec: 11878.6, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 15175680. Throughput: 0: 2935.1. Samples: 3792746. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:10,318][07928] Avg episode reward: [(0, '29.447')]
[2023-02-23 10:23:10,326][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth...
[2023-02-23 10:23:10,384][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003019_12365824.pth
[2023-02-23 10:23:12,040][12586] Updated weights for policy 0, policy_version 3710 (0.0009)
[2023-02-23 10:23:15,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15233024. Throughput: 0: 2938.5. Samples: 3801662. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:15,319][07928] Avg episode reward: [(0, '27.997')]
[2023-02-23 10:23:15,558][12586] Updated weights for policy 0, policy_version 3720 (0.0010)
[2023-02-23 10:23:19,170][12586] Updated weights for policy 0, policy_version 3730 (0.0011)
[2023-02-23 10:23:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15290368. Throughput: 0: 2929.0. Samples: 3818888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:23:20,319][07928] Avg episode reward: [(0, '28.780')]
[2023-02-23 10:23:22,617][12586] Updated weights for policy 0, policy_version 3740 (0.0009)
[2023-02-23 10:23:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15347712. Throughput: 0: 2938.2. Samples: 3836766. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:25,319][07928] Avg episode reward: [(0, '30.954')]
[2023-02-23 10:23:26,014][12586] Updated weights for policy 0, policy_version 3750 (0.0010)
[2023-02-23 10:23:29,490][12586] Updated weights for policy 0, policy_version 3760 (0.0010)
[2023-02-23 10:23:30,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 15409152. Throughput: 0: 2937.1. Samples: 3845620. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:23:30,320][07928] Avg episode reward: [(0, '30.788')]
[2023-02-23 10:23:33,106][12586] Updated weights for policy 0, policy_version 3770 (0.0010)
[2023-02-23 10:23:35,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11705.1). Total num frames: 15466496. Throughput: 0: 2930.9. Samples: 3862790. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:35,319][07928] Avg episode reward: [(0, '29.478')]
[2023-02-23 10:23:36,547][12586] Updated weights for policy 0, policy_version 3780 (0.0009)
[2023-02-23 10:23:40,020][12586] Updated weights for policy 0, policy_version 3790 (0.0009)
[2023-02-23 10:23:40,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15523840. Throughput: 0: 2937.1. Samples: 3880778. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:40,318][07928] Avg episode reward: [(0, '28.195')]
[2023-02-23 10:23:43,517][12586] Updated weights for policy 0, policy_version 3800 (0.0009)
[2023-02-23 10:23:45,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11718.7). Total num frames: 15585280. Throughput: 0: 2931.7. Samples: 3889590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:23:45,319][07928] Avg episode reward: [(0, '31.478')]
[2023-02-23 10:23:47,062][12586] Updated weights for policy 0, policy_version 3810 (0.0010)
[2023-02-23 10:23:50,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15642624. Throughput: 0: 2928.2. Samples: 3906820. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:23:50,319][07928] Avg episode reward: [(0, '31.893')]
[2023-02-23 10:23:50,524][12586] Updated weights for policy 0, policy_version 3820 (0.0010)
[2023-02-23 10:23:53,960][12586] Updated weights for policy 0, policy_version 3830 (0.0009)
[2023-02-23 10:23:55,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11673.6, 300 sec: 11704.8). Total num frames: 15699968. Throughput: 0: 2932.4. Samples: 3924704. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:23:55,319][07928] Avg episode reward: [(0, '31.387')]
[2023-02-23 10:23:57,447][12586] Updated weights for policy 0, policy_version 3840 (0.0010)
[2023-02-23 10:24:00,317][07928] Fps is (10 sec: 11877.9, 60 sec: 11741.8, 300 sec: 11718.7). Total num frames: 15761408. Throughput: 0: 2930.0. Samples: 3933514. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:24:00,321][07928] Avg episode reward: [(0, '32.249')]
[2023-02-23 10:24:01,036][12586] Updated weights for policy 0, policy_version 3850 (0.0011)
[2023-02-23 10:24:04,463][12586] Updated weights for policy 0, policy_version 3860 (0.0009)
[2023-02-23 10:24:05,316][07928] Fps is (10 sec: 11878.3, 60 sec: 11741.9, 300 sec: 11704.8). Total num frames: 15818752. Throughput: 0: 2932.8. Samples: 3950866. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:05,319][07928] Avg episode reward: [(0, '32.389')]
[2023-02-23 10:24:07,955][12586] Updated weights for policy 0, policy_version 3870 (0.0011)
[2023-02-23 10:24:10,316][07928] Fps is (10 sec: 11469.2, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 15876096. Throughput: 0: 2929.9. Samples: 3968612. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:24:10,319][07928] Avg episode reward: [(0, '31.641')]
[2023-02-23 10:24:11,441][12586] Updated weights for policy 0, policy_version 3880 (0.0009)
[2023-02-23 10:24:15,024][12586] Updated weights for policy 0, policy_version 3890 (0.0010)
[2023-02-23 10:24:15,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 15933440. Throughput: 0: 2927.1. Samples: 3977338. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:15,319][07928] Avg episode reward: [(0, '31.428')]
[2023-02-23 10:24:18,503][12586] Updated weights for policy 0, policy_version 3900 (0.0010)
[2023-02-23 10:24:20,316][07928] Fps is (10 sec: 11878.2, 60 sec: 11741.8, 300 sec: 11732.6). Total num frames: 15994880. Throughput: 0: 2932.2. Samples: 3994740. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:20,318][07928] Avg episode reward: [(0, '30.172')]
[2023-02-23 10:24:22,113][12586] Updated weights for policy 0, policy_version 3910 (0.0010)
[2023-02-23 10:24:25,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11673.6, 300 sec: 11718.7). Total num frames: 16048128. Throughput: 0: 2908.8. Samples: 4011674. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:24:25,319][07928] Avg episode reward: [(0, '30.851')]
[2023-02-23 10:24:25,766][12586] Updated weights for policy 0, policy_version 3920 (0.0009)
[2023-02-23 10:24:29,573][12586] Updated weights for policy 0, policy_version 3930 (0.0010)
[2023-02-23 10:24:30,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 16105472. Throughput: 0: 2894.3. Samples: 4019832. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:30,319][07928] Avg episode reward: [(0, '30.528')]
[2023-02-23 10:24:33,165][12586] Updated weights for policy 0, policy_version 3940 (0.0009)
[2023-02-23 10:24:35,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 16162816. Throughput: 0: 2884.4. Samples: 4036620. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:35,318][07928] Avg episode reward: [(0, '28.011')]
[2023-02-23 10:24:36,709][12586] Updated weights for policy 0, policy_version 3950 (0.0009)
[2023-02-23 10:24:40,151][12586] Updated weights for policy 0, policy_version 3960 (0.0009)
[2023-02-23 10:24:40,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11605.3, 300 sec: 11704.8). Total num frames: 16220160. Throughput: 0: 2881.4. Samples: 4054366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:24:40,318][07928] Avg episode reward: [(0, '29.105')]
[2023-02-23 10:24:43,746][12586] Updated weights for policy 0, policy_version 3970 (0.0010)
[2023-02-23 10:24:45,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11691.0). Total num frames: 16277504. Throughput: 0: 2874.9. Samples: 4062882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:24:45,319][07928] Avg episode reward: [(0, '31.402')]
[2023-02-23 10:24:47,318][12586] Updated weights for policy 0, policy_version 3980 (0.0009)
[2023-02-23 10:24:50,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11537.0, 300 sec: 11691.0). Total num frames: 16334848. Throughput: 0: 2872.6. Samples: 4080132. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:24:50,319][07928] Avg episode reward: [(0, '31.379')]
[2023-02-23 10:24:50,821][12586] Updated weights for policy 0, policy_version 3990 (0.0010)
[2023-02-23 10:24:54,337][12586] Updated weights for policy 0, policy_version 4000 (0.0010)
[2023-02-23 10:24:55,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11537.1, 300 sec: 11691.0). Total num frames: 16392192. Throughput: 0: 2868.0. Samples: 4097672. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:24:55,319][07928] Avg episode reward: [(0, '31.139')]
[2023-02-23 10:24:57,912][12586] Updated weights for policy 0, policy_version 4010 (0.0010)
[2023-02-23 10:25:00,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11468.9, 300 sec: 11677.1). Total num frames: 16449536. Throughput: 0: 2863.9. Samples: 4106212. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:25:00,318][07928] Avg episode reward: [(0, '29.004')]
[2023-02-23 10:25:01,369][12586] Updated weights for policy 0, policy_version 4020 (0.0009)
[2023-02-23 10:25:04,857][12586] Updated weights for policy 0, policy_version 4030 (0.0009)
[2023-02-23 10:25:05,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11537.1, 300 sec: 11691.0). Total num frames: 16510976. Throughput: 0: 2869.7. Samples: 4123878. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:25:05,318][07928] Avg episode reward: [(0, '30.233')]
[2023-02-23 10:25:08,369][12586] Updated weights for policy 0, policy_version 4040 (0.0010)
[2023-02-23 10:25:10,316][07928] Fps is (10 sec: 11878.4, 60 sec: 11537.0, 300 sec: 11691.0). Total num frames: 16568320. Throughput: 0: 2882.0. Samples: 4141364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:25:10,319][07928] Avg episode reward: [(0, '32.020')]
[2023-02-23 10:25:10,328][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004045_16568320.pth...
[2023-02-23 10:25:10,390][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003361_13766656.pth
[2023-02-23 10:25:11,965][12586] Updated weights for policy 0, policy_version 4050 (0.0011)
[2023-02-23 10:25:15,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11537.1, 300 sec: 11677.1). Total num frames: 16625664. Throughput: 0: 2892.8. Samples: 4150010. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:25:15,318][07928] Avg episode reward: [(0, '30.166')]
[2023-02-23 10:25:15,408][12586] Updated weights for policy 0, policy_version 4060 (0.0010)
[2023-02-23 10:25:18,919][12586] Updated weights for policy 0, policy_version 4070 (0.0010)
[2023-02-23 10:25:20,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11468.8, 300 sec: 11677.1). Total num frames: 16683008. Throughput: 0: 2910.7. Samples: 4167602. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:25:20,318][07928] Avg episode reward: [(0, '28.119')]
[2023-02-23 10:25:22,629][12586] Updated weights for policy 0, policy_version 4080 (0.0010)
[2023-02-23 10:25:25,316][07928] Fps is (10 sec: 11468.9, 60 sec: 11537.1, 300 sec: 11677.1). Total num frames: 16740352. Throughput: 0: 2885.1. Samples: 4184196. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:25:25,319][07928] Avg episode reward: [(0, '30.226')]
[2023-02-23 10:25:26,387][12586] Updated weights for policy 0, policy_version 4090 (0.0010)
[2023-02-23 10:25:30,051][12586] Updated weights for policy 0, policy_version 4100 (0.0010)
[2023-02-23 10:25:30,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11468.8, 300 sec: 11649.3). Total num frames: 16793600. Throughput: 0: 2882.6. Samples: 4192600. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:25:30,318][07928] Avg episode reward: [(0, '29.269')]
[2023-02-23 10:25:33,785][12586] Updated weights for policy 0, policy_version 4110 (0.0010)
[2023-02-23 10:25:35,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11468.8, 300 sec: 11649.3). Total num frames: 16850944. Throughput: 0: 2866.2. Samples: 4209112. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:25:35,319][07928] Avg episode reward: [(0, '27.494')]
[2023-02-23 10:25:37,472][12586] Updated weights for policy 0, policy_version 4120 (0.0010)
[2023-02-23 10:25:40,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11400.5, 300 sec: 11635.4). Total num frames: 16904192. Throughput: 0: 2838.4. Samples: 4225402. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:25:40,318][07928] Avg episode reward: [(0, '29.727')]
[2023-02-23 10:25:41,231][12586] Updated weights for policy 0, policy_version 4130 (0.0010)
[2023-02-23 10:25:44,898][12586] Updated weights for policy 0, policy_version 4140 (0.0009)
[2023-02-23 10:25:45,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11400.5, 300 sec: 11621.5). Total num frames: 16961536. Throughput: 0: 2835.3. Samples: 4233800. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:25:45,319][07928] Avg episode reward: [(0, '30.776')]
[2023-02-23 10:25:48,596][12586] Updated weights for policy 0, policy_version 4150 (0.0009)
[2023-02-23 10:25:50,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11332.3, 300 sec: 11607.6). Total num frames: 17014784. Throughput: 0: 2810.2. Samples: 4250336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:25:50,319][07928] Avg episode reward: [(0, '31.474')]
[2023-02-23 10:25:52,337][12586] Updated weights for policy 0, policy_version 4160 (0.0011)
[2023-02-23 10:25:55,316][07928] Fps is (10 sec: 10649.5, 60 sec: 11264.0, 300 sec: 11593.8). Total num frames: 17068032. Throughput: 0: 2784.4. Samples: 4266662. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 10:25:55,318][07928] Avg episode reward: [(0, '29.910')]
[2023-02-23 10:25:56,121][12586] Updated weights for policy 0, policy_version 4170 (0.0010)
[2023-02-23 10:25:59,763][12586] Updated weights for policy 0, policy_version 4180 (0.0010)
[2023-02-23 10:26:00,316][07928] Fps is (10 sec: 11059.0, 60 sec: 11264.0, 300 sec: 11593.8). Total num frames: 17125376. Throughput: 0: 2779.9. Samples: 4275108. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:00,319][07928] Avg episode reward: [(0, '28.301')]
[2023-02-23 10:26:03,505][12586] Updated weights for policy 0, policy_version 4190 (0.0010)
[2023-02-23 10:26:05,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11127.5, 300 sec: 11566.0). Total num frames: 17178624. Throughput: 0: 2756.8. Samples: 4291656. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:26:05,318][07928] Avg episode reward: [(0, '28.822')]
[2023-02-23 10:26:07,273][12586] Updated weights for policy 0, policy_version 4200 (0.0010)
[2023-02-23 10:26:10,316][07928] Fps is (10 sec: 11059.4, 60 sec: 11127.5, 300 sec: 11566.0). Total num frames: 17235968. Throughput: 0: 2752.4. Samples: 4308056. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:26:10,319][07928] Avg episode reward: [(0, '29.432')]
[2023-02-23 10:26:10,980][12586] Updated weights for policy 0, policy_version 4210 (0.0010)
[2023-02-23 10:26:14,689][12586] Updated weights for policy 0, policy_version 4220 (0.0009)
[2023-02-23 10:26:15,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11059.2, 300 sec: 11552.1). Total num frames: 17289216. Throughput: 0: 2750.4. Samples: 4316370. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:26:15,318][07928] Avg episode reward: [(0, '33.160')]
[2023-02-23 10:26:18,409][12586] Updated weights for policy 0, policy_version 4230 (0.0010)
[2023-02-23 10:26:20,316][07928] Fps is (10 sec: 10649.5, 60 sec: 10990.9, 300 sec: 11538.2). Total num frames: 17342464. Throughput: 0: 2749.6. Samples: 4332842. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:20,319][07928] Avg episode reward: [(0, '33.464')]
[2023-02-23 10:26:22,227][12586] Updated weights for policy 0, policy_version 4240 (0.0010)
[2023-02-23 10:26:25,316][07928] Fps is (10 sec: 11059.1, 60 sec: 10990.9, 300 sec: 11524.3). Total num frames: 17399808. Throughput: 0: 2750.9. Samples: 4349194. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:25,319][07928] Avg episode reward: [(0, '32.479')]
[2023-02-23 10:26:25,891][12586] Updated weights for policy 0, policy_version 4250 (0.0010)
[2023-02-23 10:26:29,557][12586] Updated weights for policy 0, policy_version 4260 (0.0010)
[2023-02-23 10:26:30,316][07928] Fps is (10 sec: 11468.8, 60 sec: 11059.2, 300 sec: 11524.3). Total num frames: 17457152. Throughput: 0: 2750.0. Samples: 4357550. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:26:30,319][07928] Avg episode reward: [(0, '31.179')]
[2023-02-23 10:26:33,291][12586] Updated weights for policy 0, policy_version 4270 (0.0010)
[2023-02-23 10:26:35,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11510.5). Total num frames: 17510400. Throughput: 0: 2750.4. Samples: 4374102. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:26:35,319][07928] Avg episode reward: [(0, '29.562')]
[2023-02-23 10:26:37,133][12586] Updated weights for policy 0, policy_version 4280 (0.0010)
[2023-02-23 10:26:40,316][07928] Fps is (10 sec: 10649.7, 60 sec: 10990.9, 300 sec: 11482.7). Total num frames: 17563648. Throughput: 0: 2754.4. Samples: 4390610. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:26:40,319][07928] Avg episode reward: [(0, '31.225')]
[2023-02-23 10:26:40,773][12586] Updated weights for policy 0, policy_version 4290 (0.0009)
[2023-02-23 10:26:44,424][12586] Updated weights for policy 0, policy_version 4300 (0.0009)
[2023-02-23 10:26:45,316][07928] Fps is (10 sec: 11059.3, 60 sec: 10990.9, 300 sec: 11482.7). Total num frames: 17620992. Throughput: 0: 2754.3. Samples: 4399052. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:45,319][07928] Avg episode reward: [(0, '31.434')]
[2023-02-23 10:26:48,249][12586] Updated weights for policy 0, policy_version 4310 (0.0011)
[2023-02-23 10:26:50,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11454.9). Total num frames: 17674240. Throughput: 0: 2748.0. Samples: 4415314. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:50,319][07928] Avg episode reward: [(0, '32.764')]
[2023-02-23 10:26:51,973][12586] Updated weights for policy 0, policy_version 4320 (0.0010)
[2023-02-23 10:26:55,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11059.2, 300 sec: 11454.9). Total num frames: 17731584. Throughput: 0: 2751.9. Samples: 4431892. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:26:55,318][07928] Avg episode reward: [(0, '31.208')]
[2023-02-23 10:26:55,663][12586] Updated weights for policy 0, policy_version 4330 (0.0010)
[2023-02-23 10:26:59,335][12586] Updated weights for policy 0, policy_version 4340 (0.0010)
[2023-02-23 10:27:00,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10991.0, 300 sec: 11441.0). Total num frames: 17784832. Throughput: 0: 2755.4. Samples: 4440364. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:00,319][07928] Avg episode reward: [(0, '29.610')]
[2023-02-23 10:27:03,102][12586] Updated weights for policy 0, policy_version 4350 (0.0011)
[2023-02-23 10:27:05,316][07928] Fps is (10 sec: 10649.6, 60 sec: 10990.9, 300 sec: 11413.3). Total num frames: 17838080. Throughput: 0: 2749.3. Samples: 4456560. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:05,319][07928] Avg episode reward: [(0, '30.766')]
[2023-02-23 10:27:06,849][12586] Updated weights for policy 0, policy_version 4360 (0.0010)
[2023-02-23 10:27:10,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11413.3). Total num frames: 17895424. Throughput: 0: 2756.9. Samples: 4473256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:27:10,318][07928] Avg episode reward: [(0, '30.851')]
[2023-02-23 10:27:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004369_17895424.pth...
[2023-02-23 10:27:10,383][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth
[2023-02-23 10:27:10,530][12586] Updated weights for policy 0, policy_version 4370 (0.0010)
[2023-02-23 10:27:14,226][12586] Updated weights for policy 0, policy_version 4380 (0.0009)
[2023-02-23 10:27:15,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11399.4). Total num frames: 17948672. Throughput: 0: 2755.4. Samples: 4481542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:27:15,319][07928] Avg episode reward: [(0, '31.678')]
[2023-02-23 10:27:18,077][12586] Updated weights for policy 0, policy_version 4390 (0.0010)
[2023-02-23 10:27:20,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11059.2, 300 sec: 11399.4). Total num frames: 18006016. Throughput: 0: 2747.7. Samples: 4497748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:27:20,319][07928] Avg episode reward: [(0, '29.802')]
[2023-02-23 10:27:21,789][12586] Updated weights for policy 0, policy_version 4400 (0.0010)
[2023-02-23 10:27:25,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11371.6). Total num frames: 18059264. Throughput: 0: 2751.9. Samples: 4514444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:27:25,319][07928] Avg episode reward: [(0, '30.004')]
[2023-02-23 10:27:25,352][12586] Updated weights for policy 0, policy_version 4410 (0.0010)
[2023-02-23 10:27:28,901][12586] Updated weights for policy 0, policy_version 4420 (0.0010)
[2023-02-23 10:27:30,316][07928] Fps is (10 sec: 11059.1, 60 sec: 10990.9, 300 sec: 11371.6). Total num frames: 18116608. Throughput: 0: 2759.5. Samples: 4523230. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:30,319][07928] Avg episode reward: [(0, '30.341')]
[2023-02-23 10:27:32,607][12586] Updated weights for policy 0, policy_version 4430 (0.0010)
[2023-02-23 10:27:35,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11059.2, 300 sec: 11371.6). Total num frames: 18173952. Throughput: 0: 2769.0. Samples: 4539918. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:35,319][07928] Avg episode reward: [(0, '34.548')]
[2023-02-23 10:27:36,077][12586] Updated weights for policy 0, policy_version 4440 (0.0008)
[2023-02-23 10:27:39,536][12586] Updated weights for policy 0, policy_version 4450 (0.0010)
[2023-02-23 10:27:40,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11195.7, 300 sec: 11371.6). Total num frames: 18235392. Throughput: 0: 2798.5. Samples: 4557826. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:27:40,319][07928] Avg episode reward: [(0, '34.705')]
[2023-02-23 10:27:43,050][12586] Updated weights for policy 0, policy_version 4460 (0.0009)
[2023-02-23 10:27:45,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11195.7, 300 sec: 11371.6). Total num frames: 18292736. Throughput: 0: 2805.0. Samples: 4566590. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:45,319][07928] Avg episode reward: [(0, '33.618')]
[2023-02-23 10:27:46,728][12586] Updated weights for policy 0, policy_version 4470 (0.0010)
[2023-02-23 10:27:50,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11195.7, 300 sec: 11343.8). Total num frames: 18345984. Throughput: 0: 2821.5. Samples: 4583526. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:27:50,319][07928] Avg episode reward: [(0, '31.906')]
[2023-02-23 10:27:50,338][12586] Updated weights for policy 0, policy_version 4480 (0.0009)
[2023-02-23 10:27:53,837][12586] Updated weights for policy 0, policy_version 4490 (0.0009)
[2023-02-23 10:27:55,331][07928] Fps is (10 sec: 11452.2, 60 sec: 11261.3, 300 sec: 11357.2). Total num frames: 18407424. Throughput: 0: 2836.6. Samples: 4600942. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:27:55,333][07928] Avg episode reward: [(0, '30.245')]
[2023-02-23 10:27:57,354][12586] Updated weights for policy 0, policy_version 4500 (0.0010)
[2023-02-23 10:28:00,316][07928] Fps is (10 sec: 11468.7, 60 sec: 11264.0, 300 sec: 11343.8). Total num frames: 18460672. Throughput: 0: 2844.9. Samples: 4609564. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:28:00,319][07928] Avg episode reward: [(0, '28.957')]
[2023-02-23 10:28:01,061][12586] Updated weights for policy 0, policy_version 4510 (0.0010)
[2023-02-23 10:28:04,519][12586] Updated weights for policy 0, policy_version 4520 (0.0011)
[2023-02-23 10:28:05,316][07928] Fps is (10 sec: 11485.5, 60 sec: 11400.5, 300 sec: 11343.8). Total num frames: 18522112. Throughput: 0: 2866.1. Samples: 4626724. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:28:05,319][07928] Avg episode reward: [(0, '28.097')]
[2023-02-23 10:28:08,030][12586] Updated weights for policy 0, policy_version 4530 (0.0010)
[2023-02-23 10:28:10,316][07928] Fps is (10 sec: 11878.5, 60 sec: 11400.5, 300 sec: 11343.8). Total num frames: 18579456. Throughput: 0: 2886.3. Samples: 4644328. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:28:10,319][07928] Avg episode reward: [(0, '29.610')]
[2023-02-23 10:28:11,617][12586] Updated weights for policy 0, policy_version 4540 (0.0010)
[2023-02-23 10:28:15,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11400.5, 300 sec: 11330.0). Total num frames: 18632704. Throughput: 0: 2872.1. Samples: 4652476. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:28:15,319][07928] Avg episode reward: [(0, '32.062')]
[2023-02-23 10:28:15,488][12586] Updated weights for policy 0, policy_version 4550 (0.0011)
[2023-02-23 10:28:19,182][12586] Updated weights for policy 0, policy_version 4560 (0.0011)
[2023-02-23 10:28:20,316][07928] Fps is (10 sec: 11059.0, 60 sec: 11400.5, 300 sec: 11329.9). Total num frames: 18690048. Throughput: 0: 2860.3. Samples: 4668632. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:28:20,319][07928] Avg episode reward: [(0, '32.626')]
[2023-02-23 10:28:22,880][12586] Updated weights for policy 0, policy_version 4570 (0.0009)
[2023-02-23 10:28:25,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11400.5, 300 sec: 11302.2). Total num frames: 18743296. Throughput: 0: 2835.1. Samples: 4685406. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:28:25,319][07928] Avg episode reward: [(0, '31.837')]
[2023-02-23 10:28:26,573][12586] Updated weights for policy 0, policy_version 4580 (0.0010)
[2023-02-23 10:28:30,316][07928] Fps is (10 sec: 10649.7, 60 sec: 11332.3, 300 sec: 11288.3). Total num frames: 18796544. Throughput: 0: 2820.9. Samples: 4693530. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:28:30,319][07928] Avg episode reward: [(0, '30.619')]
[2023-02-23 10:28:30,354][12586] Updated weights for policy 0, policy_version 4590 (0.0010)
[2023-02-23 10:28:34,027][12586] Updated weights for policy 0, policy_version 4600 (0.0010)
[2023-02-23 10:28:35,316][07928] Fps is (10 sec: 11059.0, 60 sec: 11332.3, 300 sec: 11288.3). Total num frames: 18853888. Throughput: 0: 2811.5. Samples: 4710046. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:28:35,318][07928] Avg episode reward: [(0, '30.610')]
[2023-02-23 10:28:37,735][12586] Updated weights for policy 0, policy_version 4610 (0.0011)
[2023-02-23 10:28:40,316][07928] Fps is (10 sec: 11059.2, 60 sec: 11195.7, 300 sec: 11260.5). Total num frames: 18907136. Throughput: 0: 2794.3. Samples: 4726646. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:28:40,319][07928] Avg episode reward: [(0, '32.121')]
[2023-02-23 10:28:41,476][12586] Updated weights for policy 0, policy_version 4620 (0.0010)
[2023-02-23 10:28:45,279][12586] Updated weights for policy 0, policy_version 4630 (0.0010)
[2023-02-23 10:28:45,316][07928] Fps is (10 sec: 11059.4, 60 sec: 11195.7, 300 sec: 11260.5). Total num frames: 18964480. Throughput: 0: 2780.0. Samples: 4734664. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:28:45,319][07928] Avg episode reward: [(0, '30.664')]
[2023-02-23 10:28:48,930][12586] Updated weights for policy 0, policy_version 4640 (0.0009)
[2023-02-23 10:28:50,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11195.7, 300 sec: 11246.6). Total num frames: 19017728. Throughput: 0: 2768.9. Samples: 4751326. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:28:50,318][07928] Avg episode reward: [(0, '30.023')]
[2023-02-23 10:28:52,567][12586] Updated weights for policy 0, policy_version 4650 (0.0009)
[2023-02-23 10:28:55,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11130.1, 300 sec: 11232.8). Total num frames: 19075072. Throughput: 0: 2748.3. Samples: 4768004. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:28:55,319][07928] Avg episode reward: [(0, '32.518')]
[2023-02-23 10:28:56,387][12586] Updated weights for policy 0, policy_version 4660 (0.0010)
[2023-02-23 10:29:00,147][12586] Updated weights for policy 0, policy_version 4670 (0.0010)
[2023-02-23 10:29:00,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11127.5, 300 sec: 11218.9). Total num frames: 19128320. Throughput: 0: 2744.5. Samples: 4775978. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:29:00,318][07928] Avg episode reward: [(0, '31.227')]
[2023-02-23 10:29:03,794][12586] Updated weights for policy 0, policy_version 4680 (0.0010)
[2023-02-23 10:29:05,316][07928] Fps is (10 sec: 11059.3, 60 sec: 11059.2, 300 sec: 11218.9). Total num frames: 19185664. Throughput: 0: 2754.9. Samples: 4792602. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:29:05,318][07928] Avg episode reward: [(0, '29.746')]
[2023-02-23 10:29:07,432][12586] Updated weights for policy 0, policy_version 4690 (0.0010)
[2023-02-23 10:29:10,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10990.9, 300 sec: 11205.0). Total num frames: 19238912. Throughput: 0: 2751.9. Samples: 4809242. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:29:10,318][07928] Avg episode reward: [(0, '30.221')]
[2023-02-23 10:29:10,327][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004697_19238912.pth...
[2023-02-23 10:29:10,387][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004045_16568320.pth
[2023-02-23 10:29:11,228][12586] Updated weights for policy 0, policy_version 4700 (0.0010)
[2023-02-23 10:29:14,937][12586] Updated weights for policy 0, policy_version 4710 (0.0010)
[2023-02-23 10:29:15,316][07928] Fps is (10 sec: 10649.6, 60 sec: 10990.9, 300 sec: 11177.2). Total num frames: 19292160. Throughput: 0: 2751.6. Samples: 4817352. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:15,319][07928] Avg episode reward: [(0, '28.669')]
[2023-02-23 10:29:18,698][12586] Updated weights for policy 0, policy_version 4720 (0.0010)
[2023-02-23 10:29:20,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10991.0, 300 sec: 11191.1). Total num frames: 19349504. Throughput: 0: 2752.0. Samples: 4833886. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:29:20,318][07928] Avg episode reward: [(0, '28.059')]
[2023-02-23 10:29:22,328][12586] Updated weights for policy 0, policy_version 4730 (0.0010)
[2023-02-23 10:29:25,316][07928] Fps is (10 sec: 11059.1, 60 sec: 10990.9, 300 sec: 11177.2). Total num frames: 19402752. Throughput: 0: 2749.9. Samples: 4850392. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:25,320][07928] Avg episode reward: [(0, '29.552')]
[2023-02-23 10:29:26,141][12586] Updated weights for policy 0, policy_version 4740 (0.0010)
[2023-02-23 10:29:29,941][12586] Updated weights for policy 0, policy_version 4750 (0.0010)
[2023-02-23 10:29:30,316][07928] Fps is (10 sec: 11059.1, 60 sec: 11059.2, 300 sec: 11177.2). Total num frames: 19460096. Throughput: 0: 2753.4. Samples: 4858568. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:29:30,319][07928] Avg episode reward: [(0, '32.846')]
[2023-02-23 10:29:33,714][12586] Updated weights for policy 0, policy_version 4760 (0.0010)
[2023-02-23 10:29:35,316][07928] Fps is (10 sec: 11059.3, 60 sec: 10991.0, 300 sec: 11163.3). Total num frames: 19513344. Throughput: 0: 2743.4. Samples: 4874778. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:29:35,318][07928] Avg episode reward: [(0, '31.482')]
[2023-02-23 10:29:37,563][12586] Updated weights for policy 0, policy_version 4770 (0.0010)
[2023-02-23 10:29:40,316][07928] Fps is (10 sec: 10239.9, 60 sec: 10922.7, 300 sec: 11135.6). Total num frames: 19562496. Throughput: 0: 2721.5. Samples: 4890470. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:40,318][07928] Avg episode reward: [(0, '30.771')]
[2023-02-23 10:29:41,497][12586] Updated weights for policy 0, policy_version 4780 (0.0010)
[2023-02-23 10:29:45,211][12586] Updated weights for policy 0, policy_version 4790 (0.0010)
[2023-02-23 10:29:45,316][07928] Fps is (10 sec: 10649.6, 60 sec: 10922.7, 300 sec: 11135.6). Total num frames: 19619840. Throughput: 0: 2725.0. Samples: 4898604. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:45,318][07928] Avg episode reward: [(0, '32.326')]
[2023-02-23 10:29:48,917][12586] Updated weights for policy 0, policy_version 4800 (0.0010)
[2023-02-23 10:29:50,316][07928] Fps is (10 sec: 11059.4, 60 sec: 10922.7, 300 sec: 11121.7). Total num frames: 19673088. Throughput: 0: 2723.1. Samples: 4915140. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:50,318][07928] Avg episode reward: [(0, '34.064')]
[2023-02-23 10:29:52,655][12586] Updated weights for policy 0, policy_version 4810 (0.0011)
[2023-02-23 10:29:55,316][07928] Fps is (10 sec: 11059.2, 60 sec: 10922.7, 300 sec: 11121.7). Total num frames: 19730432. Throughput: 0: 2716.2. Samples: 4931472. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:29:55,318][07928] Avg episode reward: [(0, '32.391')]
[2023-02-23 10:29:56,413][12586] Updated weights for policy 0, policy_version 4820 (0.0010)
[2023-02-23 10:30:00,144][12586] Updated weights for policy 0, policy_version 4830 (0.0009)
[2023-02-23 10:30:00,316][07928] Fps is (10 sec: 11059.1, 60 sec: 10922.7, 300 sec: 11093.9). Total num frames: 19783680. Throughput: 0: 2721.5. Samples: 4939818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 10:30:00,318][07928] Avg episode reward: [(0, '32.075')]
[2023-02-23 10:30:03,782][12586] Updated weights for policy 0, policy_version 4840 (0.0011)
[2023-02-23 10:30:05,316][07928] Fps is (10 sec: 11059.1, 60 sec: 10922.6, 300 sec: 11093.9). Total num frames: 19841024. Throughput: 0: 2725.5. Samples: 4956534. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:30:05,318][07928] Avg episode reward: [(0, '32.543')]
[2023-02-23 10:30:07,567][12586] Updated weights for policy 0, policy_version 4850 (0.0010)
[2023-02-23 10:30:10,316][07928] Fps is (10 sec: 11059.3, 60 sec: 10922.7, 300 sec: 11080.0). Total num frames: 19894272. Throughput: 0: 2718.7. Samples: 4972732. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 10:30:10,319][07928] Avg episode reward: [(0, '32.644')]
[2023-02-23 10:30:11,317][12586] Updated weights for policy 0, policy_version 4860 (0.0010)
[2023-02-23 10:30:15,019][12586] Updated weights for policy 0, policy_version 4870 (0.0010)
[2023-02-23 10:30:15,316][07928] Fps is (10 sec: 10649.7, 60 sec: 10922.7, 300 sec: 11066.1). Total num frames: 19947520. Throughput: 0: 2721.8. Samples: 4981048. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 10:30:15,319][07928] Avg episode reward: [(0, '33.707')]
[2023-02-23 10:30:18,749][12586] Updated weights for policy 0, policy_version 4880 (0.0010)
[2023-02-23 10:30:20,242][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
[2023-02-23 10:30:20,243][07928] Component Batcher_0 stopped!
[2023-02-23 10:30:20,246][07928] Component RolloutWorker_w1 process died already! Don't wait for it.
[2023-02-23 10:30:20,243][12572] Stopping Batcher_0...
[2023-02-23 10:30:20,249][12572] Loop batcher_evt_loop terminating...
[2023-02-23 10:30:20,249][07928] Component RolloutWorker_w2 process died already! Don't wait for it.
[2023-02-23 10:30:20,251][07928] Component RolloutWorker_w4 process died already! Don't wait for it.
[2023-02-23 10:30:20,253][07928] Component RolloutWorker_w7 process died already! Don't wait for it.
[2023-02-23 10:30:20,255][12607] Stopping RolloutWorker_w6...
[2023-02-23 10:30:20,255][12607] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 10:30:20,256][07928] Component RolloutWorker_w6 stopped!
[2023-02-23 10:30:20,257][12588] Stopping RolloutWorker_w0...
[2023-02-23 10:30:20,258][12588] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 10:30:20,258][07928] Component RolloutWorker_w0 stopped!
[2023-02-23 10:30:20,259][12586] Weights refcount: 2 0
[2023-02-23 10:30:20,261][12586] Stopping InferenceWorker_p0-w0...
[2023-02-23 10:30:20,262][12586] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 10:30:20,261][07928] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 10:30:20,265][12608] Stopping RolloutWorker_w5...
[2023-02-23 10:30:20,265][12608] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 10:30:20,265][07928] Component RolloutWorker_w5 stopped!
[2023-02-23 10:30:20,272][12590] Stopping RolloutWorker_w3...
[2023-02-23 10:30:20,273][12590] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 10:30:20,272][07928] Component RolloutWorker_w3 stopped!
[2023-02-23 10:30:20,307][12572] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004369_17895424.pth
[2023-02-23 10:30:20,313][12572] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
[2023-02-23 10:30:20,392][12572] Stopping LearnerWorker_p0...
[2023-02-23 10:30:20,393][12572] Loop learner_proc0_evt_loop terminating...
[2023-02-23 10:30:20,393][07928] Component LearnerWorker_p0 stopped!
[2023-02-23 10:30:20,396][07928] Waiting for process learner_proc0 to stop...
[2023-02-23 10:30:21,886][07928] Waiting for process inference_proc0-0 to join...
[2023-02-23 10:30:21,889][07928] Waiting for process rollout_proc0 to join...
[2023-02-23 10:30:21,891][07928] Waiting for process rollout_proc1 to join...
[2023-02-23 10:30:21,893][07928] Waiting for process rollout_proc2 to join...
[2023-02-23 10:30:21,895][07928] Waiting for process rollout_proc3 to join...
[2023-02-23 10:30:21,897][07928] Waiting for process rollout_proc4 to join...
[2023-02-23 10:30:21,898][07928] Waiting for process rollout_proc5 to join...
[2023-02-23 10:30:21,901][07928] Waiting for process rollout_proc6 to join...
[2023-02-23 10:30:21,902][07928] Waiting for process rollout_proc7 to join...
[2023-02-23 10:30:21,904][07928] Batcher 0 profile tree view:
batching: 74.3165, releasing_batches: 0.1040
[2023-02-23 10:30:21,906][07928] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 25.5259
update_model: 23.4186
  weight_update: 0.0009
one_step: 0.0024
  handle_policy_step: 1588.9123
    deserialize: 50.7068, stack: 9.6286, obs_to_device_normalize: 364.3744, forward: 754.5875, send_messages: 95.9341
    prepare_outputs: 236.1393
      to_cpu: 145.2426
[2023-02-23 10:30:21,907][07928] Learner 0 profile tree view:
misc: 0.0290, prepare_batch: 34.3051
train: 89.8894
  epoch_init: 0.0258, minibatch_init: 0.0256, losses_postprocess: 2.3762, kl_divergence: 2.8221, after_optimizer: 6.4989
  calculate_losses: 35.8647
    losses_init: 0.0151, forward_head: 5.0676, bptt_initial: 15.4692, tail: 2.8205, advantages_returns: 0.7707, losses: 4.8844
    bptt: 6.0336
      bptt_forward_core: 5.7997
  update: 40.6677
    clip: 5.0121
[2023-02-23 10:30:21,909][07928] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.4081, enqueue_policy_requests: 68.0882, env_step: 1057.1057, overhead: 94.3103, complete_rollouts: 2.2493
save_policy_outputs: 76.6386
  split_output_tensors: 37.6363
[2023-02-23 10:30:21,911][07928] Loop Runner_EvtLoop terminating...
[2023-02-23 10:30:21,914][07928] Runner profile tree view:
main_loop: 1746.8731
[2023-02-23 10:30:21,915][07928] Collected {0: 20004864}, FPS: 11451.8
[2023-02-23 10:34:04,080][07928] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 10:34:04,082][07928] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 10:34:04,084][07928] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 10:34:04,085][07928] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 10:34:04,087][07928] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 10:34:04,089][07928] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 10:34:04,090][07928] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 10:34:04,092][07928] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 10:34:04,093][07928] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 10:34:04,095][07928] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 10:34:04,096][07928] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 10:34:04,098][07928] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 10:34:04,099][07928] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 10:34:04,101][07928] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 10:34:04,102][07928] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 10:34:04,120][07928] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 10:34:04,123][07928] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 10:34:04,127][07928] RunningMeanStd input shape: (1,)
[2023-02-23 10:34:04,146][07928] ConvEncoder: input_channels=3
[2023-02-23 10:34:04,977][07928] Conv encoder output size: 512
[2023-02-23 10:34:04,981][07928] Policy head output size: 512
[2023-02-23 10:34:07,930][07928] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
[2023-02-23 10:34:09,832][07928] Num frames 100...
[2023-02-23 10:34:09,956][07928] Num frames 200...
[2023-02-23 10:34:10,082][07928] Num frames 300...
[2023-02-23 10:34:10,211][07928] Num frames 400...
[2023-02-23 10:34:10,336][07928] Num frames 500...
[2023-02-23 10:34:10,479][07928] Avg episode rewards: #0: 11.690, true rewards: #0: 5.690
[2023-02-23 10:34:10,481][07928] Avg episode reward: 11.690, avg true_objective: 5.690
[2023-02-23 10:34:10,520][07928] Num frames 600...
[2023-02-23 10:34:10,647][07928] Num frames 700...
[2023-02-23 10:34:10,773][07928] Num frames 800...
[2023-02-23 10:34:10,902][07928] Num frames 900...
[2023-02-23 10:34:11,078][07928] Avg episode rewards: #0: 9.985, true rewards: #0: 4.985
[2023-02-23 10:34:11,080][07928] Avg episode reward: 9.985, avg true_objective: 4.985
[2023-02-23 10:34:11,085][07928] Num frames 1000...
[2023-02-23 10:34:11,202][07928] Num frames 1100...
[2023-02-23 10:34:11,315][07928] Num frames 1200...
[2023-02-23 10:34:11,434][07928] Num frames 1300...
[2023-02-23 10:34:11,547][07928] Num frames 1400...
[2023-02-23 10:34:11,661][07928] Num frames 1500...
[2023-02-23 10:34:11,778][07928] Num frames 1600...
[2023-02-23 10:34:11,897][07928] Num frames 1700...
[2023-02-23 10:34:12,020][07928] Num frames 1800...
[2023-02-23 10:34:12,143][07928] Num frames 1900...
[2023-02-23 10:34:12,290][07928] Num frames 2000...
[2023-02-23 10:34:12,415][07928] Num frames 2100...
[2023-02-23 10:34:12,539][07928] Num frames 2200...
[2023-02-23 10:34:12,664][07928] Num frames 2300...
[2023-02-23 10:34:12,793][07928] Num frames 2400...
[2023-02-23 10:34:12,917][07928] Num frames 2500...
[2023-02-23 10:34:13,041][07928] Num frames 2600...
[2023-02-23 10:34:13,166][07928] Num frames 2700...
[2023-02-23 10:34:13,293][07928] Num frames 2800...
[2023-02-23 10:34:13,415][07928] Num frames 2900...
[2023-02-23 10:34:13,557][07928] Avg episode rewards: #0: 23.906, true rewards: #0: 9.907
[2023-02-23 10:34:13,560][07928] Avg episode reward: 23.906, avg true_objective: 9.907
[2023-02-23 10:34:13,595][07928] Num frames 3000...
[2023-02-23 10:34:13,712][07928] Num frames 3100...
[2023-02-23 10:34:13,826][07928] Num frames 3200...
[2023-02-23 10:34:13,940][07928] Num frames 3300...
[2023-02-23 10:34:14,053][07928] Num frames 3400...
[2023-02-23 10:34:14,171][07928] Num frames 3500...
[2023-02-23 10:34:14,290][07928] Num frames 3600...
[2023-02-23 10:34:14,410][07928] Num frames 3700...
[2023-02-23 10:34:14,534][07928] Num frames 3800...
[2023-02-23 10:34:14,657][07928] Num frames 3900...
[2023-02-23 10:34:14,778][07928] Num frames 4000...
[2023-02-23 10:34:14,900][07928] Num frames 4100...
[2023-02-23 10:34:15,020][07928] Num frames 4200...
[2023-02-23 10:34:15,142][07928] Num frames 4300...
[2023-02-23 10:34:15,287][07928] Num frames 4400...
[2023-02-23 10:34:15,411][07928] Num frames 4500...
[2023-02-23 10:34:15,534][07928] Num frames 4600...
[2023-02-23 10:34:15,662][07928] Num frames 4700...
[2023-02-23 10:34:15,787][07928] Num frames 4800...
[2023-02-23 10:34:15,910][07928] Num frames 4900...
[2023-02-23 10:34:16,034][07928] Num frames 5000...
[2023-02-23 10:34:16,174][07928] Avg episode rewards: #0: 32.679, true rewards: #0: 12.680
[2023-02-23 10:34:16,176][07928] Avg episode reward: 32.679, avg true_objective: 12.680
[2023-02-23 10:34:16,210][07928] Num frames 5100...
[2023-02-23 10:34:16,332][07928] Num frames 5200...
[2023-02-23 10:34:16,447][07928] Num frames 5300...
[2023-02-23 10:34:16,561][07928] Num frames 5400...
[2023-02-23 10:34:16,678][07928] Num frames 5500...
[2023-02-23 10:34:16,793][07928] Num frames 5600...
[2023-02-23 10:34:16,908][07928] Num frames 5700...
[2023-02-23 10:34:17,027][07928] Num frames 5800...
[2023-02-23 10:34:17,146][07928] Num frames 5900...
[2023-02-23 10:34:17,266][07928] Num frames 6000...
[2023-02-23 10:34:17,389][07928] Num frames 6100...
[2023-02-23 10:34:17,512][07928] Num frames 6200...
[2023-02-23 10:34:17,632][07928] Num frames 6300...
[2023-02-23 10:34:17,799][07928] Avg episode rewards: #0: 32.966, true rewards: #0: 12.766
[2023-02-23 10:34:17,801][07928] Avg episode reward: 32.966, avg true_objective: 12.766
[2023-02-23 10:34:17,825][07928] Num frames 6400...
[2023-02-23 10:34:17,944][07928] Num frames 6500...
[2023-02-23 10:34:18,065][07928] Num frames 6600...
[2023-02-23 10:34:18,181][07928] Num frames 6700...
[2023-02-23 10:34:18,305][07928] Num frames 6800...
[2023-02-23 10:34:18,428][07928] Num frames 6900...
[2023-02-23 10:34:18,553][07928] Num frames 7000...
[2023-02-23 10:34:18,668][07928] Num frames 7100...
[2023-02-23 10:34:18,783][07928] Num frames 7200...
[2023-02-23 10:34:18,899][07928] Num frames 7300...
[2023-02-23 10:34:19,018][07928] Num frames 7400...
[2023-02-23 10:34:19,136][07928] Num frames 7500...
[2023-02-23 10:34:19,255][07928] Num frames 7600...
[2023-02-23 10:34:19,381][07928] Num frames 7700...
[2023-02-23 10:34:19,511][07928] Num frames 7800...
[2023-02-23 10:34:19,641][07928] Num frames 7900...
[2023-02-23 10:34:19,769][07928] Num frames 8000...
[2023-02-23 10:34:19,901][07928] Avg episode rewards: #0: 35.266, true rewards: #0: 13.433
[2023-02-23 10:34:19,903][07928] Avg episode reward: 35.266, avg true_objective: 13.433
[2023-02-23 10:34:19,958][07928] Num frames 8100...
[2023-02-23 10:34:20,085][07928] Num frames 8200...
[2023-02-23 10:34:20,212][07928] Num frames 8300...
[2023-02-23 10:34:20,330][07928] Num frames 8400...
[2023-02-23 10:34:20,451][07928] Num frames 8500...
[2023-02-23 10:34:20,578][07928] Num frames 8600...
[2023-02-23 10:34:20,699][07928] Num frames 8700...
[2023-02-23 10:34:20,821][07928] Num frames 8800...
[2023-02-23 10:34:20,941][07928] Num frames 8900...
[2023-02-23 10:34:21,055][07928] Num frames 9000...
[2023-02-23 10:34:21,172][07928] Num frames 9100...
[2023-02-23 10:34:21,291][07928] Num frames 9200...
[2023-02-23 10:34:21,409][07928] Num frames 9300...
[2023-02-23 10:34:21,523][07928] Num frames 9400...
[2023-02-23 10:34:21,638][07928] Num frames 9500...
[2023-02-23 10:34:21,760][07928] Num frames 9600...
[2023-02-23 10:34:21,884][07928] Num frames 9700...
[2023-02-23 10:34:22,005][07928] Num frames 9800...
[2023-02-23 10:34:22,131][07928] Num frames 9900...
[2023-02-23 10:34:22,255][07928] Num frames 10000...
[2023-02-23 10:34:22,375][07928] Num frames 10100...
[2023-02-23 10:34:22,506][07928] Avg episode rewards: #0: 38.942, true rewards: #0: 14.514
[2023-02-23 10:34:22,508][07928] Avg episode reward: 38.942, avg true_objective: 14.514
[2023-02-23 10:34:22,558][07928] Num frames 10200...
[2023-02-23 10:34:22,677][07928] Num frames 10300...
[2023-02-23 10:34:22,797][07928] Num frames 10400...
[2023-02-23 10:34:22,915][07928] Num frames 10500...
[2023-02-23 10:34:23,037][07928] Num frames 10600...
[2023-02-23 10:34:23,162][07928] Num frames 10700...
[2023-02-23 10:34:23,228][07928] Avg episode rewards: #0: 35.759, true rewards: #0: 13.385
[2023-02-23 10:34:23,230][07928] Avg episode reward: 35.759, avg true_objective: 13.385
[2023-02-23 10:34:23,344][07928] Num frames 10800...
[2023-02-23 10:34:23,465][07928] Num frames 10900...
[2023-02-23 10:34:23,582][07928] Num frames 11000...
[2023-02-23 10:34:23,696][07928] Num frames 11100...
[2023-02-23 10:34:23,807][07928] Num frames 11200...
[2023-02-23 10:34:23,921][07928] Num frames 11300...
[2023-02-23 10:34:24,033][07928] Num frames 11400...
[2023-02-23 10:34:24,152][07928] Num frames 11500...
[2023-02-23 10:34:24,271][07928] Num frames 11600...
[2023-02-23 10:34:24,390][07928] Num frames 11700...
[2023-02-23 10:34:24,524][07928] Num frames 11800...
[2023-02-23 10:34:24,648][07928] Num frames 11900...
[2023-02-23 10:34:24,775][07928] Num frames 12000...
[2023-02-23 10:34:24,897][07928] Num frames 12100...
[2023-02-23 10:34:25,025][07928] Num frames 12200...
[2023-02-23 10:34:25,151][07928] Num frames 12300...
[2023-02-23 10:34:25,276][07928] Num frames 12400...
[2023-02-23 10:34:25,403][07928] Num frames 12500...
[2023-02-23 10:34:25,533][07928] Num frames 12600...
[2023-02-23 10:34:25,661][07928] Num frames 12700...
[2023-02-23 10:34:25,782][07928] Num frames 12800...
[2023-02-23 10:34:25,848][07928] Avg episode rewards: #0: 38.675, true rewards: #0: 14.231
[2023-02-23 10:34:25,850][07928] Avg episode reward: 38.675, avg true_objective: 14.231
[2023-02-23 10:34:25,963][07928] Num frames 12900...
[2023-02-23 10:34:26,075][07928] Num frames 13000...
[2023-02-23 10:34:26,190][07928] Num frames 13100...
[2023-02-23 10:34:26,302][07928] Num frames 13200...
[2023-02-23 10:34:26,416][07928] Num frames 13300...
[2023-02-23 10:34:26,532][07928] Num frames 13400...
[2023-02-23 10:34:26,650][07928] Num frames 13500...
[2023-02-23 10:34:26,767][07928] Num frames 13600...
[2023-02-23 10:34:26,905][07928] Avg episode rewards: #0: 36.571, true rewards: #0: 13.672
[2023-02-23 10:34:26,908][07928] Avg episode reward: 36.571, avg true_objective: 13.672
[2023-02-23 10:34:59,650][07928] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 10:40:24,704][07928] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 10:40:24,706][07928] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 10:40:24,707][07928] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 10:40:24,710][07928] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 10:40:24,711][07928] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 10:40:24,712][07928] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 10:40:24,714][07928] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 10:40:24,716][07928] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 10:40:24,718][07928] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 10:40:24,719][07928] Adding new argument 'hf_repository'='Unterwexi/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 10:40:24,722][07928] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 10:40:24,723][07928] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 10:40:24,725][07928] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 10:40:24,726][07928] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 10:40:24,727][07928] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 10:40:24,745][07928] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 10:40:24,748][07928] RunningMeanStd input shape: (1,)
[2023-02-23 10:40:24,763][07928] ConvEncoder: input_channels=3
[2023-02-23 10:40:24,804][07928] Conv encoder output size: 512
[2023-02-23 10:40:24,806][07928] Policy head output size: 512
[2023-02-23 10:40:24,831][07928] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth...
[2023-02-23 10:40:25,305][07928] Num frames 100...
[2023-02-23 10:40:25,415][07928] Num frames 200...
[2023-02-23 10:40:25,523][07928] Num frames 300...
[2023-02-23 10:40:25,633][07928] Num frames 400...
[2023-02-23 10:40:25,746][07928] Num frames 500...
[2023-02-23 10:40:25,858][07928] Num frames 600...
[2023-02-23 10:40:25,970][07928] Num frames 700...
[2023-02-23 10:40:26,079][07928] Num frames 800...
[2023-02-23 10:40:26,189][07928] Num frames 900...
[2023-02-23 10:40:26,297][07928] Num frames 1000...
[2023-02-23 10:40:26,406][07928] Num frames 1100...
[2023-02-23 10:40:26,515][07928] Num frames 1200...
[2023-02-23 10:40:26,628][07928] Num frames 1300...
[2023-02-23 10:40:26,739][07928] Num frames 1400...
[2023-02-23 10:40:26,808][07928] Avg episode rewards: #0: 37.120, true rewards: #0: 14.120
[2023-02-23 10:40:26,810][07928] Avg episode reward: 37.120, avg true_objective: 14.120
[2023-02-23 10:40:26,908][07928] Num frames 1500...
[2023-02-23 10:40:27,032][07928] Num frames 1600...
[2023-02-23 10:40:27,145][07928] Num frames 1700...
[2023-02-23 10:40:27,258][07928] Num frames 1800...
[2023-02-23 10:40:27,368][07928] Num frames 1900...
[2023-02-23 10:40:27,481][07928] Num frames 2000...
[2023-02-23 10:40:27,591][07928] Num frames 2100...
[2023-02-23 10:40:27,703][07928] Num frames 2200...
[2023-02-23 10:40:27,812][07928] Num frames 2300...
[2023-02-23 10:40:27,925][07928] Num frames 2400...
[2023-02-23 10:40:28,041][07928] Num frames 2500...
[2023-02-23 10:40:28,153][07928] Num frames 2600...
[2023-02-23 10:40:28,302][07928] Avg episode rewards: #0: 35.435, true rewards: #0: 13.435
[2023-02-23 10:40:28,304][07928] Avg episode reward: 35.435, avg true_objective: 13.435
[2023-02-23 10:40:28,320][07928] Num frames 2700...
[2023-02-23 10:40:28,429][07928] Num frames 2800...
[2023-02-23 10:40:28,543][07928] Num frames 2900...
[2023-02-23 10:40:28,655][07928] Num frames 3000...
[2023-02-23 10:40:28,764][07928] Num frames 3100...
[2023-02-23 10:40:28,874][07928] Num frames 3200...
[2023-02-23 10:40:28,998][07928] Num frames 3300...
[2023-02-23 10:40:29,109][07928] Num frames 3400...
[2023-02-23 10:40:29,217][07928] Num frames 3500...
[2023-02-23 10:40:29,327][07928] Num frames 3600...
[2023-02-23 10:40:29,437][07928] Num frames 3700...
[2023-02-23 10:40:29,552][07928] Num frames 3800...
[2023-02-23 10:40:29,663][07928] Num frames 3900...
[2023-02-23 10:40:29,784][07928] Avg episode rewards: #0: 35.203, true rewards: #0: 13.203
[2023-02-23 10:40:29,786][07928] Avg episode reward: 35.203, avg true_objective: 13.203
[2023-02-23 10:40:29,831][07928] Num frames 4000...
[2023-02-23 10:40:29,940][07928] Num frames 4100...
[2023-02-23 10:40:30,055][07928] Num frames 4200...
[2023-02-23 10:40:30,167][07928] Num frames 4300...
[2023-02-23 10:40:30,280][07928] Num frames 4400...
[2023-02-23 10:40:30,396][07928] Num frames 4500...
[2023-02-23 10:40:30,507][07928] Num frames 4600...
[2023-02-23 10:40:30,619][07928] Num frames 4700...
[2023-02-23 10:40:30,730][07928] Num frames 4800...
[2023-02-23 10:40:30,841][07928] Num frames 4900...
[2023-02-23 10:40:30,950][07928] Num frames 5000...
[2023-02-23 10:40:31,061][07928] Num frames 5100...
[2023-02-23 10:40:31,169][07928] Num frames 5200...
[2023-02-23 10:40:31,278][07928] Num frames 5300...
[2023-02-23 10:40:31,388][07928] Num frames 5400...
[2023-02-23 10:40:31,503][07928] Num frames 5500...
[2023-02-23 10:40:31,616][07928] Num frames 5600...
[2023-02-23 10:40:31,729][07928] Num frames 5700...
[2023-02-23 10:40:31,841][07928] Num frames 5800...
[2023-02-23 10:40:31,954][07928] Num frames 5900...
[2023-02-23 10:40:32,085][07928] Num frames 6000...
[2023-02-23 10:40:32,208][07928] Avg episode rewards: #0: 41.152, true rewards: #0: 15.152
[2023-02-23 10:40:32,210][07928] Avg episode reward: 41.152, avg true_objective: 15.152
[2023-02-23 10:40:32,256][07928] Num frames 6100...
[2023-02-23 10:40:32,367][07928] Num frames 6200...
[2023-02-23 10:40:32,479][07928] Num frames 6300...
[2023-02-23 10:40:32,591][07928] Num frames 6400...
[2023-02-23 10:40:32,703][07928] Num frames 6500...
[2023-02-23 10:40:32,817][07928] Num frames 6600...
[2023-02-23 10:40:32,933][07928] Num frames 6700...
[2023-02-23 10:40:33,057][07928] Num frames 6800...
[2023-02-23 10:40:33,189][07928] Num frames 6900...
[2023-02-23 10:40:33,298][07928] Num frames 7000...
[2023-02-23 10:40:33,410][07928] Num frames 7100...
[2023-02-23 10:40:33,535][07928] Num frames 7200...
[2023-02-23 10:40:33,650][07928] Num frames 7300...
[2023-02-23 10:40:33,762][07928] Num frames 7400...
[2023-02-23 10:40:33,876][07928] Num frames 7500...
[2023-02-23 10:40:33,988][07928] Num frames 7600...
[2023-02-23 10:40:34,157][07928] Avg episode rewards: #0: 41.786, true rewards: #0: 15.386
[2023-02-23 10:40:34,159][07928] Avg episode reward: 41.786, avg true_objective: 15.386
[2023-02-23 10:40:34,169][07928] Num frames 7700...
[2023-02-23 10:40:34,283][07928] Num frames 7800...
[2023-02-23 10:40:34,397][07928] Num frames 7900...
[2023-02-23 10:40:34,513][07928] Num frames 8000...
[2023-02-23 10:40:34,635][07928] Num frames 8100...
[2023-02-23 10:40:34,752][07928] Num frames 8200...
[2023-02-23 10:40:34,867][07928] Num frames 8300...
[2023-02-23 10:40:34,984][07928] Num frames 8400...
[2023-02-23 10:40:35,101][07928] Num frames 8500...
[2023-02-23 10:40:35,217][07928] Num frames 8600...
[2023-02-23 10:40:35,329][07928] Num frames 8700...
[2023-02-23 10:40:35,448][07928] Num frames 8800...
[2023-02-23 10:40:35,561][07928] Num frames 8900...
[2023-02-23 10:40:35,674][07928] Num frames 9000...
[2023-02-23 10:40:35,789][07928] Num frames 9100...
[2023-02-23 10:40:35,900][07928] Num frames 9200...
[2023-02-23 10:40:36,009][07928] Num frames 9300...
[2023-02-23 10:40:36,118][07928] Num frames 9400...
[2023-02-23 10:40:36,233][07928] Num frames 9500...
[2023-02-23 10:40:36,345][07928] Num frames 9600...
[2023-02-23 10:40:36,457][07928] Num frames 9700...
[2023-02-23 10:40:36,616][07928] Avg episode rewards: #0: 44.988, true rewards: #0: 16.322
[2023-02-23 10:40:36,617][07928] Avg episode reward: 44.988, avg true_objective: 16.322
[2023-02-23 10:40:36,626][07928] Num frames 9800...
[2023-02-23 10:40:36,737][07928] Num frames 9900...
[2023-02-23 10:40:36,842][07928] Num frames 10000...
[2023-02-23 10:40:36,949][07928] Num frames 10100...
[2023-02-23 10:40:37,057][07928] Num frames 10200...
[2023-02-23 10:40:37,166][07928] Num frames 10300...
[2023-02-23 10:40:37,274][07928] Num frames 10400...
[2023-02-23 10:40:37,380][07928] Num frames 10500...
[2023-02-23 10:40:37,490][07928] Num frames 10600...
[2023-02-23 10:40:37,602][07928] Num frames 10700...
[2023-02-23 10:40:37,710][07928] Num frames 10800...
[2023-02-23 10:40:37,818][07928] Num frames 10900...
[2023-02-23 10:40:37,928][07928] Num frames 11000...
[2023-02-23 10:40:38,040][07928] Num frames 11100...
[2023-02-23 10:40:38,152][07928] Num frames 11200...
[2023-02-23 10:40:38,263][07928] Num frames 11300...
[2023-02-23 10:40:38,359][07928] Avg episode rewards: #0: 44.479, true rewards: #0: 16.194
[2023-02-23 10:40:38,361][07928] Avg episode reward: 44.479, avg true_objective: 16.194
[2023-02-23 10:40:38,434][07928] Num frames 11400...
[2023-02-23 10:40:38,547][07928] Num frames 11500...
[2023-02-23 10:40:38,660][07928] Num frames 11600...
[2023-02-23 10:40:38,771][07928] Num frames 11700...
[2023-02-23 10:40:38,882][07928] Num frames 11800...
[2023-02-23 10:40:38,988][07928] Avg episode rewards: #0: 40.185, true rewards: #0: 14.810
[2023-02-23 10:40:38,990][07928] Avg episode reward: 40.185, avg true_objective: 14.810
[2023-02-23 10:40:39,049][07928] Num frames 11900...
[2023-02-23 10:40:39,159][07928] Num frames 12000...
[2023-02-23 10:40:39,272][07928] Num frames 12100...
[2023-02-23 10:40:39,383][07928] Num frames 12200...
[2023-02-23 10:40:39,491][07928] Num frames 12300...
[2023-02-23 10:40:39,600][07928] Num frames 12400...
[2023-02-23 10:40:39,713][07928] Num frames 12500...
[2023-02-23 10:40:39,846][07928] Avg episode rewards: #0: 37.632, true rewards: #0: 13.966
[2023-02-23 10:40:39,848][07928] Avg episode reward: 37.632, avg true_objective: 13.966
[2023-02-23 10:40:39,884][07928] Num frames 12600...
[2023-02-23 10:40:39,994][07928] Num frames 12700...
[2023-02-23 10:40:40,102][07928] Num frames 12800...
[2023-02-23 10:40:40,211][07928] Num frames 12900...
[2023-02-23 10:40:40,324][07928] Num frames 13000...
[2023-02-23 10:40:40,435][07928] Num frames 13100...
[2023-02-23 10:40:40,545][07928] Num frames 13200...
[2023-02-23 10:40:40,659][07928] Num frames 13300...
[2023-02-23 10:40:40,771][07928] Num frames 13400...
[2023-02-23 10:40:40,883][07928] Num frames 13500...
[2023-02-23 10:40:40,997][07928] Num frames 13600...
[2023-02-23 10:40:41,110][07928] Num frames 13700...
[2023-02-23 10:40:41,221][07928] Num frames 13800...
[2023-02-23 10:40:41,351][07928] Num frames 13900...
[2023-02-23 10:40:41,463][07928] Num frames 14000...
[2023-02-23 10:40:41,576][07928] Num frames 14100...
[2023-02-23 10:40:41,693][07928] Num frames 14200...
[2023-02-23 10:40:41,808][07928] Num frames 14300...
[2023-02-23 10:40:41,919][07928] Num frames 14400...
[2023-02-23 10:40:42,031][07928] Num frames 14500...
[2023-02-23 10:40:42,142][07928] Num frames 14600...
[2023-02-23 10:40:42,279][07928] Avg episode rewards: #0: 38.968, true rewards: #0: 14.669
[2023-02-23 10:40:42,281][07928] Avg episode reward: 38.968, avg true_objective: 14.669
[2023-02-23 10:41:16,530][07928] Replay video saved to /content/train_dir/default_experiment/replay.mp4!