[2023-02-26 17:25:35,004][00949] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 17:25:35,011][00949] Rollout worker 0 uses device cpu
[2023-02-26 17:25:35,013][00949] Rollout worker 1 uses device cpu
[2023-02-26 17:25:35,014][00949] Rollout worker 2 uses device cpu
[2023-02-26 17:25:35,015][00949] Rollout worker 3 uses device cpu
[2023-02-26 17:25:35,017][00949] Rollout worker 4 uses device cpu
[2023-02-26 17:25:35,018][00949] Rollout worker 5 uses device cpu
[2023-02-26 17:25:35,019][00949] Rollout worker 6 uses device cpu
[2023-02-26 17:25:35,020][00949] Rollout worker 7 uses device cpu
[2023-02-26 17:25:35,213][00949] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:25:35,215][00949] InferenceWorker_p0-w0: min num requests: 2
[2023-02-26 17:25:35,245][00949] Starting all processes...
[2023-02-26 17:25:35,248][00949] Starting process learner_proc0
[2023-02-26 17:25:35,304][00949] Starting all processes...
[2023-02-26 17:25:35,316][00949] Starting process inference_proc0-0
[2023-02-26 17:25:35,317][00949] Starting process rollout_proc0
[2023-02-26 17:25:35,320][00949] Starting process rollout_proc1
[2023-02-26 17:25:35,320][00949] Starting process rollout_proc2
[2023-02-26 17:25:35,320][00949] Starting process rollout_proc3
[2023-02-26 17:25:35,320][00949] Starting process rollout_proc4
[2023-02-26 17:25:35,320][00949] Starting process rollout_proc5
[2023-02-26 17:25:35,321][00949] Starting process rollout_proc6
[2023-02-26 17:25:35,321][00949] Starting process rollout_proc7
[2023-02-26 17:25:45,593][11517] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:25:45,601][11517] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-26 17:25:46,032][11538] Worker 6 uses CPU cores [0]
[2023-02-26 17:25:46,155][11531] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:25:46,158][11531] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-26 17:25:46,266][11536] Worker 5 uses CPU cores [1]
[2023-02-26 17:25:46,344][11537] Worker 4 uses CPU cores [0]
[2023-02-26 17:25:46,376][11533] Worker 0 uses CPU cores [0]
[2023-02-26 17:25:46,412][11532] Worker 1 uses CPU cores [1]
[2023-02-26 17:25:46,498][11535] Worker 3 uses CPU cores [1]
[2023-02-26 17:25:46,532][11539] Worker 7 uses CPU cores [1]
[2023-02-26 17:25:46,544][11534] Worker 2 uses CPU cores [0]
[2023-02-26 17:25:46,864][11531] Num visible devices: 1
[2023-02-26 17:25:46,865][11517] Num visible devices: 1
[2023-02-26 17:25:46,876][11517] Starting seed is not provided
[2023-02-26 17:25:46,876][11517] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:25:46,877][11517] Initializing actor-critic model on device cuda:0
[2023-02-26 17:25:46,878][11517] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:25:46,881][11517] RunningMeanStd input shape: (1,)
[2023-02-26 17:25:46,901][11517] ConvEncoder: input_channels=3
[2023-02-26 17:25:47,245][11517] Conv encoder output size: 512
[2023-02-26 17:25:47,245][11517] Policy head output size: 512
[2023-02-26 17:25:47,305][11517] Created Actor Critic model with architecture:
[2023-02-26 17:25:47,305][11517] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-26 17:25:54,835][11517] Using optimizer
[2023-02-26 17:25:54,838][11517] No checkpoints found
[2023-02-26 17:25:54,838][11517] Did not load from checkpoint, starting from scratch!
[2023-02-26 17:25:54,839][11517] Initialized policy 0 weights for model version 0
[2023-02-26 17:25:54,850][11517] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 17:25:54,857][11517] LearnerWorker_p0 finished initialization!
[2023-02-26 17:25:55,050][11531] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 17:25:55,052][11531] RunningMeanStd input shape: (1,)
[2023-02-26 17:25:55,064][11531] ConvEncoder: input_channels=3
[2023-02-26 17:25:55,163][11531] Conv encoder output size: 512
[2023-02-26 17:25:55,163][11531] Policy head output size: 512
[2023-02-26 17:25:55,206][00949] Heartbeat connected on Batcher_0
[2023-02-26 17:25:55,214][00949] Heartbeat connected on LearnerWorker_p0
[2023-02-26 17:25:55,223][00949] Heartbeat connected on RolloutWorker_w0
[2023-02-26 17:25:55,227][00949] Heartbeat connected on RolloutWorker_w1
[2023-02-26 17:25:55,230][00949] Heartbeat connected on RolloutWorker_w2
[2023-02-26 17:25:55,236][00949] Heartbeat connected on RolloutWorker_w3
[2023-02-26 17:25:55,237][00949] Heartbeat connected on RolloutWorker_w4
[2023-02-26 17:25:55,241][00949] Heartbeat connected on RolloutWorker_w5
[2023-02-26 17:25:55,244][00949] Heartbeat connected on RolloutWorker_w6
[2023-02-26 17:25:55,248][00949] Heartbeat connected on RolloutWorker_w7
[2023-02-26 17:25:57,453][00949] Inference worker 0-0 is ready!
[2023-02-26 17:25:57,456][00949] All inference workers are ready! Signal rollout workers to start!
[2023-02-26 17:25:57,485][00949] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-26 17:25:57,670][11532] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,665][11535] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,671][11539] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,668][11536] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,717][11538] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,725][11534] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,726][11533] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,731][11537] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 17:25:57,951][11533] VizDoom game.init() threw an exception ViZDoomUnexpectedExitException('Controlled ViZDoom instance exited unexpectedly.'). Terminate process...
[2023-02-26 17:25:57,954][11533] EvtLoop [rollout_proc0_evt_loop, process=rollout_proc0] unhandled exception in slot='init' connected to emitter=Emitter(object_id='Sampler', signal_name='_inference_workers_initialized'), args=()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 228, in _game_init
    self.game.init()
vizdoom.vizdoom.ViZDoomUnexpectedExitException: Controlled ViZDoom instance exited unexpectedly.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 150, in init
    env_runner.init(self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 418, in init
    self._reset()
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 430, in _reset
    observations, info = e.reset(seed=seed) # new way of doing seeding since Gym 0.26.0
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 125, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 110, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 30, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 379, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 84, in reset
    obs, info = self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 323, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 51, in reset
    return self.env.reset(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 323, in reset
    self._ensure_initialized()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 274, in _ensure_initialized
    self.initialize()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 269, in initialize
    self._game_init()
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 244, in _game_init
    raise EnvCriticalError()
sample_factory.envs.env_utils.EnvCriticalError
[2023-02-26 17:25:57,956][11533] Unhandled exception in evt loop rollout_proc0_evt_loop
[2023-02-26 17:25:58,917][11537] Decorrelating experience for 0 frames...
[2023-02-26 17:25:59,586][11539] Decorrelating experience for 0 frames...
[2023-02-26 17:25:59,588][11536] Decorrelating experience for 0 frames...
[2023-02-26 17:25:59,590][11535] Decorrelating experience for 0 frames...
[2023-02-26 17:25:59,594][11532] Decorrelating experience for 0 frames...
[2023-02-26 17:25:59,840][00949] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 17:25:59,904][11537] Decorrelating experience for 32 frames...
[2023-02-26 17:26:00,411][11532] Decorrelating experience for 32 frames...
[2023-02-26 17:26:00,821][11538] Decorrelating experience for 0 frames...
[2023-02-26 17:26:00,872][11534] Decorrelating experience for 0 frames...
[2023-02-26 17:26:01,437][11537] Decorrelating experience for 64 frames...
[2023-02-26 17:26:01,911][11539] Decorrelating experience for 32 frames...
[2023-02-26 17:26:01,994][11532] Decorrelating experience for 64 frames...
[2023-02-26 17:26:02,223][11538] Decorrelating experience for 32 frames...
[2023-02-26 17:26:02,294][11534] Decorrelating experience for 32 frames...
[2023-02-26 17:26:02,513][11536] Decorrelating experience for 32 frames...
[2023-02-26 17:26:02,958][11537] Decorrelating experience for 96 frames...
[2023-02-26 17:26:03,483][11539] Decorrelating experience for 64 frames...
[2023-02-26 17:26:03,489][11532] Decorrelating experience for 96 frames... [2023-02-26 17:26:03,741][11534] Decorrelating experience for 64 frames... [2023-02-26 17:26:04,370][11539] Decorrelating experience for 96 frames... [2023-02-26 17:26:04,511][11535] Decorrelating experience for 32 frames... [2023-02-26 17:26:04,685][11538] Decorrelating experience for 64 frames... [2023-02-26 17:26:04,840][00949] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 17:26:05,209][11535] Decorrelating experience for 64 frames... [2023-02-26 17:26:05,296][11534] Decorrelating experience for 96 frames... [2023-02-26 17:26:05,365][11536] Decorrelating experience for 64 frames... [2023-02-26 17:26:05,646][11538] Decorrelating experience for 96 frames... [2023-02-26 17:26:06,013][11536] Decorrelating experience for 96 frames... [2023-02-26 17:26:06,199][11535] Decorrelating experience for 96 frames... [2023-02-26 17:26:09,840][00949] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 218.4. Samples: 2184. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 17:26:09,842][00949] Avg episode reward: [(0, '2.212')] [2023-02-26 17:26:10,131][11517] Signal inference workers to stop experience collection... [2023-02-26 17:26:10,151][11531] InferenceWorker_p0-w0: stopping experience collection [2023-02-26 17:26:12,824][11517] Signal inference workers to resume experience collection... [2023-02-26 17:26:12,826][11531] InferenceWorker_p0-w0: resuming experience collection [2023-02-26 17:26:14,840][00949] Fps is (10 sec: 1228.8, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 12288. Throughput: 0: 180.1. Samples: 2702. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-26 17:26:14,849][00949] Avg episode reward: [(0, '3.019')] [2023-02-26 17:26:19,845][00949] Fps is (10 sec: 2865.6, 60 sec: 1433.2, 300 sec: 1433.2). 
Total num frames: 28672. Throughput: 0: 319.9. Samples: 6400. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-26 17:26:19,850][00949] Avg episode reward: [(0, '3.624')] [2023-02-26 17:26:23,852][11531] Updated weights for policy 0, policy_version 10 (0.0024) [2023-02-26 17:26:24,840][00949] Fps is (10 sec: 2867.2, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 40960. Throughput: 0: 451.0. Samples: 11274. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:26:24,847][00949] Avg episode reward: [(0, '4.195')] [2023-02-26 17:26:29,840][00949] Fps is (10 sec: 3688.5, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 65536. Throughput: 0: 491.2. Samples: 14736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-26 17:26:29,848][00949] Avg episode reward: [(0, '4.226')] [2023-02-26 17:26:32,805][11531] Updated weights for policy 0, policy_version 20 (0.0021) [2023-02-26 17:26:34,844][00949] Fps is (10 sec: 4503.5, 60 sec: 2457.3, 300 sec: 2457.3). Total num frames: 86016. Throughput: 0: 620.5. Samples: 21722. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:26:34,849][00949] Avg episode reward: [(0, '4.334')] [2023-02-26 17:26:39,854][00949] Fps is (10 sec: 3681.0, 60 sec: 2559.1, 300 sec: 2559.1). Total num frames: 102400. Throughput: 0: 654.1. Samples: 26172. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:26:39,864][00949] Avg episode reward: [(0, '4.355')] [2023-02-26 17:26:39,877][11517] Saving new best policy, reward=4.355! [2023-02-26 17:26:44,840][00949] Fps is (10 sec: 3278.3, 60 sec: 2639.6, 300 sec: 2639.6). Total num frames: 118784. Throughput: 0: 630.1. Samples: 28356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:26:44,843][00949] Avg episode reward: [(0, '4.272')] [2023-02-26 17:26:44,907][11531] Updated weights for policy 0, policy_version 30 (0.0031) [2023-02-26 17:26:49,840][00949] Fps is (10 sec: 4102.0, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 143360. Throughput: 0: 786.0. 
Samples: 35372. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:26:49,846][00949] Avg episode reward: [(0, '4.308')] [2023-02-26 17:26:54,090][11531] Updated weights for policy 0, policy_version 40 (0.0012) [2023-02-26 17:26:54,849][00949] Fps is (10 sec: 4501.5, 60 sec: 2978.4, 300 sec: 2978.4). Total num frames: 163840. Throughput: 0: 876.3. Samples: 41624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:26:54,859][00949] Avg episode reward: [(0, '4.482')] [2023-02-26 17:26:54,861][11517] Saving new best policy, reward=4.482! [2023-02-26 17:26:59,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3003.7, 300 sec: 3003.7). Total num frames: 180224. Throughput: 0: 912.8. Samples: 43776. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-26 17:26:59,847][00949] Avg episode reward: [(0, '4.627')] [2023-02-26 17:26:59,856][11517] Saving new best policy, reward=4.627! [2023-02-26 17:27:04,840][00949] Fps is (10 sec: 3279.8, 60 sec: 3276.8, 300 sec: 3024.7). Total num frames: 196608. Throughput: 0: 942.4. Samples: 48802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:27:04,843][00949] Avg episode reward: [(0, '4.411')] [2023-02-26 17:27:05,817][11531] Updated weights for policy 0, policy_version 50 (0.0012) [2023-02-26 17:27:09,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3159.8). Total num frames: 221184. Throughput: 0: 991.2. Samples: 55876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:27:09,843][00949] Avg episode reward: [(0, '4.376')] [2023-02-26 17:27:14,844][00949] Fps is (10 sec: 4503.5, 60 sec: 3822.6, 300 sec: 3222.0). Total num frames: 241664. Throughput: 0: 987.1. Samples: 59158. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:27:14,847][00949] Avg episode reward: [(0, '4.375')] [2023-02-26 17:27:16,437][11531] Updated weights for policy 0, policy_version 60 (0.0012) [2023-02-26 17:27:19,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3755.0, 300 sec: 3174.4). 
Total num frames: 253952. Throughput: 0: 930.0. Samples: 63566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:27:19,844][00949] Avg episode reward: [(0, '4.486')] [2023-02-26 17:27:24,840][00949] Fps is (10 sec: 3278.3, 60 sec: 3891.2, 300 sec: 3228.6). Total num frames: 274432. Throughput: 0: 957.2. Samples: 69230. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:27:24,847][00949] Avg episode reward: [(0, '4.588')] [2023-02-26 17:27:26,941][11531] Updated weights for policy 0, policy_version 70 (0.0018) [2023-02-26 17:27:29,842][00949] Fps is (10 sec: 4504.5, 60 sec: 3891.0, 300 sec: 3322.2). Total num frames: 299008. Throughput: 0: 987.4. Samples: 72792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:27:29,848][00949] Avg episode reward: [(0, '4.335')] [2023-02-26 17:27:29,859][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000073_299008.pth... [2023-02-26 17:27:34,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3823.2, 300 sec: 3319.9). Total num frames: 315392. Throughput: 0: 968.4. Samples: 78948. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:27:34,849][00949] Avg episode reward: [(0, '4.470')] [2023-02-26 17:27:38,025][11531] Updated weights for policy 0, policy_version 80 (0.0011) [2023-02-26 17:27:39,840][00949] Fps is (10 sec: 3277.6, 60 sec: 3823.9, 300 sec: 3317.8). Total num frames: 331776. Throughput: 0: 929.0. Samples: 83422. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:27:39,845][00949] Avg episode reward: [(0, '4.382')] [2023-02-26 17:27:44,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3354.8). Total num frames: 352256. Throughput: 0: 946.9. Samples: 86386. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:27:44,845][00949] Avg episode reward: [(0, '4.358')] [2023-02-26 17:27:47,860][11531] Updated weights for policy 0, policy_version 90 (0.0017) [2023-02-26 17:27:49,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3425.7). Total num frames: 376832. Throughput: 0: 992.7. Samples: 93472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:27:49,842][00949] Avg episode reward: [(0, '4.306')] [2023-02-26 17:27:54,840][00949] Fps is (10 sec: 4095.8, 60 sec: 3823.5, 300 sec: 3419.3). Total num frames: 393216. Throughput: 0: 958.7. Samples: 99016. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:27:54,845][00949] Avg episode reward: [(0, '4.435')] [2023-02-26 17:27:59,618][11531] Updated weights for policy 0, policy_version 100 (0.0047) [2023-02-26 17:27:59,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3413.3). Total num frames: 409600. Throughput: 0: 933.9. Samples: 101178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:27:59,845][00949] Avg episode reward: [(0, '4.556')] [2023-02-26 17:28:04,840][00949] Fps is (10 sec: 3686.6, 60 sec: 3891.2, 300 sec: 3440.6). Total num frames: 430080. Throughput: 0: 966.5. Samples: 107060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:04,842][00949] Avg episode reward: [(0, '4.508')] [2023-02-26 17:28:08,512][11531] Updated weights for policy 0, policy_version 110 (0.0022) [2023-02-26 17:28:09,841][00949] Fps is (10 sec: 4505.1, 60 sec: 3891.1, 300 sec: 3497.3). Total num frames: 454656. Throughput: 0: 995.0. Samples: 114004. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:28:09,846][00949] Avg episode reward: [(0, '4.402')] [2023-02-26 17:28:14,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3823.2, 300 sec: 3489.2). Total num frames: 471040. Throughput: 0: 973.0. Samples: 116576. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:14,844][00949] Avg episode reward: [(0, '4.459')] [2023-02-26 17:28:19,840][00949] Fps is (10 sec: 2867.5, 60 sec: 3822.9, 300 sec: 3452.3). Total num frames: 483328. Throughput: 0: 934.1. Samples: 120984. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-26 17:28:19,842][00949] Avg episode reward: [(0, '4.448')] [2023-02-26 17:28:21,054][11531] Updated weights for policy 0, policy_version 120 (0.0023) [2023-02-26 17:28:24,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3502.8). Total num frames: 507904. Throughput: 0: 976.1. Samples: 127346. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:28:24,843][00949] Avg episode reward: [(0, '4.659')] [2023-02-26 17:28:24,854][11517] Saving new best policy, reward=4.659! [2023-02-26 17:28:29,842][00949] Fps is (10 sec: 4504.7, 60 sec: 3823.0, 300 sec: 3522.5). Total num frames: 528384. Throughput: 0: 985.1. Samples: 130718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:29,846][00949] Avg episode reward: [(0, '4.524')] [2023-02-26 17:28:30,205][11531] Updated weights for policy 0, policy_version 130 (0.0014) [2023-02-26 17:28:34,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3514.6). Total num frames: 544768. Throughput: 0: 946.8. Samples: 136078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:34,845][00949] Avg episode reward: [(0, '4.492')] [2023-02-26 17:28:39,840][00949] Fps is (10 sec: 3277.4, 60 sec: 3822.9, 300 sec: 3507.2). Total num frames: 561152. Throughput: 0: 927.6. Samples: 140758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:39,843][00949] Avg episode reward: [(0, '4.586')] [2023-02-26 17:28:42,193][11531] Updated weights for policy 0, policy_version 140 (0.0022) [2023-02-26 17:28:44,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3525.0). Total num frames: 581632. Throughput: 0: 956.2. Samples: 144208. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:28:44,842][00949] Avg episode reward: [(0, '4.459')] [2023-02-26 17:28:49,845][00949] Fps is (10 sec: 4503.1, 60 sec: 3822.6, 300 sec: 3565.8). Total num frames: 606208. Throughput: 0: 980.9. Samples: 151206. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:28:49,851][00949] Avg episode reward: [(0, '4.549')] [2023-02-26 17:28:51,949][11531] Updated weights for policy 0, policy_version 150 (0.0015) [2023-02-26 17:28:54,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3557.7). Total num frames: 622592. Throughput: 0: 933.7. Samples: 156020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:28:54,844][00949] Avg episode reward: [(0, '4.459')] [2023-02-26 17:28:59,840][00949] Fps is (10 sec: 3278.6, 60 sec: 3822.9, 300 sec: 3549.9). Total num frames: 638976. Throughput: 0: 929.0. Samples: 158382. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:28:59,843][00949] Avg episode reward: [(0, '4.461')] [2023-02-26 17:29:02,724][11531] Updated weights for policy 0, policy_version 160 (0.0014) [2023-02-26 17:29:04,841][00949] Fps is (10 sec: 4095.3, 60 sec: 3891.1, 300 sec: 3586.7). Total num frames: 663552. Throughput: 0: 978.7. Samples: 165028. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:29:04,844][00949] Avg episode reward: [(0, '4.461')] [2023-02-26 17:29:09,840][00949] Fps is (10 sec: 4505.5, 60 sec: 3823.0, 300 sec: 3600.2). Total num frames: 684032. Throughput: 0: 982.8. Samples: 171574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:29:09,843][00949] Avg episode reward: [(0, '4.617')] [2023-02-26 17:29:13,402][11531] Updated weights for policy 0, policy_version 170 (0.0012) [2023-02-26 17:29:14,842][00949] Fps is (10 sec: 3276.8, 60 sec: 3754.6, 300 sec: 3570.8). Total num frames: 696320. Throughput: 0: 957.4. Samples: 173800. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 17:29:14,844][00949] Avg episode reward: [(0, '4.814')] [2023-02-26 17:29:14,939][11517] Saving new best policy, reward=4.814! [2023-02-26 17:29:19,840][00949] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3584.0). Total num frames: 716800. Throughput: 0: 941.7. Samples: 178454. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:29:19,842][00949] Avg episode reward: [(0, '4.713')] [2023-02-26 17:29:23,916][11531] Updated weights for policy 0, policy_version 180 (0.0013) [2023-02-26 17:29:24,840][00949] Fps is (10 sec: 4096.7, 60 sec: 3822.9, 300 sec: 3596.5). Total num frames: 737280. Throughput: 0: 993.0. Samples: 185442. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:29:24,843][00949] Avg episode reward: [(0, '4.611')] [2023-02-26 17:29:29,845][00949] Fps is (10 sec: 4093.7, 60 sec: 3822.7, 300 sec: 3608.3). Total num frames: 757760. Throughput: 0: 992.9. Samples: 188896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:29:29,848][00949] Avg episode reward: [(0, '4.724')] [2023-02-26 17:29:29,856][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000185_757760.pth... [2023-02-26 17:29:34,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3600.7). Total num frames: 774144. Throughput: 0: 936.1. Samples: 193326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:29:34,842][00949] Avg episode reward: [(0, '4.723')] [2023-02-26 17:29:35,947][11531] Updated weights for policy 0, policy_version 190 (0.0051) [2023-02-26 17:29:39,840][00949] Fps is (10 sec: 3688.5, 60 sec: 3891.2, 300 sec: 3611.9). Total num frames: 794624. Throughput: 0: 952.0. Samples: 198858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:29:39,842][00949] Avg episode reward: [(0, '4.777')] [2023-02-26 17:29:44,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3622.7). Total num frames: 815104. Throughput: 0: 975.6. Samples: 202284. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:29:44,846][00949] Avg episode reward: [(0, '4.816')] [2023-02-26 17:29:44,849][11517] Saving new best policy, reward=4.816! [2023-02-26 17:29:45,075][11531] Updated weights for policy 0, policy_version 200 (0.0017) [2023-02-26 17:29:49,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3633.0). Total num frames: 835584. Throughput: 0: 965.6. Samples: 208478. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:29:49,856][00949] Avg episode reward: [(0, '4.646')] [2023-02-26 17:29:54,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3608.0). Total num frames: 847872. Throughput: 0: 919.4. Samples: 212948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:29:54,843][00949] Avg episode reward: [(0, '4.628')] [2023-02-26 17:29:57,429][11531] Updated weights for policy 0, policy_version 210 (0.0020) [2023-02-26 17:29:59,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3618.1). Total num frames: 868352. Throughput: 0: 929.2. Samples: 215614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:29:59,845][00949] Avg episode reward: [(0, '4.757')] [2023-02-26 17:30:04,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3644.6). Total num frames: 892928. Throughput: 0: 981.8. Samples: 222634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:30:04,842][00949] Avg episode reward: [(0, '4.826')] [2023-02-26 17:30:04,846][11517] Saving new best policy, reward=4.826! [2023-02-26 17:30:06,116][11531] Updated weights for policy 0, policy_version 220 (0.0013) [2023-02-26 17:30:09,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3637.2). Total num frames: 909312. Throughput: 0: 952.1. Samples: 228288. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:30:09,846][00949] Avg episode reward: [(0, '4.950')] [2023-02-26 17:30:09,855][11517] Saving new best policy, reward=4.950! 
[2023-02-26 17:30:14,843][00949] Fps is (10 sec: 3275.6, 60 sec: 3822.8, 300 sec: 3630.1). Total num frames: 925696. Throughput: 0: 924.4. Samples: 230494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:30:14,846][00949] Avg episode reward: [(0, '4.905')] [2023-02-26 17:30:18,237][11531] Updated weights for policy 0, policy_version 230 (0.0014) [2023-02-26 17:30:19,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3639.1). Total num frames: 946176. Throughput: 0: 953.0. Samples: 236210. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:30:19,845][00949] Avg episode reward: [(0, '5.068')] [2023-02-26 17:30:19,930][11517] Saving new best policy, reward=5.068! [2023-02-26 17:30:24,840][00949] Fps is (10 sec: 4507.3, 60 sec: 3891.2, 300 sec: 3663.2). Total num frames: 970752. Throughput: 0: 985.4. Samples: 243202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:30:24,846][00949] Avg episode reward: [(0, '5.326')] [2023-02-26 17:30:24,850][11517] Saving new best policy, reward=5.326! [2023-02-26 17:30:27,827][11531] Updated weights for policy 0, policy_version 240 (0.0017) [2023-02-26 17:30:29,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3656.1). Total num frames: 987136. Throughput: 0: 971.3. Samples: 245994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:30:29,844][00949] Avg episode reward: [(0, '5.178')] [2023-02-26 17:30:34,840][00949] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3649.2). Total num frames: 1003520. Throughput: 0: 931.6. Samples: 250398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:30:34,843][00949] Avg episode reward: [(0, '5.104')] [2023-02-26 17:30:39,246][11531] Updated weights for policy 0, policy_version 250 (0.0013) [2023-02-26 17:30:39,840][00949] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3657.1). Total num frames: 1024000. Throughput: 0: 974.7. Samples: 256810. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:30:39,846][00949] Avg episode reward: [(0, '5.051')] [2023-02-26 17:30:44,840][00949] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3679.2). Total num frames: 1048576. Throughput: 0: 994.5. Samples: 260366. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:30:44,847][00949] Avg episode reward: [(0, '5.169')] [2023-02-26 17:30:49,470][11531] Updated weights for policy 0, policy_version 260 (0.0027) [2023-02-26 17:30:49,840][00949] Fps is (10 sec: 4095.8, 60 sec: 3822.9, 300 sec: 3672.3). Total num frames: 1064960. Throughput: 0: 963.9. Samples: 266012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:30:49,843][00949] Avg episode reward: [(0, '5.082')] [2023-02-26 17:30:54,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 1081344. Throughput: 0: 942.8. Samples: 270714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:30:54,843][00949] Avg episode reward: [(0, '5.377')] [2023-02-26 17:30:54,853][11517] Saving new best policy, reward=5.377! [2023-02-26 17:30:59,840][00949] Fps is (10 sec: 3686.6, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 1101824. Throughput: 0: 970.6. Samples: 274166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:30:59,842][00949] Avg episode reward: [(0, '5.870')] [2023-02-26 17:30:59,896][11517] Saving new best policy, reward=5.870! [2023-02-26 17:30:59,905][11531] Updated weights for policy 0, policy_version 270 (0.0012) [2023-02-26 17:31:04,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1126400. Throughput: 0: 996.2. Samples: 281040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:31:04,842][00949] Avg episode reward: [(0, '5.996')] [2023-02-26 17:31:04,850][11517] Saving new best policy, reward=5.996! [2023-02-26 17:31:09,841][00949] Fps is (10 sec: 4095.7, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 1142784. 
Throughput: 0: 949.5. Samples: 285930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:31:09,847][00949] Avg episode reward: [(0, '6.725')] [2023-02-26 17:31:09,869][11517] Saving new best policy, reward=6.725! [2023-02-26 17:31:11,142][11531] Updated weights for policy 0, policy_version 280 (0.0012) [2023-02-26 17:31:14,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3823.2, 300 sec: 3818.4). Total num frames: 1155072. Throughput: 0: 932.9. Samples: 287974. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:31:14,847][00949] Avg episode reward: [(0, '6.453')] [2023-02-26 17:31:19,840][00949] Fps is (10 sec: 2457.8, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 1167360. Throughput: 0: 926.4. Samples: 292084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:31:19,842][00949] Avg episode reward: [(0, '6.546')] [2023-02-26 17:31:24,296][11531] Updated weights for policy 0, policy_version 290 (0.0022) [2023-02-26 17:31:24,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1187840. Throughput: 0: 908.3. Samples: 297684. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:31:24,846][00949] Avg episode reward: [(0, '6.347')] [2023-02-26 17:31:29,840][00949] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3790.6). Total num frames: 1204224. Throughput: 0: 884.8. Samples: 300184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:31:29,848][00949] Avg episode reward: [(0, '6.591')] [2023-02-26 17:31:29,861][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000294_1204224.pth... [2023-02-26 17:31:30,001][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000073_299008.pth [2023-02-26 17:31:34,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.7). Total num frames: 1220608. Throughput: 0: 857.6. Samples: 304604. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:31:34,842][00949] Avg episode reward: [(0, '7.059')] [2023-02-26 17:31:34,848][11517] Saving new best policy, reward=7.059! [2023-02-26 17:31:36,557][11531] Updated weights for policy 0, policy_version 300 (0.0034) [2023-02-26 17:31:39,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1241088. Throughput: 0: 901.9. Samples: 311298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:31:39,844][00949] Avg episode reward: [(0, '7.390')] [2023-02-26 17:31:39,878][11517] Saving new best policy, reward=7.390! [2023-02-26 17:31:44,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1265664. Throughput: 0: 901.6. Samples: 314740. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:31:44,842][00949] Avg episode reward: [(0, '7.702')] [2023-02-26 17:31:44,845][11517] Saving new best policy, reward=7.702! [2023-02-26 17:31:45,850][11531] Updated weights for policy 0, policy_version 310 (0.0024) [2023-02-26 17:31:49,841][00949] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3776.7). Total num frames: 1277952. Throughput: 0: 864.3. Samples: 319936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:31:49,844][00949] Avg episode reward: [(0, '8.098')] [2023-02-26 17:31:49,916][11517] Saving new best policy, reward=8.098! [2023-02-26 17:31:54,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 1298432. Throughput: 0: 864.1. Samples: 324816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:31:54,844][00949] Avg episode reward: [(0, '7.590')] [2023-02-26 17:31:57,294][11531] Updated weights for policy 0, policy_version 320 (0.0016) [2023-02-26 17:31:59,840][00949] Fps is (10 sec: 4096.7, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1318912. Throughput: 0: 894.9. Samples: 328244. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:31:59,842][00949] Avg episode reward: [(0, '7.440')] [2023-02-26 17:32:04,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1343488. Throughput: 0: 961.4. Samples: 335346. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:32:04,843][00949] Avg episode reward: [(0, '7.782')] [2023-02-26 17:32:07,578][11531] Updated weights for policy 0, policy_version 330 (0.0014) [2023-02-26 17:32:09,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 1355776. Throughput: 0: 937.3. Samples: 339862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:32:09,845][00949] Avg episode reward: [(0, '8.119')] [2023-02-26 17:32:09,858][11517] Saving new best policy, reward=8.119! [2023-02-26 17:32:14,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1376256. Throughput: 0: 932.1. Samples: 342128. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:32:14,848][00949] Avg episode reward: [(0, '8.345')] [2023-02-26 17:32:14,853][11517] Saving new best policy, reward=8.345! [2023-02-26 17:32:18,112][11531] Updated weights for policy 0, policy_version 340 (0.0021) [2023-02-26 17:32:19,840][00949] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 1396736. Throughput: 0: 985.9. Samples: 348972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:32:19,842][00949] Avg episode reward: [(0, '9.360')] [2023-02-26 17:32:19,861][11517] Saving new best policy, reward=9.360! [2023-02-26 17:32:24,840][00949] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 1417216. Throughput: 0: 974.5. Samples: 355150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:32:24,847][00949] Avg episode reward: [(0, '9.652')] [2023-02-26 17:32:24,853][11517] Saving new best policy, reward=9.652! 
[2023-02-26 17:32:29,565][11531] Updated weights for policy 0, policy_version 350 (0.0014) [2023-02-26 17:32:29,840][00949] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1433600. Throughput: 0: 944.9. Samples: 357260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:32:29,842][00949] Avg episode reward: [(0, '10.009')] [2023-02-26 17:32:29,867][11517] Saving new best policy, reward=10.009! [2023-02-26 17:32:34,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1449984. Throughput: 0: 940.2. Samples: 362244. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:32:34,847][00949] Avg episode reward: [(0, '9.844')] [2023-02-26 17:32:39,413][11531] Updated weights for policy 0, policy_version 360 (0.0015) [2023-02-26 17:32:39,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1474560. Throughput: 0: 986.1. Samples: 369190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-26 17:32:39,848][00949] Avg episode reward: [(0, '10.459')] [2023-02-26 17:32:39,858][11517] Saving new best policy, reward=10.459! [2023-02-26 17:32:44,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1495040. Throughput: 0: 982.6. Samples: 372462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:32:44,846][00949] Avg episode reward: [(0, '11.138')] [2023-02-26 17:32:44,848][11517] Saving new best policy, reward=11.138! [2023-02-26 17:32:49,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3776.7). Total num frames: 1507328. Throughput: 0: 922.9. Samples: 376876. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:32:49,846][00949] Avg episode reward: [(0, '11.447')] [2023-02-26 17:32:49,863][11517] Saving new best policy, reward=11.447! 
[2023-02-26 17:32:51,764][11531] Updated weights for policy 0, policy_version 370 (0.0043) [2023-02-26 17:32:54,840][00949] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1527808. Throughput: 0: 950.1. Samples: 382618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:32:54,843][00949] Avg episode reward: [(0, '10.949')] [2023-02-26 17:32:59,840][00949] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1552384. Throughput: 0: 978.8. Samples: 386174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:32:59,847][00949] Avg episode reward: [(0, '10.504')] [2023-02-26 17:33:00,479][11531] Updated weights for policy 0, policy_version 380 (0.0016) [2023-02-26 17:33:04,844][00949] Fps is (10 sec: 4094.2, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 1568768. Throughput: 0: 963.2. Samples: 392322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:33:04,850][00949] Avg episode reward: [(0, '9.517')] [2023-02-26 17:33:09,840][00949] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 1585152. Throughput: 0: 926.4. Samples: 396840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:33:09,847][00949] Avg episode reward: [(0, '9.296')] [2023-02-26 17:33:12,522][11531] Updated weights for policy 0, policy_version 390 (0.0016) [2023-02-26 17:33:14,840][00949] Fps is (10 sec: 3688.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 1605632. Throughput: 0: 946.2. Samples: 399840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:33:14,842][00949] Avg episode reward: [(0, '9.862')] [2023-02-26 17:33:19,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1630208. Throughput: 0: 992.2. Samples: 406892. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:33:19,845][00949] Avg episode reward: [(0, '11.117')] [2023-02-26 17:33:21,408][11531] Updated weights for policy 0, policy_version 400 (0.0012) [2023-02-26 17:33:24,842][00949] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 1646592. Throughput: 0: 961.1. Samples: 412440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:33:24,845][00949] Avg episode reward: [(0, '11.678')] [2023-02-26 17:33:24,854][11517] Saving new best policy, reward=11.678! [2023-02-26 17:33:29,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1662976. Throughput: 0: 937.0. Samples: 414626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:33:29,843][00949] Avg episode reward: [(0, '11.986')] [2023-02-26 17:33:29,860][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000406_1662976.pth... [2023-02-26 17:33:29,986][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000185_757760.pth [2023-02-26 17:33:30,006][11517] Saving new best policy, reward=11.986! [2023-02-26 17:33:33,354][11531] Updated weights for policy 0, policy_version 410 (0.0014) [2023-02-26 17:33:34,840][00949] Fps is (10 sec: 3687.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1683456. Throughput: 0: 967.5. Samples: 420414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:33:34,843][00949] Avg episode reward: [(0, '11.554')] [2023-02-26 17:33:39,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1708032. Throughput: 0: 996.5. Samples: 427460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:33:39,842][00949] Avg episode reward: [(0, '11.617')] [2023-02-26 17:33:43,226][11531] Updated weights for policy 0, policy_version 420 (0.0016) [2023-02-26 17:33:44,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 1724416. 
Throughput: 0: 974.9. Samples: 430042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:33:44,845][00949] Avg episode reward: [(0, '12.304')] [2023-02-26 17:33:44,848][11517] Saving new best policy, reward=12.304! [2023-02-26 17:33:49,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1740800. Throughput: 0: 936.6. Samples: 434466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:33:49,847][00949] Avg episode reward: [(0, '12.738')] [2023-02-26 17:33:49,862][11517] Saving new best policy, reward=12.738! [2023-02-26 17:33:54,288][11531] Updated weights for policy 0, policy_version 430 (0.0012) [2023-02-26 17:33:54,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1761280. Throughput: 0: 982.3. Samples: 441044. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 17:33:54,847][00949] Avg episode reward: [(0, '14.204')] [2023-02-26 17:33:54,851][11517] Saving new best policy, reward=14.204! [2023-02-26 17:33:59,845][00949] Fps is (10 sec: 4503.1, 60 sec: 3890.9, 300 sec: 3804.4). Total num frames: 1785856. Throughput: 0: 992.4. Samples: 444502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:33:59,857][00949] Avg episode reward: [(0, '14.503')] [2023-02-26 17:33:59,866][11517] Saving new best policy, reward=14.503! [2023-02-26 17:34:04,840][00949] Fps is (10 sec: 3686.3, 60 sec: 3823.2, 300 sec: 3776.7). Total num frames: 1798144. Throughput: 0: 953.0. Samples: 449776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:34:04,846][00949] Avg episode reward: [(0, '14.731')] [2023-02-26 17:34:04,851][11517] Saving new best policy, reward=14.731! [2023-02-26 17:34:05,163][11531] Updated weights for policy 0, policy_version 440 (0.0037) [2023-02-26 17:34:09,840][00949] Fps is (10 sec: 2868.8, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 1814528. Throughput: 0: 935.1. Samples: 454520. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:34:09,842][00949] Avg episode reward: [(0, '15.163')] [2023-02-26 17:34:09,857][11517] Saving new best policy, reward=15.163! [2023-02-26 17:34:14,840][00949] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1839104. Throughput: 0: 963.1. Samples: 457964. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-26 17:34:14,843][00949] Avg episode reward: [(0, '13.990')] [2023-02-26 17:34:15,243][11531] Updated weights for policy 0, policy_version 450 (0.0012) [2023-02-26 17:34:19,841][00949] Fps is (10 sec: 4504.9, 60 sec: 3822.8, 300 sec: 3804.4). Total num frames: 1859584. Throughput: 0: 990.4. Samples: 464982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:34:19,849][00949] Avg episode reward: [(0, '14.341')] [2023-02-26 17:34:24,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.6). Total num frames: 1875968. Throughput: 0: 936.6. Samples: 469608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:34:24,842][00949] Avg episode reward: [(0, '14.149')] [2023-02-26 17:34:27,027][11531] Updated weights for policy 0, policy_version 460 (0.0011) [2023-02-26 17:34:29,840][00949] Fps is (10 sec: 3277.3, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1892352. Throughput: 0: 930.9. Samples: 471932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:34:29,842][00949] Avg episode reward: [(0, '14.242')] [2023-02-26 17:34:34,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1916928. Throughput: 0: 982.8. Samples: 478694. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:34:34,848][00949] Avg episode reward: [(0, '15.572')] [2023-02-26 17:34:34,853][11517] Saving new best policy, reward=15.572! [2023-02-26 17:34:36,126][11531] Updated weights for policy 0, policy_version 470 (0.0013) [2023-02-26 17:34:39,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). 
Total num frames: 1937408. Throughput: 0: 980.2. Samples: 485152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:34:39,842][00949] Avg episode reward: [(0, '16.546')] [2023-02-26 17:34:39,858][11517] Saving new best policy, reward=16.546! [2023-02-26 17:34:44,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1953792. Throughput: 0: 952.4. Samples: 487356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:34:44,842][00949] Avg episode reward: [(0, '18.461')] [2023-02-26 17:34:44,844][11517] Saving new best policy, reward=18.461! [2023-02-26 17:34:48,189][11531] Updated weights for policy 0, policy_version 480 (0.0030) [2023-02-26 17:34:49,841][00949] Fps is (10 sec: 3276.3, 60 sec: 3822.8, 300 sec: 3804.4). Total num frames: 1970176. Throughput: 0: 943.4. Samples: 492232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:34:49,850][00949] Avg episode reward: [(0, '19.249')] [2023-02-26 17:34:49,865][11517] Saving new best policy, reward=19.249! [2023-02-26 17:34:54,840][00949] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1994752. Throughput: 0: 992.4. Samples: 499178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:34:54,843][00949] Avg episode reward: [(0, '19.134')] [2023-02-26 17:34:57,037][11531] Updated weights for policy 0, policy_version 490 (0.0026) [2023-02-26 17:34:59,840][00949] Fps is (10 sec: 4506.3, 60 sec: 3823.3, 300 sec: 3804.4). Total num frames: 2015232. Throughput: 0: 994.8. Samples: 502728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:34:59,847][00949] Avg episode reward: [(0, '17.856')] [2023-02-26 17:35:04,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2027520. Throughput: 0: 939.4. Samples: 507252. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:35:04,850][00949] Avg episode reward: [(0, '17.557')] [2023-02-26 17:35:09,006][11531] Updated weights for policy 0, policy_version 500 (0.0033) [2023-02-26 17:35:09,842][00949] Fps is (10 sec: 3276.1, 60 sec: 3891.1, 300 sec: 3804.4). Total num frames: 2048000. Throughput: 0: 964.0. Samples: 512992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:35:09,846][00949] Avg episode reward: [(0, '16.898')] [2023-02-26 17:35:14,840][00949] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2072576. Throughput: 0: 991.7. Samples: 516558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:35:14,846][00949] Avg episode reward: [(0, '16.624')] [2023-02-26 17:35:18,276][11531] Updated weights for policy 0, policy_version 510 (0.0014) [2023-02-26 17:35:19,841][00949] Fps is (10 sec: 4505.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2093056. Throughput: 0: 980.8. Samples: 522832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:35:19,844][00949] Avg episode reward: [(0, '18.381')] [2023-02-26 17:35:24,840][00949] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2105344. Throughput: 0: 936.5. Samples: 527294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:35:24,848][00949] Avg episode reward: [(0, '19.690')] [2023-02-26 17:35:24,854][11517] Saving new best policy, reward=19.690! [2023-02-26 17:35:29,840][00949] Fps is (10 sec: 3277.3, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2125824. Throughput: 0: 948.8. Samples: 530052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:35:29,843][00949] Avg episode reward: [(0, '19.132')] [2023-02-26 17:35:29,862][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000519_2125824.pth... 
[2023-02-26 17:35:30,018][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000294_1204224.pth [2023-02-26 17:35:30,127][11531] Updated weights for policy 0, policy_version 520 (0.0029) [2023-02-26 17:35:34,840][00949] Fps is (10 sec: 4505.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2150400. Throughput: 0: 993.9. Samples: 536958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:35:34,847][00949] Avg episode reward: [(0, '17.982')] [2023-02-26 17:35:39,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2166784. Throughput: 0: 964.5. Samples: 542580. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:35:39,845][00949] Avg episode reward: [(0, '17.176')] [2023-02-26 17:35:40,318][11531] Updated weights for policy 0, policy_version 530 (0.0012) [2023-02-26 17:35:44,840][00949] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2183168. Throughput: 0: 934.7. Samples: 544792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:35:44,845][00949] Avg episode reward: [(0, '15.697')] [2023-02-26 17:35:49,843][00949] Fps is (10 sec: 3685.3, 60 sec: 3891.1, 300 sec: 3804.4). Total num frames: 2203648. Throughput: 0: 964.0. Samples: 550636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:35:49,846][00949] Avg episode reward: [(0, '15.299')] [2023-02-26 17:35:50,940][11531] Updated weights for policy 0, policy_version 540 (0.0040) [2023-02-26 17:35:54,843][00949] Fps is (10 sec: 4504.2, 60 sec: 3891.0, 300 sec: 3818.3). Total num frames: 2228224. Throughput: 0: 991.6. Samples: 557614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:35:54,847][00949] Avg episode reward: [(0, '15.449')] [2023-02-26 17:35:59,842][00949] Fps is (10 sec: 4096.2, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 2244608. Throughput: 0: 973.4. Samples: 560362. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:35:59,849][00949] Avg episode reward: [(0, '14.714')] [2023-02-26 17:36:02,016][11531] Updated weights for policy 0, policy_version 550 (0.0014) [2023-02-26 17:36:04,841][00949] Fps is (10 sec: 3277.7, 60 sec: 3891.1, 300 sec: 3790.5). Total num frames: 2260992. Throughput: 0: 933.1. Samples: 564822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:36:04,848][00949] Avg episode reward: [(0, '16.096')] [2023-02-26 17:36:09,840][00949] Fps is (10 sec: 3687.3, 60 sec: 3891.3, 300 sec: 3818.3). Total num frames: 2281472. Throughput: 0: 975.6. Samples: 571196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:36:09,846][00949] Avg episode reward: [(0, '16.980')] [2023-02-26 17:36:11,796][11531] Updated weights for policy 0, policy_version 560 (0.0025) [2023-02-26 17:36:14,840][00949] Fps is (10 sec: 4506.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2306048. Throughput: 0: 992.2. Samples: 574700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:36:14,847][00949] Avg episode reward: [(0, '17.854')] [2023-02-26 17:36:19,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3832.2). Total num frames: 2318336. Throughput: 0: 959.1. Samples: 580116. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:36:19,848][00949] Avg episode reward: [(0, '18.669')] [2023-02-26 17:36:23,727][11531] Updated weights for policy 0, policy_version 570 (0.0012) [2023-02-26 17:36:24,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 2334720. Throughput: 0: 935.2. Samples: 584666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:36:24,847][00949] Avg episode reward: [(0, '19.994')] [2023-02-26 17:36:24,948][11517] Saving new best policy, reward=19.994! [2023-02-26 17:36:29,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2359296. Throughput: 0: 961.4. Samples: 588056. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:36:29,843][00949] Avg episode reward: [(0, '20.603')] [2023-02-26 17:36:29,853][11517] Saving new best policy, reward=20.603! [2023-02-26 17:36:32,867][11531] Updated weights for policy 0, policy_version 580 (0.0030) [2023-02-26 17:36:34,840][00949] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2383872. Throughput: 0: 987.2. Samples: 595058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:36:34,846][00949] Avg episode reward: [(0, '21.056')] [2023-02-26 17:36:34,848][11517] Saving new best policy, reward=21.056! [2023-02-26 17:36:39,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2396160. Throughput: 0: 938.4. Samples: 599840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:36:39,846][00949] Avg episode reward: [(0, '22.265')] [2023-02-26 17:36:39,862][11517] Saving new best policy, reward=22.265! [2023-02-26 17:36:44,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 2412544. Throughput: 0: 926.8. Samples: 602066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:36:44,843][00949] Avg episode reward: [(0, '21.983')] [2023-02-26 17:36:45,145][11531] Updated weights for policy 0, policy_version 590 (0.0030) [2023-02-26 17:36:49,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3860.0). Total num frames: 2437120. Throughput: 0: 970.5. Samples: 608494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:36:49,848][00949] Avg episode reward: [(0, '19.805')] [2023-02-26 17:36:54,010][11531] Updated weights for policy 0, policy_version 600 (0.0022) [2023-02-26 17:36:54,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3823.2, 300 sec: 3860.0). Total num frames: 2457600. Throughput: 0: 979.6. Samples: 615278. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:36:54,844][00949] Avg episode reward: [(0, '20.691')] [2023-02-26 17:36:59,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3832.2). Total num frames: 2473984. Throughput: 0: 952.3. Samples: 617554. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:36:59,843][00949] Avg episode reward: [(0, '19.756')] [2023-02-26 17:37:04,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 2490368. Throughput: 0: 937.7. Samples: 622312. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:37:04,847][00949] Avg episode reward: [(0, '19.519')] [2023-02-26 17:37:05,977][11531] Updated weights for policy 0, policy_version 610 (0.0023) [2023-02-26 17:37:09,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2514944. Throughput: 0: 992.8. Samples: 629340. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:37:09,847][00949] Avg episode reward: [(0, '20.525')] [2023-02-26 17:37:14,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2535424. Throughput: 0: 997.6. Samples: 632948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:37:14,855][00949] Avg episode reward: [(0, '22.631')] [2023-02-26 17:37:14,862][11517] Saving new best policy, reward=22.631! [2023-02-26 17:37:15,373][11531] Updated weights for policy 0, policy_version 620 (0.0022) [2023-02-26 17:37:19,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2551808. Throughput: 0: 946.2. Samples: 637638. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:37:19,846][00949] Avg episode reward: [(0, '22.063')] [2023-02-26 17:37:24,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2568192. Throughput: 0: 957.8. Samples: 642942. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:37:24,844][00949] Avg episode reward: [(0, '22.971')] [2023-02-26 17:37:24,908][11517] Saving new best policy, reward=22.971! [2023-02-26 17:37:26,892][11531] Updated weights for policy 0, policy_version 630 (0.0023) [2023-02-26 17:37:29,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2592768. Throughput: 0: 985.1. Samples: 646394. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:37:29,847][00949] Avg episode reward: [(0, '24.020')] [2023-02-26 17:37:29,859][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000633_2592768.pth... [2023-02-26 17:37:29,975][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000406_1662976.pth [2023-02-26 17:37:29,987][11517] Saving new best policy, reward=24.020! [2023-02-26 17:37:34,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2609152. Throughput: 0: 972.2. Samples: 652242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:37:34,852][00949] Avg episode reward: [(0, '24.720')] [2023-02-26 17:37:34,858][11517] Saving new best policy, reward=24.720! [2023-02-26 17:37:39,716][11531] Updated weights for policy 0, policy_version 640 (0.0012) [2023-02-26 17:37:39,843][00949] Fps is (10 sec: 2866.2, 60 sec: 3754.4, 300 sec: 3818.3). Total num frames: 2621440. Throughput: 0: 898.3. Samples: 655704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:37:39,846][00949] Avg episode reward: [(0, '23.951')] [2023-02-26 17:37:44,841][00949] Fps is (10 sec: 2457.2, 60 sec: 3686.3, 300 sec: 3818.3). Total num frames: 2633728. Throughput: 0: 886.2. Samples: 657436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:37:44,845][00949] Avg episode reward: [(0, '22.003')] [2023-02-26 17:37:49,840][00949] Fps is (10 sec: 3278.0, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 2654208. Throughput: 0: 894.8. 
Samples: 662580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:37:49,842][00949] Avg episode reward: [(0, '23.123')] [2023-02-26 17:37:51,339][11531] Updated weights for policy 0, policy_version 650 (0.0023) [2023-02-26 17:37:54,840][00949] Fps is (10 sec: 4096.6, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2674688. Throughput: 0: 894.9. Samples: 669610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:37:54,842][00949] Avg episode reward: [(0, '22.491')] [2023-02-26 17:37:59,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3818.4). Total num frames: 2695168. Throughput: 0: 883.1. Samples: 672688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:37:59,843][00949] Avg episode reward: [(0, '22.335')] [2023-02-26 17:38:02,130][11531] Updated weights for policy 0, policy_version 660 (0.0012) [2023-02-26 17:38:04,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2707456. Throughput: 0: 877.9. Samples: 677144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:38:04,850][00949] Avg episode reward: [(0, '22.747')] [2023-02-26 17:38:09,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 2732032. Throughput: 0: 896.1. Samples: 683266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:38:09,842][00949] Avg episode reward: [(0, '23.219')] [2023-02-26 17:38:12,388][11531] Updated weights for policy 0, policy_version 670 (0.0013) [2023-02-26 17:38:14,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2752512. Throughput: 0: 896.5. Samples: 686736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:38:14,841][00949] Avg episode reward: [(0, '22.983')] [2023-02-26 17:38:19,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 2772992. Throughput: 0: 899.2. Samples: 692704. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:38:19,843][00949] Avg episode reward: [(0, '22.786')] [2023-02-26 17:38:23,829][11531] Updated weights for policy 0, policy_version 680 (0.0025) [2023-02-26 17:38:24,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2785280. Throughput: 0: 925.4. Samples: 697342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:38:24,842][00949] Avg episode reward: [(0, '22.235')] [2023-02-26 17:38:29,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 2809856. Throughput: 0: 959.3. Samples: 700602. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:38:29,842][00949] Avg episode reward: [(0, '22.004')] [2023-02-26 17:38:33,026][11531] Updated weights for policy 0, policy_version 690 (0.0013) [2023-02-26 17:38:34,840][00949] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2834432. Throughput: 0: 1001.0. Samples: 707626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:38:34,847][00949] Avg episode reward: [(0, '22.437')] [2023-02-26 17:38:39,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3804.4). Total num frames: 2846720. Throughput: 0: 960.6. Samples: 712836. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:38:39,843][00949] Avg episode reward: [(0, '23.490')] [2023-02-26 17:38:44,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 2863104. Throughput: 0: 941.7. Samples: 715064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:38:44,842][00949] Avg episode reward: [(0, '24.260')] [2023-02-26 17:38:44,994][11531] Updated weights for policy 0, policy_version 700 (0.0019) [2023-02-26 17:38:49,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2887680. Throughput: 0: 981.6. Samples: 721316. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:38:49,846][00949] Avg episode reward: [(0, '23.571')] [2023-02-26 17:38:53,631][11531] Updated weights for policy 0, policy_version 710 (0.0017) [2023-02-26 17:38:54,840][00949] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3818.4). Total num frames: 2912256. Throughput: 0: 1002.2. Samples: 728364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:38:54,849][00949] Avg episode reward: [(0, '23.685')] [2023-02-26 17:38:59,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2924544. Throughput: 0: 975.2. Samples: 730618. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:38:59,844][00949] Avg episode reward: [(0, '24.059')] [2023-02-26 17:39:04,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2940928. Throughput: 0: 944.9. Samples: 735226. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:39:04,844][00949] Avg episode reward: [(0, '22.932')] [2023-02-26 17:39:05,766][11531] Updated weights for policy 0, policy_version 720 (0.0031) [2023-02-26 17:39:09,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2965504. Throughput: 0: 993.4. Samples: 742044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:09,843][00949] Avg episode reward: [(0, '21.710')] [2023-02-26 17:39:14,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2985984. Throughput: 0: 999.7. Samples: 745588. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:39:14,846][00949] Avg episode reward: [(0, '22.795')] [2023-02-26 17:39:15,286][11531] Updated weights for policy 0, policy_version 730 (0.0015) [2023-02-26 17:39:19,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3002368. Throughput: 0: 952.6. Samples: 750494. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:39:19,848][00949] Avg episode reward: [(0, '23.458')] [2023-02-26 17:39:24,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3018752. Throughput: 0: 951.4. Samples: 755648. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:39:24,842][00949] Avg episode reward: [(0, '23.406')] [2023-02-26 17:39:26,780][11531] Updated weights for policy 0, policy_version 740 (0.0016) [2023-02-26 17:39:29,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3043328. Throughput: 0: 978.6. Samples: 759100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:29,843][00949] Avg episode reward: [(0, '24.652')] [2023-02-26 17:39:29,851][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth... [2023-02-26 17:39:29,952][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000519_2125824.pth [2023-02-26 17:39:34,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3063808. Throughput: 0: 988.4. Samples: 765792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:34,842][00949] Avg episode reward: [(0, '26.034')] [2023-02-26 17:39:34,848][11517] Saving new best policy, reward=26.034! [2023-02-26 17:39:36,935][11531] Updated weights for policy 0, policy_version 750 (0.0026) [2023-02-26 17:39:39,840][00949] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3076096. Throughput: 0: 929.3. Samples: 770184. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:39:39,848][00949] Avg episode reward: [(0, '27.059')] [2023-02-26 17:39:39,864][11517] Saving new best policy, reward=27.059! [2023-02-26 17:39:44,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3096576. Throughput: 0: 932.9. Samples: 772600. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:44,842][00949] Avg episode reward: [(0, '27.230')] [2023-02-26 17:39:44,851][11517] Saving new best policy, reward=27.230! [2023-02-26 17:39:47,820][11531] Updated weights for policy 0, policy_version 760 (0.0022) [2023-02-26 17:39:49,840][00949] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3121152. Throughput: 0: 984.8. Samples: 779542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:49,848][00949] Avg episode reward: [(0, '28.466')] [2023-02-26 17:39:49,859][11517] Saving new best policy, reward=28.466! [2023-02-26 17:39:54,847][00949] Fps is (10 sec: 4093.0, 60 sec: 3754.2, 300 sec: 3804.3). Total num frames: 3137536. Throughput: 0: 966.5. Samples: 785544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:39:54,853][00949] Avg episode reward: [(0, '27.905')] [2023-02-26 17:39:59,155][11531] Updated weights for policy 0, policy_version 770 (0.0023) [2023-02-26 17:39:59,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3153920. Throughput: 0: 936.8. Samples: 787746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:39:59,848][00949] Avg episode reward: [(0, '27.933')] [2023-02-26 17:40:04,840][00949] Fps is (10 sec: 3689.1, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3174400. Throughput: 0: 951.4. Samples: 793308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:40:04,842][00949] Avg episode reward: [(0, '28.431')] [2023-02-26 17:40:08,505][11531] Updated weights for policy 0, policy_version 780 (0.0013) [2023-02-26 17:40:09,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3198976. Throughput: 0: 992.3. Samples: 800302. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:40:09,843][00949] Avg episode reward: [(0, '28.120')] [2023-02-26 17:40:14,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). 
Total num frames: 3215360. Throughput: 0: 981.6. Samples: 803272. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:40:14,843][00949] Avg episode reward: [(0, '27.349')] [2023-02-26 17:40:19,840][00949] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3231744. Throughput: 0: 933.0. Samples: 807778. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:40:19,843][00949] Avg episode reward: [(0, '25.393')] [2023-02-26 17:40:20,556][11531] Updated weights for policy 0, policy_version 790 (0.0022) [2023-02-26 17:40:24,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3252224. Throughput: 0: 976.3. Samples: 814118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:40:24,842][00949] Avg episode reward: [(0, '24.497')] [2023-02-26 17:40:29,346][11531] Updated weights for policy 0, policy_version 800 (0.0012) [2023-02-26 17:40:29,840][00949] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3276800. Throughput: 0: 999.3. Samples: 817568. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:40:29,844][00949] Avg episode reward: [(0, '23.982')] [2023-02-26 17:40:34,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3293184. Throughput: 0: 970.8. Samples: 823230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:40:34,842][00949] Avg episode reward: [(0, '23.283')] [2023-02-26 17:40:39,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3309568. Throughput: 0: 938.1. Samples: 827752. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:40:39,848][00949] Avg episode reward: [(0, '23.223')] [2023-02-26 17:40:41,508][11531] Updated weights for policy 0, policy_version 810 (0.0027) [2023-02-26 17:40:44,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3330048. Throughput: 0: 962.7. Samples: 831066. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:40:44,847][00949] Avg episode reward: [(0, '23.361')] [2023-02-26 17:40:49,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 3354624. Throughput: 0: 994.0. Samples: 838036. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:40:49,843][00949] Avg episode reward: [(0, '23.938')] [2023-02-26 17:40:50,482][11531] Updated weights for policy 0, policy_version 820 (0.0019) [2023-02-26 17:40:54,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.7, 300 sec: 3818.3). Total num frames: 3371008. Throughput: 0: 951.5. Samples: 843118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:40:54,843][00949] Avg episode reward: [(0, '23.646')] [2023-02-26 17:40:59,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3387392. Throughput: 0: 936.7. Samples: 845424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 17:40:59,842][00949] Avg episode reward: [(0, '24.529')] [2023-02-26 17:41:02,273][11531] Updated weights for policy 0, policy_version 830 (0.0021) [2023-02-26 17:41:04,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3411968. Throughput: 0: 978.9. Samples: 851826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:41:04,847][00949] Avg episode reward: [(0, '23.187')] [2023-02-26 17:41:09,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3432448. Throughput: 0: 990.0. Samples: 858668. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:41:09,845][00949] Avg episode reward: [(0, '24.566')] [2023-02-26 17:41:12,003][11531] Updated weights for policy 0, policy_version 840 (0.0016) [2023-02-26 17:41:14,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3448832. Throughput: 0: 962.7. Samples: 860888. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:41:14,844][00949] Avg episode reward: [(0, '24.079')] [2023-02-26 17:41:19,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3465216. Throughput: 0: 941.3. Samples: 865588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:41:19,843][00949] Avg episode reward: [(0, '24.212')] [2023-02-26 17:41:23,004][11531] Updated weights for policy 0, policy_version 850 (0.0018) [2023-02-26 17:41:24,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3485696. Throughput: 0: 997.6. Samples: 872646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 17:41:24,847][00949] Avg episode reward: [(0, '24.302')] [2023-02-26 17:41:29,845][00949] Fps is (10 sec: 4503.1, 60 sec: 3890.9, 300 sec: 3818.2). Total num frames: 3510272. Throughput: 0: 1001.2. Samples: 876124. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:41:29,853][00949] Avg episode reward: [(0, '24.801')] [2023-02-26 17:41:29,866][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000857_3510272.pth... [2023-02-26 17:41:30,004][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000633_2592768.pth [2023-02-26 17:41:33,416][11531] Updated weights for policy 0, policy_version 860 (0.0017) [2023-02-26 17:41:34,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3522560. Throughput: 0: 953.4. Samples: 880940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:41:34,844][00949] Avg episode reward: [(0, '26.715')] [2023-02-26 17:41:39,840][00949] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3543040. Throughput: 0: 958.2. Samples: 886238. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:41:39,847][00949] Avg episode reward: [(0, '26.697')] [2023-02-26 17:41:43,915][11531] Updated weights for policy 0, policy_version 870 (0.0014) [2023-02-26 17:41:44,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3567616. Throughput: 0: 985.2. Samples: 889756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:41:44,843][00949] Avg episode reward: [(0, '26.616')] [2023-02-26 17:41:49,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3584000. Throughput: 0: 989.2. Samples: 896340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:41:49,846][00949] Avg episode reward: [(0, '27.098')] [2023-02-26 17:41:54,841][00949] Fps is (10 sec: 3276.4, 60 sec: 3822.8, 300 sec: 3818.3). Total num frames: 3600384. Throughput: 0: 934.9. Samples: 900740. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:41:54,844][00949] Avg episode reward: [(0, '26.956')] [2023-02-26 17:41:55,826][11531] Updated weights for policy 0, policy_version 880 (0.0024) [2023-02-26 17:41:59,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3620864. Throughput: 0: 939.9. Samples: 903184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:41:59,843][00949] Avg episode reward: [(0, '25.643')] [2023-02-26 17:42:04,840][00949] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3641344. Throughput: 0: 988.7. Samples: 910080. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:42:04,843][00949] Avg episode reward: [(0, '24.352')] [2023-02-26 17:42:05,147][11531] Updated weights for policy 0, policy_version 890 (0.0020) [2023-02-26 17:42:09,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3661824. Throughput: 0: 963.7. Samples: 916014. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:42:09,842][00949] Avg episode reward: [(0, '24.394')] [2023-02-26 17:42:14,840][00949] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3678208. Throughput: 0: 937.4. Samples: 918300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:14,843][00949] Avg episode reward: [(0, '24.538')] [2023-02-26 17:42:17,132][11531] Updated weights for policy 0, policy_version 900 (0.0012) [2023-02-26 17:42:19,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3698688. Throughput: 0: 949.3. Samples: 923658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:19,842][00949] Avg episode reward: [(0, '23.970')] [2023-02-26 17:42:24,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3719168. Throughput: 0: 987.6. Samples: 930680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:24,842][00949] Avg episode reward: [(0, '21.445')] [2023-02-26 17:42:25,933][11531] Updated weights for policy 0, policy_version 910 (0.0018) [2023-02-26 17:42:29,841][00949] Fps is (10 sec: 4095.6, 60 sec: 3823.2, 300 sec: 3832.2). Total num frames: 3739648. Throughput: 0: 978.2. Samples: 933774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 17:42:29,845][00949] Avg episode reward: [(0, '20.020')] [2023-02-26 17:42:34,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3751936. Throughput: 0: 931.9. Samples: 938276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:34,846][00949] Avg episode reward: [(0, '20.358')] [2023-02-26 17:42:38,057][11531] Updated weights for policy 0, policy_version 920 (0.0025) [2023-02-26 17:42:39,840][00949] Fps is (10 sec: 3686.8, 60 sec: 3891.2, 300 sec: 3873.9). Total num frames: 3776512. Throughput: 0: 969.0. Samples: 944344. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:42:39,843][00949] Avg episode reward: [(0, '19.347')] [2023-02-26 17:42:44,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 3796992. Throughput: 0: 991.1. Samples: 947782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:44,842][00949] Avg episode reward: [(0, '19.655')] [2023-02-26 17:42:47,112][11531] Updated weights for policy 0, policy_version 930 (0.0013) [2023-02-26 17:42:49,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3817472. Throughput: 0: 970.7. Samples: 953762. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 17:42:49,848][00949] Avg episode reward: [(0, '19.489')] [2023-02-26 17:42:54,843][00949] Fps is (10 sec: 3275.9, 60 sec: 3822.8, 300 sec: 3846.0). Total num frames: 3829760. Throughput: 0: 938.9. Samples: 958266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:54,852][00949] Avg episode reward: [(0, '21.023')] [2023-02-26 17:42:58,772][11531] Updated weights for policy 0, policy_version 940 (0.0016) [2023-02-26 17:42:59,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 3854336. Throughput: 0: 963.3. Samples: 961648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:42:59,843][00949] Avg episode reward: [(0, '21.669')] [2023-02-26 17:43:04,840][00949] Fps is (10 sec: 4916.6, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3878912. Throughput: 0: 1000.4. Samples: 968674. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:43:04,842][00949] Avg episode reward: [(0, '22.305')] [2023-02-26 17:43:08,682][11531] Updated weights for policy 0, policy_version 950 (0.0015) [2023-02-26 17:43:09,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3891200. Throughput: 0: 957.6. Samples: 973774. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:43:09,842][00949] Avg episode reward: [(0, '23.114')] [2023-02-26 17:43:14,840][00949] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3907584. Throughput: 0: 937.4. Samples: 975956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 17:43:14,842][00949] Avg episode reward: [(0, '23.901')] [2023-02-26 17:43:19,556][11531] Updated weights for policy 0, policy_version 960 (0.0032) [2023-02-26 17:43:19,840][00949] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 3932160. Throughput: 0: 977.8. Samples: 982276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 17:43:19,846][00949] Avg episode reward: [(0, '23.552')] [2023-02-26 17:43:24,840][00949] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3952640. Throughput: 0: 995.9. Samples: 989160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:43:24,844][00949] Avg episode reward: [(0, '22.094')] [2023-02-26 17:43:29,840][00949] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 3969024. Throughput: 0: 968.6. Samples: 991368. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:43:29,845][00949] Avg episode reward: [(0, '22.546')] [2023-02-26 17:43:29,861][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000969_3969024.pth... [2023-02-26 17:43:29,999][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth [2023-02-26 17:43:30,924][11531] Updated weights for policy 0, policy_version 970 (0.0024) [2023-02-26 17:43:34,840][00949] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 3985408. Throughput: 0: 933.2. Samples: 995754. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 17:43:34,842][00949] Avg episode reward: [(0, '22.773')] [2023-02-26 17:43:38,749][11517] Stopping Batcher_0... 
[2023-02-26 17:43:38,749][00949] Component Batcher_0 stopped! [2023-02-26 17:43:38,751][00949] Component RolloutWorker_w0 process died already! Don't wait for it. [2023-02-26 17:43:38,751][11517] Loop batcher_evt_loop terminating... [2023-02-26 17:43:38,757][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-26 17:43:38,792][00949] Component RolloutWorker_w4 stopped! [2023-02-26 17:43:38,795][11537] Stopping RolloutWorker_w4... [2023-02-26 17:43:38,802][11531] Weights refcount: 2 0 [2023-02-26 17:43:38,798][11537] Loop rollout_proc4_evt_loop terminating... [2023-02-26 17:43:38,816][00949] Component RolloutWorker_w2 stopped! [2023-02-26 17:43:38,820][00949] Component InferenceWorker_p0-w0 stopped! [2023-02-26 17:43:38,822][11531] Stopping InferenceWorker_p0-w0... [2023-02-26 17:43:38,823][11531] Loop inference_proc0-0_evt_loop terminating... [2023-02-26 17:43:38,827][11539] Stopping RolloutWorker_w7... [2023-02-26 17:43:38,827][11539] Loop rollout_proc7_evt_loop terminating... [2023-02-26 17:43:38,827][00949] Component RolloutWorker_w7 stopped! [2023-02-26 17:43:38,819][11534] Stopping RolloutWorker_w2... [2023-02-26 17:43:38,834][00949] Component RolloutWorker_w6 stopped! [2023-02-26 17:43:38,838][11538] Stopping RolloutWorker_w6... [2023-02-26 17:43:38,829][11534] Loop rollout_proc2_evt_loop terminating... [2023-02-26 17:43:38,839][11538] Loop rollout_proc6_evt_loop terminating... [2023-02-26 17:43:38,848][00949] Component RolloutWorker_w3 stopped! [2023-02-26 17:43:38,862][00949] Component RolloutWorker_w5 stopped! [2023-02-26 17:43:38,848][11535] Stopping RolloutWorker_w3... [2023-02-26 17:43:38,870][00949] Component RolloutWorker_w1 stopped! [2023-02-26 17:43:38,862][11536] Stopping RolloutWorker_w5... [2023-02-26 17:43:38,870][11532] Stopping RolloutWorker_w1... [2023-02-26 17:43:38,877][11535] Loop rollout_proc3_evt_loop terminating... 
[2023-02-26 17:43:38,876][11536] Loop rollout_proc5_evt_loop terminating... [2023-02-26 17:43:38,878][11532] Loop rollout_proc1_evt_loop terminating... [2023-02-26 17:43:38,940][11517] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000857_3510272.pth [2023-02-26 17:43:38,954][11517] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-26 17:43:39,083][00949] Component LearnerWorker_p0 stopped! [2023-02-26 17:43:39,090][00949] Waiting for process learner_proc0 to stop... [2023-02-26 17:43:39,096][11517] Stopping LearnerWorker_p0... [2023-02-26 17:43:39,097][11517] Loop learner_proc0_evt_loop terminating... [2023-02-26 17:43:40,788][00949] Waiting for process inference_proc0-0 to join... [2023-02-26 17:43:41,197][00949] Waiting for process rollout_proc0 to join... [2023-02-26 17:43:41,199][00949] Waiting for process rollout_proc1 to join... [2023-02-26 17:43:41,201][00949] Waiting for process rollout_proc2 to join... [2023-02-26 17:43:41,497][00949] Waiting for process rollout_proc3 to join... [2023-02-26 17:43:41,498][00949] Waiting for process rollout_proc4 to join... [2023-02-26 17:43:41,500][00949] Waiting for process rollout_proc5 to join... [2023-02-26 17:43:41,502][00949] Waiting for process rollout_proc6 to join... [2023-02-26 17:43:41,503][00949] Waiting for process rollout_proc7 to join... 
[2023-02-26 17:43:41,505][00949] Batcher 0 profile tree view:
batching: 25.3529, releasing_batches: 0.0229
[2023-02-26 17:43:41,507][00949] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0041
  wait_policy_total: 508.0234
update_model: 7.8480
  weight_update: 0.0023
one_step: 0.0023
  handle_policy_step: 504.1197
    deserialize: 14.7764, stack: 3.0017, obs_to_device_normalize: 114.4543, forward: 241.6087, send_messages: 24.3686
    prepare_outputs: 80.3506
      to_cpu: 50.1959
[2023-02-26 17:43:41,509][00949] Learner 0 profile tree view:
misc: 0.0054, prepare_batch: 15.4227
train: 73.8128
  epoch_init: 0.0058, minibatch_init: 0.0130, losses_postprocess: 0.6215, kl_divergence: 0.7066, after_optimizer: 32.5889
  calculate_losses: 25.9291
    losses_init: 0.0080, forward_head: 1.5326, bptt_initial: 17.3992, tail: 0.9942, advantages_returns: 0.2637, losses: 3.2744
    bptt: 2.1274
      bptt_forward_core: 2.0556
  update: 13.3544
    clip: 1.3574
[2023-02-26 17:43:41,510][00949] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3488, enqueue_policy_requests: 130.2636, env_step: 809.3419, overhead: 20.1892, complete_rollouts: 7.4676
save_policy_outputs: 19.8368
  split_output_tensors: 9.7299
[2023-02-26 17:43:41,512][00949] Loop Runner_EvtLoop terminating...
[2023-02-26 17:43:41,514][00949] Runner profile tree view:
main_loop: 1086.2690
[2023-02-26 17:43:41,515][00949] Collected {0: 4005888}, FPS: 3687.7
[2023-02-26 17:43:41,626][00949] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 17:43:41,628][00949] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 17:43:41,631][00949] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 17:43:41,633][00949] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 17:43:41,635][00949] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 17:43:41,636][00949] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-26 17:43:41,639][00949] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 17:43:41,640][00949] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-26 17:43:41,641][00949] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-26 17:43:41,642][00949] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-26 17:43:41,644][00949] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-26 17:43:41,645][00949] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-26 17:43:41,646][00949] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-26 17:43:41,647][00949] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-26 17:43:41,648][00949] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-26 17:43:41,676][00949] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 17:43:41,678][00949] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 17:43:41,680][00949] RunningMeanStd input shape: (1,) [2023-02-26 17:43:41,697][00949] ConvEncoder: input_channels=3 [2023-02-26 17:43:42,350][00949] Conv encoder output size: 512 [2023-02-26 17:43:42,352][00949] Policy head output size: 512 [2023-02-26 17:43:45,116][00949] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-26 17:43:47,004][00949] Num frames 100... [2023-02-26 17:43:47,208][00949] Num frames 200... [2023-02-26 17:43:47,397][00949] Num frames 300... [2023-02-26 17:43:47,576][00949] Num frames 400... [2023-02-26 17:43:47,807][00949] Num frames 500... [2023-02-26 17:43:48,019][00949] Num frames 600... 
[2023-02-26 17:43:48,205][00949] Num frames 700... [2023-02-26 17:43:48,411][00949] Num frames 800... [2023-02-26 17:43:48,581][00949] Num frames 900... [2023-02-26 17:43:48,761][00949] Num frames 1000... [2023-02-26 17:43:48,938][00949] Num frames 1100... [2023-02-26 17:43:49,095][00949] Num frames 1200... [2023-02-26 17:43:49,251][00949] Num frames 1300... [2023-02-26 17:43:49,415][00949] Num frames 1400... [2023-02-26 17:43:49,582][00949] Num frames 1500... [2023-02-26 17:43:49,756][00949] Num frames 1600... [2023-02-26 17:43:49,937][00949] Avg episode rewards: #0: 45.759, true rewards: #0: 16.760 [2023-02-26 17:43:49,939][00949] Avg episode reward: 45.759, avg true_objective: 16.760 [2023-02-26 17:43:49,980][00949] Num frames 1700... [2023-02-26 17:43:50,139][00949] Num frames 1800... [2023-02-26 17:43:50,307][00949] Num frames 1900... [2023-02-26 17:43:50,421][00949] Num frames 2000... [2023-02-26 17:43:50,537][00949] Num frames 2100... [2023-02-26 17:43:50,650][00949] Num frames 2200... [2023-02-26 17:43:50,776][00949] Num frames 2300... [2023-02-26 17:43:50,891][00949] Num frames 2400... [2023-02-26 17:43:51,005][00949] Num frames 2500... [2023-02-26 17:43:51,135][00949] Num frames 2600... [2023-02-26 17:43:51,255][00949] Num frames 2700... [2023-02-26 17:43:51,380][00949] Num frames 2800... [2023-02-26 17:43:51,497][00949] Num frames 2900... [2023-02-26 17:43:51,581][00949] Avg episode rewards: #0: 38.120, true rewards: #0: 14.620 [2023-02-26 17:43:51,583][00949] Avg episode reward: 38.120, avg true_objective: 14.620 [2023-02-26 17:43:51,672][00949] Num frames 3000... [2023-02-26 17:43:51,790][00949] Num frames 3100... [2023-02-26 17:43:51,912][00949] Num frames 3200... [2023-02-26 17:43:52,024][00949] Num frames 3300... [2023-02-26 17:43:52,141][00949] Num frames 3400... [2023-02-26 17:43:52,256][00949] Num frames 3500... [2023-02-26 17:43:52,381][00949] Num frames 3600... [2023-02-26 17:43:52,492][00949] Num frames 3700... 
[2023-02-26 17:43:52,580][00949] Avg episode rewards: #0: 31.080, true rewards: #0: 12.413 [2023-02-26 17:43:52,583][00949] Avg episode reward: 31.080, avg true_objective: 12.413 [2023-02-26 17:43:52,672][00949] Num frames 3800... [2023-02-26 17:43:52,791][00949] Num frames 3900... [2023-02-26 17:43:52,910][00949] Num frames 4000... [2023-02-26 17:43:53,021][00949] Num frames 4100... [2023-02-26 17:43:53,137][00949] Num frames 4200... [2023-02-26 17:43:53,249][00949] Num frames 4300... [2023-02-26 17:43:53,370][00949] Num frames 4400... [2023-02-26 17:43:53,482][00949] Num frames 4500... [2023-02-26 17:43:53,597][00949] Num frames 4600... [2023-02-26 17:43:53,710][00949] Num frames 4700... [2023-02-26 17:43:53,828][00949] Num frames 4800... [2023-02-26 17:43:53,945][00949] Num frames 4900... [2023-02-26 17:43:54,064][00949] Num frames 5000... [2023-02-26 17:43:54,176][00949] Num frames 5100... [2023-02-26 17:43:54,319][00949] Avg episode rewards: #0: 32.700, true rewards: #0: 12.950 [2023-02-26 17:43:54,321][00949] Avg episode reward: 32.700, avg true_objective: 12.950 [2023-02-26 17:43:54,347][00949] Num frames 5200... [2023-02-26 17:43:54,459][00949] Num frames 5300... [2023-02-26 17:43:54,575][00949] Num frames 5400... [2023-02-26 17:43:54,685][00949] Num frames 5500... [2023-02-26 17:43:54,801][00949] Num frames 5600... [2023-02-26 17:43:54,888][00949] Avg episode rewards: #0: 27.256, true rewards: #0: 11.256 [2023-02-26 17:43:54,890][00949] Avg episode reward: 27.256, avg true_objective: 11.256 [2023-02-26 17:43:54,980][00949] Num frames 5700... [2023-02-26 17:43:55,095][00949] Num frames 5800... [2023-02-26 17:43:55,209][00949] Num frames 5900... [2023-02-26 17:43:55,326][00949] Num frames 6000... [2023-02-26 17:43:55,447][00949] Num frames 6100... [2023-02-26 17:43:55,560][00949] Num frames 6200... [2023-02-26 17:43:55,670][00949] Num frames 6300... [2023-02-26 17:43:55,783][00949] Num frames 6400... 
[2023-02-26 17:43:55,852][00949] Avg episode rewards: #0: 26.018, true rewards: #0: 10.685 [2023-02-26 17:43:55,854][00949] Avg episode reward: 26.018, avg true_objective: 10.685 [2023-02-26 17:43:55,964][00949] Num frames 6500... [2023-02-26 17:43:56,075][00949] Num frames 6600... [2023-02-26 17:43:56,192][00949] Num frames 6700... [2023-02-26 17:43:56,307][00949] Num frames 6800... [2023-02-26 17:43:56,420][00949] Num frames 6900... [2023-02-26 17:43:56,538][00949] Num frames 7000... [2023-02-26 17:43:56,648][00949] Num frames 7100... [2023-02-26 17:43:56,763][00949] Num frames 7200... [2023-02-26 17:43:56,879][00949] Num frames 7300... [2023-02-26 17:43:56,999][00949] Num frames 7400... [2023-02-26 17:43:57,059][00949] Avg episode rewards: #0: 25.290, true rewards: #0: 10.576 [2023-02-26 17:43:57,061][00949] Avg episode reward: 25.290, avg true_objective: 10.576 [2023-02-26 17:43:57,170][00949] Num frames 7500... [2023-02-26 17:43:57,289][00949] Num frames 7600... [2023-02-26 17:43:57,403][00949] Num frames 7700... [2023-02-26 17:43:57,519][00949] Num frames 7800... [2023-02-26 17:43:57,633][00949] Num frames 7900... [2023-02-26 17:43:57,749][00949] Num frames 8000... [2023-02-26 17:43:57,860][00949] Num frames 8100... [2023-02-26 17:43:57,986][00949] Num frames 8200... [2023-02-26 17:43:58,100][00949] Num frames 8300... [2023-02-26 17:43:58,217][00949] Num frames 8400... [2023-02-26 17:43:58,333][00949] Num frames 8500... [2023-02-26 17:43:58,452][00949] Num frames 8600... [2023-02-26 17:43:58,620][00949] Num frames 8700... [2023-02-26 17:43:58,790][00949] Num frames 8800... [2023-02-26 17:43:58,954][00949] Num frames 8900... [2023-02-26 17:43:59,115][00949] Num frames 9000... [2023-02-26 17:43:59,277][00949] Num frames 9100... [2023-02-26 17:43:59,435][00949] Num frames 9200... [2023-02-26 17:43:59,599][00949] Num frames 9300... [2023-02-26 17:43:59,766][00949] Num frames 9400... [2023-02-26 17:43:59,924][00949] Num frames 9500... 
[2023-02-26 17:43:59,989][00949] Avg episode rewards: #0: 29.379, true rewards: #0: 11.879 [2023-02-26 17:43:59,991][00949] Avg episode reward: 29.379, avg true_objective: 11.879 [2023-02-26 17:44:00,152][00949] Num frames 9600... [2023-02-26 17:44:00,314][00949] Num frames 9700... [2023-02-26 17:44:00,474][00949] Num frames 9800... [2023-02-26 17:44:00,633][00949] Num frames 9900... [2023-02-26 17:44:00,797][00949] Num frames 10000... [2023-02-26 17:44:00,964][00949] Num frames 10100... [2023-02-26 17:44:01,139][00949] Num frames 10200... [2023-02-26 17:44:01,301][00949] Num frames 10300... [2023-02-26 17:44:01,465][00949] Num frames 10400... [2023-02-26 17:44:01,634][00949] Num frames 10500... [2023-02-26 17:44:01,797][00949] Num frames 10600... [2023-02-26 17:44:01,966][00949] Num frames 10700... [2023-02-26 17:44:02,103][00949] Num frames 10800... [2023-02-26 17:44:02,217][00949] Num frames 10900... [2023-02-26 17:44:02,330][00949] Num frames 11000... [2023-02-26 17:44:02,445][00949] Num frames 11100... [2023-02-26 17:44:02,579][00949] Avg episode rewards: #0: 30.741, true rewards: #0: 12.408 [2023-02-26 17:44:02,580][00949] Avg episode reward: 30.741, avg true_objective: 12.408 [2023-02-26 17:44:02,623][00949] Num frames 11200... [2023-02-26 17:44:02,739][00949] Num frames 11300... [2023-02-26 17:44:02,859][00949] Num frames 11400... [2023-02-26 17:44:02,969][00949] Num frames 11500... [2023-02-26 17:44:03,096][00949] Num frames 11600... [2023-02-26 17:44:03,212][00949] Num frames 11700... [2023-02-26 17:44:03,331][00949] Num frames 11800... [2023-02-26 17:44:03,441][00949] Num frames 11900... [2023-02-26 17:44:03,557][00949] Num frames 12000... [2023-02-26 17:44:03,666][00949] Num frames 12100... [2023-02-26 17:44:03,783][00949] Num frames 12200... [2023-02-26 17:44:03,894][00949] Num frames 12300... [2023-02-26 17:44:04,015][00949] Num frames 12400... 
[2023-02-26 17:44:04,103][00949] Avg episode rewards: #0: 30.828, true rewards: #0: 12.428 [2023-02-26 17:44:04,104][00949] Avg episode reward: 30.828, avg true_objective: 12.428 [2023-02-26 17:45:17,664][00949] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-26 17:50:30,280][00949] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-26 17:50:30,283][00949] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-26 17:50:30,287][00949] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-26 17:50:30,290][00949] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-26 17:50:30,292][00949] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 17:50:30,296][00949] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-26 17:50:30,299][00949] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-26 17:50:30,303][00949] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-26 17:50:30,305][00949] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-26 17:50:30,306][00949] Adding new argument 'hf_repository'='mktz/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-26 17:50:30,310][00949] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-26 17:50:30,312][00949] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-26 17:50:30,315][00949] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-26 17:50:30,317][00949] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-26 17:50:30,321][00949] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-26 17:50:30,357][00949] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 17:50:30,360][00949] RunningMeanStd input shape: (1,) [2023-02-26 17:50:30,380][00949] ConvEncoder: input_channels=3 [2023-02-26 17:50:30,439][00949] Conv encoder output size: 512 [2023-02-26 17:50:30,441][00949] Policy head output size: 512 [2023-02-26 17:50:30,472][00949] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-26 17:50:30,968][00949] Num frames 100... [2023-02-26 17:50:31,100][00949] Num frames 200... [2023-02-26 17:50:31,217][00949] Num frames 300... [2023-02-26 17:50:31,332][00949] Num frames 400... [2023-02-26 17:50:31,454][00949] Num frames 500... [2023-02-26 17:50:31,575][00949] Num frames 600... [2023-02-26 17:50:31,697][00949] Num frames 700... [2023-02-26 17:50:31,818][00949] Num frames 800... [2023-02-26 17:50:31,930][00949] Num frames 900... [2023-02-26 17:50:32,046][00949] Num frames 1000... [2023-02-26 17:50:32,192][00949] Num frames 1100... [2023-02-26 17:50:32,353][00949] Num frames 1200... [2023-02-26 17:50:32,513][00949] Num frames 1300... [2023-02-26 17:50:32,679][00949] Num frames 1400... [2023-02-26 17:50:32,841][00949] Num frames 1500... [2023-02-26 17:50:33,002][00949] Num frames 1600... [2023-02-26 17:50:33,159][00949] Num frames 1700... [2023-02-26 17:50:33,325][00949] Num frames 1800... [2023-02-26 17:50:33,481][00949] Num frames 1900... [2023-02-26 17:50:33,639][00949] Num frames 2000... [2023-02-26 17:50:33,808][00949] Num frames 2100... [2023-02-26 17:50:33,863][00949] Avg episode rewards: #0: 53.999, true rewards: #0: 21.000 [2023-02-26 17:50:33,866][00949] Avg episode reward: 53.999, avg true_objective: 21.000 [2023-02-26 17:50:34,035][00949] Num frames 2200... [2023-02-26 17:50:34,197][00949] Num frames 2300... [2023-02-26 17:50:34,373][00949] Num frames 2400... 
[2023-02-26 17:50:34,541][00949] Num frames 2500... [2023-02-26 17:50:34,708][00949] Num frames 2600... [2023-02-26 17:50:34,872][00949] Num frames 2700... [2023-02-26 17:50:35,041][00949] Num frames 2800... [2023-02-26 17:50:35,209][00949] Num frames 2900... [2023-02-26 17:50:35,379][00949] Num frames 3000... [2023-02-26 17:50:35,539][00949] Num frames 3100... [2023-02-26 17:50:35,735][00949] Avg episode rewards: #0: 37.939, true rewards: #0: 15.940 [2023-02-26 17:50:35,738][00949] Avg episode reward: 37.939, avg true_objective: 15.940 [2023-02-26 17:50:35,755][00949] Num frames 3200... [2023-02-26 17:50:35,868][00949] Num frames 3300... [2023-02-26 17:50:35,983][00949] Num frames 3400... [2023-02-26 17:50:36,096][00949] Num frames 3500... [2023-02-26 17:50:36,208][00949] Num frames 3600... [2023-02-26 17:50:36,333][00949] Num frames 3700... [2023-02-26 17:50:36,495][00949] Avg episode rewards: #0: 28.320, true rewards: #0: 12.653 [2023-02-26 17:50:36,499][00949] Avg episode reward: 28.320, avg true_objective: 12.653 [2023-02-26 17:50:36,509][00949] Num frames 3800... [2023-02-26 17:50:36,621][00949] Num frames 3900... [2023-02-26 17:50:36,732][00949] Num frames 4000... [2023-02-26 17:50:36,845][00949] Num frames 4100... [2023-02-26 17:50:36,958][00949] Num frames 4200... [2023-02-26 17:50:37,028][00949] Avg episode rewards: #0: 22.780, true rewards: #0: 10.530 [2023-02-26 17:50:37,030][00949] Avg episode reward: 22.780, avg true_objective: 10.530 [2023-02-26 17:50:37,132][00949] Num frames 4300... [2023-02-26 17:50:37,246][00949] Num frames 4400... [2023-02-26 17:50:37,371][00949] Num frames 4500... [2023-02-26 17:50:37,484][00949] Num frames 4600... [2023-02-26 17:50:37,603][00949] Num frames 4700... [2023-02-26 17:50:37,718][00949] Num frames 4800... 
[2023-02-26 17:50:37,796][00949] Avg episode rewards: #0: 20.434, true rewards: #0: 9.634 [2023-02-26 17:50:37,798][00949] Avg episode reward: 20.434, avg true_objective: 9.634 [2023-02-26 17:50:37,898][00949] Num frames 4900... [2023-02-26 17:50:38,011][00949] Num frames 5000... [2023-02-26 17:50:38,135][00949] Num frames 5100... [2023-02-26 17:50:38,249][00949] Num frames 5200... [2023-02-26 17:50:38,375][00949] Num frames 5300... [2023-02-26 17:50:38,490][00949] Num frames 5400... [2023-02-26 17:50:38,603][00949] Num frames 5500... [2023-02-26 17:50:38,716][00949] Num frames 5600... [2023-02-26 17:50:38,842][00949] Num frames 5700... [2023-02-26 17:50:38,955][00949] Num frames 5800... [2023-02-26 17:50:39,024][00949] Avg episode rewards: #0: 20.350, true rewards: #0: 9.683 [2023-02-26 17:50:39,025][00949] Avg episode reward: 20.350, avg true_objective: 9.683 [2023-02-26 17:50:39,135][00949] Num frames 5900... [2023-02-26 17:50:39,249][00949] Num frames 6000... [2023-02-26 17:50:39,373][00949] Num frames 6100... [2023-02-26 17:50:39,483][00949] Num frames 6200... [2023-02-26 17:50:39,591][00949] Num frames 6300... [2023-02-26 17:50:39,707][00949] Num frames 6400... [2023-02-26 17:50:39,829][00949] Avg episode rewards: #0: 19.511, true rewards: #0: 9.226 [2023-02-26 17:50:39,832][00949] Avg episode reward: 19.511, avg true_objective: 9.226 [2023-02-26 17:50:39,887][00949] Num frames 6500... [2023-02-26 17:50:39,999][00949] Num frames 6600... [2023-02-26 17:50:40,119][00949] Num frames 6700... [2023-02-26 17:50:40,231][00949] Num frames 6800... [2023-02-26 17:50:40,347][00949] Num frames 6900... [2023-02-26 17:50:40,472][00949] Num frames 7000... [2023-02-26 17:50:40,587][00949] Num frames 7100... [2023-02-26 17:50:40,704][00949] Num frames 7200... [2023-02-26 17:50:40,819][00949] Num frames 7300... [2023-02-26 17:50:40,938][00949] Num frames 7400... 
[2023-02-26 17:50:41,002][00949] Avg episode rewards: #0: 19.382, true rewards: #0: 9.257 [2023-02-26 17:50:41,004][00949] Avg episode reward: 19.382, avg true_objective: 9.257 [2023-02-26 17:50:41,111][00949] Num frames 7500... [2023-02-26 17:50:41,233][00949] Num frames 7600... [2023-02-26 17:50:41,354][00949] Num frames 7700... [2023-02-26 17:50:41,483][00949] Num frames 7800... [2023-02-26 17:50:41,607][00949] Num frames 7900... [2023-02-26 17:50:41,721][00949] Num frames 8000... [2023-02-26 17:50:41,845][00949] Num frames 8100... [2023-02-26 17:50:41,961][00949] Num frames 8200... [2023-02-26 17:50:42,078][00949] Num frames 8300... [2023-02-26 17:50:42,192][00949] Num frames 8400... [2023-02-26 17:50:42,312][00949] Num frames 8500... [2023-02-26 17:50:42,429][00949] Num frames 8600... [2023-02-26 17:50:42,543][00949] Num frames 8700... [2023-02-26 17:50:42,656][00949] Num frames 8800... [2023-02-26 17:50:42,769][00949] Num frames 8900... [2023-02-26 17:50:42,892][00949] Num frames 9000... [2023-02-26 17:50:43,005][00949] Num frames 9100... [2023-02-26 17:50:43,131][00949] Num frames 9200... [2023-02-26 17:50:43,263][00949] Num frames 9300... [2023-02-26 17:50:43,422][00949] Avg episode rewards: #0: 23.316, true rewards: #0: 10.428 [2023-02-26 17:50:43,424][00949] Avg episode reward: 23.316, avg true_objective: 10.428 [2023-02-26 17:50:43,444][00949] Num frames 9400... [2023-02-26 17:50:43,559][00949] Num frames 9500... [2023-02-26 17:50:43,681][00949] Num frames 9600... [2023-02-26 17:50:43,795][00949] Num frames 9700... [2023-02-26 17:50:43,914][00949] Num frames 9800... [2023-02-26 17:50:44,033][00949] Num frames 9900... [2023-02-26 17:50:44,154][00949] Num frames 10000... [2023-02-26 17:50:44,269][00949] Num frames 10100... 
[2023-02-26 17:50:44,402][00949] Avg episode rewards: #0: 22.567, true rewards: #0: 10.167 [2023-02-26 17:50:44,404][00949] Avg episode reward: 22.567, avg true_objective: 10.167 [2023-02-26 17:51:47,855][00949] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-26 17:51:51,989][00949] The model has been pushed to https://huggingface.co/mktz/rl_course_vizdoom_health_gathering_supreme [2023-02-26 17:55:55,139][00949] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-26 17:55:55,140][00949] Overriding arg 'num_workers' with value 2 passed from command line [2023-02-26 17:55:55,143][00949] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-26 17:55:55,146][00949] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-26 17:55:55,148][00949] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 17:55:55,150][00949] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-26 17:55:55,152][00949] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-26 17:55:55,154][00949] Adding new argument 'max_num_episodes'=20 that is not in the saved config file! [2023-02-26 17:55:55,156][00949] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-26 17:55:55,157][00949] Adding new argument 'hf_repository'='mktz/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-26 17:55:55,158][00949] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-26 17:55:55,159][00949] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-26 17:55:55,160][00949] Adding new argument 'train_script'=None that is not in the saved config file! 
[2023-02-26 17:55:55,161][00949] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-26 17:55:55,162][00949] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-26 17:55:55,186][00949] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 17:55:55,189][00949] RunningMeanStd input shape: (1,) [2023-02-26 17:55:55,208][00949] ConvEncoder: input_channels=3 [2023-02-26 17:55:55,253][00949] Conv encoder output size: 512 [2023-02-26 17:55:55,254][00949] Policy head output size: 512 [2023-02-26 17:55:55,278][00949] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-26 17:55:55,726][00949] Num frames 100... [2023-02-26 17:55:55,873][00949] Num frames 200... [2023-02-26 17:55:56,036][00949] Num frames 300... [2023-02-26 17:55:56,199][00949] Num frames 400... [2023-02-26 17:55:56,376][00949] Num frames 500... [2023-02-26 17:55:56,535][00949] Num frames 600... [2023-02-26 17:55:56,697][00949] Num frames 700... [2023-02-26 17:55:56,866][00949] Num frames 800... [2023-02-26 17:55:57,028][00949] Num frames 900... [2023-02-26 17:55:57,191][00949] Num frames 1000... [2023-02-26 17:55:57,357][00949] Num frames 1100... [2023-02-26 17:55:57,528][00949] Num frames 1200... [2023-02-26 17:55:57,694][00949] Num frames 1300... [2023-02-26 17:55:57,772][00949] Avg episode rewards: #0: 26.120, true rewards: #0: 13.120 [2023-02-26 17:55:57,776][00949] Avg episode reward: 26.120, avg true_objective: 13.120 [2023-02-26 17:55:57,917][00949] Num frames 1400... [2023-02-26 17:55:58,094][00949] Num frames 1500... [2023-02-26 17:55:58,258][00949] Num frames 1600... [2023-02-26 17:55:58,420][00949] Num frames 1700... [2023-02-26 17:55:58,581][00949] Num frames 1800... [2023-02-26 17:55:58,747][00949] Num frames 1900... [2023-02-26 17:55:58,923][00949] Num frames 2000... [2023-02-26 17:55:59,085][00949] Num frames 2100... [2023-02-26 17:55:59,239][00949] Num frames 2200... 
[2023-02-26 17:55:59,353][00949] Num frames 2300... [2023-02-26 17:55:59,475][00949] Num frames 2400... [2023-02-26 17:55:59,568][00949] Avg episode rewards: #0: 26.660, true rewards: #0: 12.160 [2023-02-26 17:55:59,570][00949] Avg episode reward: 26.660, avg true_objective: 12.160 [2023-02-26 17:55:59,649][00949] Num frames 2500... [2023-02-26 17:55:59,760][00949] Num frames 2600... [2023-02-26 17:55:59,881][00949] Num frames 2700... [2023-02-26 17:55:59,995][00949] Num frames 2800... [2023-02-26 17:56:00,111][00949] Num frames 2900... [2023-02-26 17:56:00,223][00949] Num frames 3000... [2023-02-26 17:56:00,336][00949] Num frames 3100... [2023-02-26 17:56:00,452][00949] Num frames 3200... [2023-02-26 17:56:00,572][00949] Num frames 3300... [2023-02-26 17:56:00,684][00949] Num frames 3400... [2023-02-26 17:56:00,793][00949] Num frames 3500... [2023-02-26 17:56:00,909][00949] Num frames 3600... [2023-02-26 17:56:01,023][00949] Num frames 3700... [2023-02-26 17:56:01,133][00949] Num frames 3800... [2023-02-26 17:56:01,253][00949] Num frames 3900... [2023-02-26 17:56:01,378][00949] Num frames 4000... [2023-02-26 17:56:01,496][00949] Num frames 4100... [2023-02-26 17:56:01,606][00949] Num frames 4200... [2023-02-26 17:56:01,699][00949] Avg episode rewards: #0: 31.747, true rewards: #0: 14.080 [2023-02-26 17:56:01,700][00949] Avg episode reward: 31.747, avg true_objective: 14.080 [2023-02-26 17:56:01,789][00949] Num frames 4300... [2023-02-26 17:56:01,906][00949] Num frames 4400... [2023-02-26 17:56:02,018][00949] Num frames 4500... [2023-02-26 17:56:02,131][00949] Num frames 4600... [2023-02-26 17:56:02,244][00949] Num frames 4700... [2023-02-26 17:56:02,358][00949] Num frames 4800... [2023-02-26 17:56:02,473][00949] Num frames 4900... [2023-02-26 17:56:02,570][00949] Avg episode rewards: #0: 27.825, true rewards: #0: 12.325 [2023-02-26 17:56:02,572][00949] Avg episode reward: 27.825, avg true_objective: 12.325 [2023-02-26 17:56:02,653][00949] Num frames 5000... 
[2023-02-26 17:56:02,767][00949] Num frames 5100... [2023-02-26 17:56:02,888][00949] Num frames 5200... [2023-02-26 17:56:03,000][00949] Num frames 5300... [2023-02-26 17:56:03,117][00949] Num frames 5400... [2023-02-26 17:56:03,234][00949] Num frames 5500... [2023-02-26 17:56:03,303][00949] Avg episode rewards: #0: 24.224, true rewards: #0: 11.024 [2023-02-26 17:56:03,305][00949] Avg episode reward: 24.224, avg true_objective: 11.024 [2023-02-26 17:56:03,405][00949] Num frames 5600... [2023-02-26 17:56:03,525][00949] Num frames 5700... [2023-02-26 17:56:03,638][00949] Num frames 5800... [2023-02-26 17:56:03,756][00949] Num frames 5900... [2023-02-26 17:56:03,875][00949] Num frames 6000... [2023-02-26 17:56:03,992][00949] Num frames 6100... [2023-02-26 17:56:04,110][00949] Num frames 6200... [2023-02-26 17:56:04,248][00949] Num frames 6300... [2023-02-26 17:56:04,408][00949] Num frames 6400... [2023-02-26 17:56:04,566][00949] Num frames 6500... [2023-02-26 17:56:04,725][00949] Num frames 6600... [2023-02-26 17:56:04,886][00949] Num frames 6700... [2023-02-26 17:56:05,046][00949] Num frames 6800... [2023-02-26 17:56:05,240][00949] Avg episode rewards: #0: 24.647, true rewards: #0: 11.480 [2023-02-26 17:56:05,242][00949] Avg episode reward: 24.647, avg true_objective: 11.480 [2023-02-26 17:56:05,265][00949] Num frames 6900... [2023-02-26 17:56:05,424][00949] Num frames 7000... [2023-02-26 17:56:05,593][00949] Num frames 7100... [2023-02-26 17:56:05,750][00949] Num frames 7200... [2023-02-26 17:56:05,908][00949] Num frames 7300... [2023-02-26 17:56:06,069][00949] Num frames 7400... [2023-02-26 17:56:06,232][00949] Num frames 7500... [2023-02-26 17:56:06,405][00949] Num frames 7600... [2023-02-26 17:56:06,566][00949] Num frames 7700... [2023-02-26 17:56:06,737][00949] Num frames 7800... [2023-02-26 17:56:06,906][00949] Num frames 7900... [2023-02-26 17:56:07,073][00949] Num frames 8000... [2023-02-26 17:56:07,242][00949] Num frames 8100... 
[2023-02-26 17:56:07,409][00949] Num frames 8200... [2023-02-26 17:56:07,573][00949] Num frames 8300... [2023-02-26 17:56:07,741][00949] Num frames 8400... [2023-02-26 17:56:07,856][00949] Num frames 8500... [2023-02-26 17:56:07,971][00949] Num frames 8600... [2023-02-26 17:56:08,089][00949] Num frames 8700... [2023-02-26 17:56:08,200][00949] Num frames 8800... [2023-02-26 17:56:08,321][00949] Num frames 8900... [2023-02-26 17:56:08,475][00949] Avg episode rewards: #0: 29.697, true rewards: #0: 12.840 [2023-02-26 17:56:08,477][00949] Avg episode reward: 29.697, avg true_objective: 12.840 [2023-02-26 17:56:08,494][00949] Num frames 9000... [2023-02-26 17:56:08,605][00949] Num frames 9100... [2023-02-26 17:56:08,734][00949] Num frames 9200... [2023-02-26 17:56:08,854][00949] Num frames 9300... [2023-02-26 17:56:08,978][00949] Num frames 9400... [2023-02-26 17:56:09,105][00949] Num frames 9500... [2023-02-26 17:56:09,236][00949] Num frames 9600... [2023-02-26 17:56:09,353][00949] Num frames 9700... [2023-02-26 17:56:09,473][00949] Num frames 9800... [2023-02-26 17:56:09,593][00949] Num frames 9900... [2023-02-26 17:56:09,668][00949] Avg episode rewards: #0: 28.020, true rewards: #0: 12.395 [2023-02-26 17:56:09,669][00949] Avg episode reward: 28.020, avg true_objective: 12.395 [2023-02-26 17:56:09,775][00949] Num frames 10000... [2023-02-26 17:56:09,894][00949] Num frames 10100... [2023-02-26 17:56:10,013][00949] Num frames 10200... [2023-02-26 17:56:10,129][00949] Num frames 10300... [2023-02-26 17:56:10,240][00949] Num frames 10400... [2023-02-26 17:56:10,360][00949] Num frames 10500... [2023-02-26 17:56:10,443][00949] Avg episode rewards: #0: 25.915, true rewards: #0: 11.693 [2023-02-26 17:56:10,445][00949] Avg episode reward: 25.915, avg true_objective: 11.693 [2023-02-26 17:56:10,530][00949] Num frames 10600... [2023-02-26 17:56:10,645][00949] Num frames 10700... [2023-02-26 17:56:10,761][00949] Num frames 10800... 
[2023-02-26 17:56:10,904][00949] Avg episode rewards: #0: 23.776, true rewards: #0: 10.876 [2023-02-26 17:56:10,906][00949] Avg episode reward: 23.776, avg true_objective: 10.876 [2023-02-26 17:56:10,937][00949] Num frames 10900... [2023-02-26 17:56:11,048][00949] Num frames 11000... [2023-02-26 17:56:11,165][00949] Num frames 11100... [2023-02-26 17:56:11,283][00949] Num frames 11200... [2023-02-26 17:56:11,394][00949] Num frames 11300... [2023-02-26 17:56:11,515][00949] Num frames 11400... [2023-02-26 17:56:11,628][00949] Num frames 11500... [2023-02-26 17:56:11,758][00949] Avg episode rewards: #0: 23.061, true rewards: #0: 10.515 [2023-02-26 17:56:11,760][00949] Avg episode reward: 23.061, avg true_objective: 10.515 [2023-02-26 17:56:11,799][00949] Num frames 11600... [2023-02-26 17:56:11,916][00949] Num frames 11700... [2023-02-26 17:56:12,030][00949] Num frames 11800... [2023-02-26 17:56:12,142][00949] Num frames 11900... [2023-02-26 17:56:12,293][00949] Avg episode rewards: #0: 21.652, true rewards: #0: 9.986 [2023-02-26 17:56:12,295][00949] Avg episode reward: 21.652, avg true_objective: 9.986 [2023-02-26 17:56:12,317][00949] Num frames 12000... [2023-02-26 17:56:12,434][00949] Num frames 12100... [2023-02-26 17:56:12,555][00949] Num frames 12200... [2023-02-26 17:56:12,680][00949] Num frames 12300... [2023-02-26 17:56:12,806][00949] Num frames 12400... [2023-02-26 17:56:12,927][00949] Num frames 12500... [2023-02-26 17:56:13,042][00949] Num frames 12600... [2023-02-26 17:56:13,159][00949] Num frames 12700... [2023-02-26 17:56:13,274][00949] Num frames 12800... [2023-02-26 17:56:13,394][00949] Num frames 12900... [2023-02-26 17:56:13,510][00949] Num frames 13000... [2023-02-26 17:56:13,620][00949] Num frames 13100... [2023-02-26 17:56:13,739][00949] Num frames 13200... 
[2023-02-26 17:56:13,907][00949] Avg episode rewards: #0: 22.611, true rewards: #0: 10.227 [2023-02-26 17:56:13,909][00949] Avg episode reward: 22.611, avg true_objective: 10.227 [2023-02-26 17:56:13,920][00949] Num frames 13300... [2023-02-26 17:56:14,032][00949] Num frames 13400... [2023-02-26 17:56:14,147][00949] Num frames 13500... [2023-02-26 17:56:14,259][00949] Num frames 13600... [2023-02-26 17:56:14,372][00949] Num frames 13700... [2023-02-26 17:56:14,494][00949] Num frames 13800... [2023-02-26 17:56:14,607][00949] Num frames 13900... [2023-02-26 17:56:14,720][00949] Num frames 14000... [2023-02-26 17:56:14,851][00949] Num frames 14100... [2023-02-26 17:56:14,969][00949] Num frames 14200... [2023-02-26 17:56:15,094][00949] Num frames 14300... [2023-02-26 17:56:15,211][00949] Num frames 14400... [2023-02-26 17:56:15,335][00949] Num frames 14500... [2023-02-26 17:56:15,450][00949] Num frames 14600... [2023-02-26 17:56:15,568][00949] Num frames 14700... [2023-02-26 17:56:15,682][00949] Num frames 14800... [2023-02-26 17:56:15,809][00949] Num frames 14900... [2023-02-26 17:56:15,978][00949] Avg episode rewards: #0: 23.851, true rewards: #0: 10.709 [2023-02-26 17:56:15,983][00949] Avg episode reward: 23.851, avg true_objective: 10.709 [2023-02-26 17:56:15,997][00949] Num frames 15000... [2023-02-26 17:56:16,122][00949] Num frames 15100... [2023-02-26 17:56:16,241][00949] Num frames 15200... [2023-02-26 17:56:16,357][00949] Num frames 15300... [2023-02-26 17:56:16,476][00949] Num frames 15400... [2023-02-26 17:56:16,595][00949] Num frames 15500... [2023-02-26 17:56:16,711][00949] Num frames 15600... [2023-02-26 17:56:16,827][00949] Num frames 15700... [2023-02-26 17:56:16,934][00949] Avg episode rewards: #0: 23.491, true rewards: #0: 10.491 [2023-02-26 17:56:16,935][00949] Avg episode reward: 23.491, avg true_objective: 10.491 [2023-02-26 17:56:17,016][00949] Num frames 15800... [2023-02-26 17:56:17,135][00949] Num frames 15900... 
[2023-02-26 17:56:17,262][00949] Num frames 16000... [2023-02-26 17:56:17,384][00949] Num frames 16100... [2023-02-26 17:56:17,498][00949] Num frames 16200... [2023-02-26 17:56:17,613][00949] Num frames 16300... [2023-02-26 17:56:17,738][00949] Num frames 16400... [2023-02-26 17:56:17,912][00949] Num frames 16500... [2023-02-26 17:56:18,065][00949] Num frames 16600... [2023-02-26 17:56:18,219][00949] Num frames 16700... [2023-02-26 17:56:18,416][00949] Avg episode rewards: #0: 23.870, true rewards: #0: 10.495 [2023-02-26 17:56:18,419][00949] Avg episode reward: 23.870, avg true_objective: 10.495 [2023-02-26 17:56:18,433][00949] Num frames 16800... [2023-02-26 17:56:18,597][00949] Num frames 16900... [2023-02-26 17:56:18,755][00949] Num frames 17000... [2023-02-26 17:56:18,924][00949] Num frames 17100... [2023-02-26 17:56:19,080][00949] Num frames 17200... [2023-02-26 17:56:19,232][00949] Num frames 17300... [2023-02-26 17:56:19,388][00949] Num frames 17400... [2023-02-26 17:56:19,551][00949] Num frames 17500... [2023-02-26 17:56:19,715][00949] Num frames 17600... [2023-02-26 17:56:19,881][00949] Num frames 17700... [2023-02-26 17:56:20,052][00949] Num frames 17800... [2023-02-26 17:56:20,215][00949] Num frames 17900... [2023-02-26 17:56:20,384][00949] Num frames 18000... [2023-02-26 17:56:20,567][00949] Num frames 18100... [2023-02-26 17:56:20,745][00949] Num frames 18200... [2023-02-26 17:56:20,916][00949] Num frames 18300... [2023-02-26 17:56:21,091][00949] Num frames 18400... [2023-02-26 17:56:21,199][00949] Avg episode rewards: #0: 25.014, true rewards: #0: 10.838 [2023-02-26 17:56:21,201][00949] Avg episode reward: 25.014, avg true_objective: 10.838 [2023-02-26 17:56:21,310][00949] Num frames 18500... [2023-02-26 17:56:21,429][00949] Num frames 18600... [2023-02-26 17:56:21,542][00949] Num frames 18700... [2023-02-26 17:56:21,659][00949] Num frames 18800... [2023-02-26 17:56:21,772][00949] Num frames 18900... 
[2023-02-26 17:56:21,883][00949] Num frames 19000... [2023-02-26 17:56:22,006][00949] Num frames 19100... [2023-02-26 17:56:22,128][00949] Num frames 19200... [2023-02-26 17:56:22,245][00949] Num frames 19300... [2023-02-26 17:56:22,361][00949] Num frames 19400... [2023-02-26 17:56:22,483][00949] Num frames 19500... [2023-02-26 17:56:22,597][00949] Num frames 19600... [2023-02-26 17:56:22,712][00949] Num frames 19700... [2023-02-26 17:56:22,824][00949] Num frames 19800... [2023-02-26 17:56:22,939][00949] Num frames 19900... [2023-02-26 17:56:23,067][00949] Num frames 20000... [2023-02-26 17:56:23,186][00949] Avg episode rewards: #0: 25.920, true rewards: #0: 11.142 [2023-02-26 17:56:23,188][00949] Avg episode reward: 25.920, avg true_objective: 11.142 [2023-02-26 17:56:23,247][00949] Num frames 20100... [2023-02-26 17:56:23,360][00949] Num frames 20200... [2023-02-26 17:56:23,475][00949] Num frames 20300... [2023-02-26 17:56:23,621][00949] Avg episode rewards: #0: 24.829, true rewards: #0: 10.724 [2023-02-26 17:56:23,623][00949] Avg episode reward: 24.829, avg true_objective: 10.724 [2023-02-26 17:56:23,655][00949] Num frames 20400... [2023-02-26 17:56:23,767][00949] Num frames 20500... [2023-02-26 17:56:23,886][00949] Num frames 20600... [2023-02-26 17:56:24,003][00949] Num frames 20700... [2023-02-26 17:56:24,127][00949] Num frames 20800... [2023-02-26 17:56:24,241][00949] Num frames 20900... [2023-02-26 17:56:24,363][00949] Num frames 21000... [2023-02-26 17:56:24,477][00949] Num frames 21100... [2023-02-26 17:56:24,593][00949] Num frames 21200... [2023-02-26 17:56:24,702][00949] Num frames 21300... [2023-02-26 17:56:24,812][00949] Num frames 21400... [2023-02-26 17:56:24,931][00949] Num frames 21500... [2023-02-26 17:56:25,051][00949] Num frames 21600... [2023-02-26 17:56:25,173][00949] Num frames 21700... [2023-02-26 17:56:25,290][00949] Num frames 21800... [2023-02-26 17:56:25,409][00949] Num frames 21900... 
[2023-02-26 17:56:25,527][00949] Num frames 22000... [2023-02-26 17:56:25,638][00949] Num frames 22100... [2023-02-26 17:56:25,755][00949] Num frames 22200... [2023-02-26 17:56:25,868][00949] Num frames 22300... [2023-02-26 17:56:25,987][00949] Num frames 22400... [2023-02-26 17:56:26,129][00949] Avg episode rewards: #0: 26.334, true rewards: #0: 11.234 [2023-02-26 17:56:26,132][00949] Avg episode reward: 26.334, avg true_objective: 11.234 [2023-02-26 17:58:43,553][00949] Replay video saved to /content/train_dir/default_experiment/replay.mp4!