[2023-02-27 09:51:24,524][00722] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-27 09:51:24,527][00722] Rollout worker 0 uses device cpu
[2023-02-27 09:51:24,528][00722] Rollout worker 1 uses device cpu
[2023-02-27 09:51:24,530][00722] Rollout worker 2 uses device cpu
[2023-02-27 09:51:24,531][00722] Rollout worker 3 uses device cpu
[2023-02-27 09:51:24,533][00722] Rollout worker 4 uses device cpu
[2023-02-27 09:51:24,534][00722] Rollout worker 5 uses device cpu
[2023-02-27 09:51:24,536][00722] Rollout worker 6 uses device cpu
[2023-02-27 09:51:24,537][00722] Rollout worker 7 uses device cpu
[2023-02-27 09:51:24,738][00722] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-27 09:51:24,740][00722] InferenceWorker_p0-w0: min num requests: 2
[2023-02-27 09:51:24,773][00722] Starting all processes...
[2023-02-27 09:51:24,774][00722] Starting process learner_proc0
[2023-02-27 09:51:24,828][00722] Starting all processes...
[2023-02-27 09:51:24,838][00722] Starting process inference_proc0-0
[2023-02-27 09:51:24,838][00722] Starting process rollout_proc0
[2023-02-27 09:51:24,840][00722] Starting process rollout_proc1
[2023-02-27 09:51:24,841][00722] Starting process rollout_proc2
[2023-02-27 09:51:24,842][00722] Starting process rollout_proc3
[2023-02-27 09:51:24,842][00722] Starting process rollout_proc4
[2023-02-27 09:51:24,842][00722] Starting process rollout_proc5
[2023-02-27 09:51:24,842][00722] Starting process rollout_proc6
[2023-02-27 09:51:24,842][00722] Starting process rollout_proc7
[2023-02-27 09:51:36,508][10610] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-27 09:51:36,515][10610] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-27 09:51:36,946][10624] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-27 09:51:36,947][10624] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-27 09:51:37,103][10625] Worker 0 uses CPU cores [0]
[2023-02-27 09:51:37,342][10627] Worker 2 uses CPU cores [0]
[2023-02-27 09:51:37,380][10626] Worker 1 uses CPU cores [1]
[2023-02-27 09:51:37,448][10628] Worker 3 uses CPU cores [1]
[2023-02-27 09:51:37,463][10630] Worker 5 uses CPU cores [1]
[2023-02-27 09:51:37,711][10632] Worker 7 uses CPU cores [1]
[2023-02-27 09:51:37,773][10629] Worker 4 uses CPU cores [0]
[2023-02-27 09:51:37,774][10631] Worker 6 uses CPU cores [0]
[2023-02-27 09:51:37,876][10610] Num visible devices: 1
[2023-02-27 09:51:37,880][10624] Num visible devices: 1
[2023-02-27 09:51:37,907][10610] Starting seed is not provided
[2023-02-27 09:51:37,907][10610] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-27 09:51:37,911][10610] Initializing actor-critic model on device cuda:0
[2023-02-27 09:51:37,912][10610] RunningMeanStd input shape: (3, 72, 128)
[2023-02-27 09:51:37,914][10610] RunningMeanStd input shape: (1,)
[2023-02-27 09:51:37,933][10610] ConvEncoder: input_channels=3
[2023-02-27 09:51:38,302][10610] Conv encoder output size: 512
[2023-02-27 09:51:38,303][10610] Policy head output size: 512
[2023-02-27 09:51:38,366][10610] Created Actor Critic model with architecture:
[2023-02-27 09:51:38,367][10610] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-27 09:51:44,730][00722] Heartbeat connected on Batcher_0
[2023-02-27 09:51:44,739][00722] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-27 09:51:44,748][00722] Heartbeat connected on RolloutWorker_w0
[2023-02-27 09:51:44,752][00722] Heartbeat connected on RolloutWorker_w1
[2023-02-27 09:51:44,756][00722] Heartbeat connected on RolloutWorker_w2
[2023-02-27 09:51:44,759][00722] Heartbeat connected on RolloutWorker_w3
[2023-02-27 09:51:44,762][00722] Heartbeat connected on RolloutWorker_w4
[2023-02-27 09:51:44,765][00722] Heartbeat connected on RolloutWorker_w5
[2023-02-27 09:51:44,769][00722] Heartbeat connected on RolloutWorker_w6
[2023-02-27 09:51:44,772][00722] Heartbeat connected on RolloutWorker_w7
[2023-02-27 09:51:46,412][10610] Using optimizer
[2023-02-27 09:51:46,413][10610] No checkpoints found
[2023-02-27 09:51:46,414][10610] Did not load from checkpoint, starting from scratch!
[2023-02-27 09:51:46,414][10610] Initialized policy 0 weights for model version 0
[2023-02-27 09:51:46,418][10610] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-27 09:51:46,427][10610] LearnerWorker_p0 finished initialization!
[2023-02-27 09:51:46,428][00722] Heartbeat connected on LearnerWorker_p0
[2023-02-27 09:51:46,623][10624] RunningMeanStd input shape: (3, 72, 128)
[2023-02-27 09:51:46,625][10624] RunningMeanStd input shape: (1,)
[2023-02-27 09:51:46,638][10624] ConvEncoder: input_channels=3
[2023-02-27 09:51:46,747][10624] Conv encoder output size: 512
[2023-02-27 09:51:46,748][10624] Policy head output size: 512
[2023-02-27 09:51:49,692][00722] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-27 09:51:49,708][00722] Inference worker 0-0 is ready!
[2023-02-27 09:51:49,710][00722] All inference workers are ready! Signal rollout workers to start!
[2023-02-27 09:51:49,824][10632] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,826][10630] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,883][10626] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,904][10631] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,900][10625] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,911][10629] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,913][10627] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:49,940][10628] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-27 09:51:50,942][10632] Decorrelating experience for 0 frames...
[2023-02-27 09:51:50,944][10630] Decorrelating experience for 0 frames...
[2023-02-27 09:51:51,268][10629] Decorrelating experience for 0 frames...
[2023-02-27 09:51:51,275][10625] Decorrelating experience for 0 frames...
[2023-02-27 09:51:52,278][10630] Decorrelating experience for 32 frames...
[2023-02-27 09:51:52,280][10632] Decorrelating experience for 32 frames...
[2023-02-27 09:51:52,593][10631] Decorrelating experience for 0 frames...
[2023-02-27 09:51:52,848][10626] Decorrelating experience for 0 frames...
[2023-02-27 09:51:52,876][10629] Decorrelating experience for 32 frames...
[2023-02-27 09:51:52,875][10628] Decorrelating experience for 0 frames...
[2023-02-27 09:51:52,902][10625] Decorrelating experience for 32 frames...
[2023-02-27 09:51:53,689][10631] Decorrelating experience for 32 frames...
[2023-02-27 09:51:53,737][10630] Decorrelating experience for 64 frames...
[2023-02-27 09:51:53,740][10632] Decorrelating experience for 64 frames...
[2023-02-27 09:51:53,783][10629] Decorrelating experience for 64 frames...
[2023-02-27 09:51:53,964][10626] Decorrelating experience for 32 frames...
[2023-02-27 09:51:54,461][10631] Decorrelating experience for 64 frames...
[2023-02-27 09:51:54,692][00722] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-27 09:51:54,771][10625] Decorrelating experience for 64 frames...
[2023-02-27 09:51:54,950][10630] Decorrelating experience for 96 frames...
[2023-02-27 09:51:54,962][10632] Decorrelating experience for 96 frames...
[2023-02-27 09:51:55,335][10626] Decorrelating experience for 64 frames...
[2023-02-27 09:51:55,379][10629] Decorrelating experience for 96 frames...
[2023-02-27 09:51:55,453][10631] Decorrelating experience for 96 frames...
[2023-02-27 09:51:55,865][10628] Decorrelating experience for 32 frames...
[2023-02-27 09:51:56,240][10626] Decorrelating experience for 96 frames...
[2023-02-27 09:51:56,457][10625] Decorrelating experience for 96 frames...
[2023-02-27 09:51:56,566][10628] Decorrelating experience for 64 frames...
[2023-02-27 09:51:56,877][10627] Decorrelating experience for 0 frames...
[2023-02-27 09:51:56,988][10628] Decorrelating experience for 96 frames...
[2023-02-27 09:51:57,279][10627] Decorrelating experience for 32 frames...
[2023-02-27 09:51:57,598][10627] Decorrelating experience for 64 frames...
[2023-02-27 09:51:57,899][10627] Decorrelating experience for 96 frames... [2023-02-27 09:51:59,692][00722] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 4.2. Samples: 42. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-27 09:52:01,391][10610] Signal inference workers to stop experience collection... [2023-02-27 09:52:01,398][10624] InferenceWorker_p0-w0: stopping experience collection [2023-02-27 09:52:04,006][10610] Signal inference workers to resume experience collection... [2023-02-27 09:52:04,007][10624] InferenceWorker_p0-w0: resuming experience collection [2023-02-27 09:52:04,692][00722] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 156.0. Samples: 2340. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-27 09:52:04,695][00722] Avg episode reward: [(0, '2.227')] [2023-02-27 09:52:09,697][00722] Fps is (10 sec: 2047.1, 60 sec: 1023.8, 300 sec: 1023.8). Total num frames: 20480. Throughput: 0: 213.2. Samples: 4264. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-27 09:52:09,705][00722] Avg episode reward: [(0, '3.578')] [2023-02-27 09:52:14,406][10624] Updated weights for policy 0, policy_version 10 (0.0011) [2023-02-27 09:52:14,694][00722] Fps is (10 sec: 3685.5, 60 sec: 1638.3, 300 sec: 1638.3). Total num frames: 40960. Throughput: 0: 390.8. Samples: 9770. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-27 09:52:14,701][00722] Avg episode reward: [(0, '4.130')] [2023-02-27 09:52:19,692][00722] Fps is (10 sec: 4507.6, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 65536. Throughput: 0: 442.9. Samples: 13288. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:52:19,694][00722] Avg episode reward: [(0, '4.521')] [2023-02-27 09:52:24,210][10624] Updated weights for policy 0, policy_version 20 (0.0016) [2023-02-27 09:52:24,693][00722] Fps is (10 sec: 4096.7, 60 sec: 2340.5, 300 sec: 2340.5). Total num frames: 81920. 
Throughput: 0: 570.3. Samples: 19962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:24,699][00722] Avg episode reward: [(0, '4.564')] [2023-02-27 09:52:29,694][00722] Fps is (10 sec: 2866.8, 60 sec: 2355.1, 300 sec: 2355.1). Total num frames: 94208. Throughput: 0: 609.7. Samples: 24388. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:29,700][00722] Avg episode reward: [(0, '4.522')] [2023-02-27 09:52:34,692][00722] Fps is (10 sec: 3277.0, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 591.9. Samples: 26636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:34,700][00722] Avg episode reward: [(0, '4.407')] [2023-02-27 09:52:34,705][10610] Saving new best policy, reward=4.407! [2023-02-27 09:52:35,826][10624] Updated weights for policy 0, policy_version 30 (0.0021) [2023-02-27 09:52:39,692][00722] Fps is (10 sec: 4506.3, 60 sec: 2785.3, 300 sec: 2785.3). Total num frames: 139264. Throughput: 0: 744.4. Samples: 33498. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:39,697][00722] Avg episode reward: [(0, '4.401')] [2023-02-27 09:52:44,692][00722] Fps is (10 sec: 4096.0, 60 sec: 2830.0, 300 sec: 2830.0). Total num frames: 155648. Throughput: 0: 881.5. Samples: 39710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:52:44,699][00722] Avg episode reward: [(0, '4.483')] [2023-02-27 09:52:44,704][10610] Saving new best policy, reward=4.483! [2023-02-27 09:52:46,293][10624] Updated weights for policy 0, policy_version 40 (0.0029) [2023-02-27 09:52:49,692][00722] Fps is (10 sec: 3276.7, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 172032. Throughput: 0: 878.5. Samples: 41874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:49,701][00722] Avg episode reward: [(0, '4.623')] [2023-02-27 09:52:49,712][10610] Saving new best policy, reward=4.623! [2023-02-27 09:52:54,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2961.7). 
Total num frames: 192512. Throughput: 0: 945.0. Samples: 46784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:52:54,694][00722] Avg episode reward: [(0, '4.768')] [2023-02-27 09:52:54,701][10610] Saving new best policy, reward=4.768! [2023-02-27 09:52:56,779][10624] Updated weights for policy 0, policy_version 50 (0.0026) [2023-02-27 09:52:59,692][00722] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3101.3). Total num frames: 217088. Throughput: 0: 979.7. Samples: 53856. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 09:52:59,695][00722] Avg episode reward: [(0, '4.583')] [2023-02-27 09:53:04,693][00722] Fps is (10 sec: 4505.4, 60 sec: 3891.2, 300 sec: 3167.6). Total num frames: 237568. Throughput: 0: 982.5. Samples: 57502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:53:04,698][00722] Avg episode reward: [(0, '4.552')] [2023-02-27 09:53:07,203][10624] Updated weights for policy 0, policy_version 60 (0.0020) [2023-02-27 09:53:09,696][00722] Fps is (10 sec: 3275.7, 60 sec: 3823.0, 300 sec: 3123.1). Total num frames: 249856. Throughput: 0: 937.2. Samples: 62140. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-27 09:53:09,700][00722] Avg episode reward: [(0, '4.770')] [2023-02-27 09:53:09,715][10610] Saving new best policy, reward=4.770! [2023-02-27 09:53:14,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3823.1, 300 sec: 3180.4). Total num frames: 270336. Throughput: 0: 963.7. Samples: 67752. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:53:14,697][00722] Avg episode reward: [(0, '4.663')] [2023-02-27 09:53:17,390][10624] Updated weights for policy 0, policy_version 70 (0.0035) [2023-02-27 09:53:19,692][00722] Fps is (10 sec: 4507.1, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 294912. Throughput: 0: 993.8. Samples: 71356. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:53:19,700][00722] Avg episode reward: [(0, '4.510')] [2023-02-27 09:53:19,709][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth... [2023-02-27 09:53:24,695][00722] Fps is (10 sec: 4504.3, 60 sec: 3891.1, 300 sec: 3319.8). Total num frames: 315392. Throughput: 0: 988.3. Samples: 77976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:53:24,702][00722] Avg episode reward: [(0, '4.405')] [2023-02-27 09:53:29,241][10624] Updated weights for policy 0, policy_version 80 (0.0029) [2023-02-27 09:53:29,694][00722] Fps is (10 sec: 3276.3, 60 sec: 3891.2, 300 sec: 3276.8). Total num frames: 327680. Throughput: 0: 945.9. Samples: 82276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:53:29,696][00722] Avg episode reward: [(0, '4.358')] [2023-02-27 09:53:34,692][00722] Fps is (10 sec: 3277.7, 60 sec: 3891.2, 300 sec: 3315.8). Total num frames: 348160. Throughput: 0: 954.8. Samples: 84838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 09:53:34,698][00722] Avg episode reward: [(0, '4.500')] [2023-02-27 09:53:38,529][10624] Updated weights for policy 0, policy_version 90 (0.0025) [2023-02-27 09:53:39,693][00722] Fps is (10 sec: 4505.6, 60 sec: 3891.1, 300 sec: 3388.5). Total num frames: 372736. Throughput: 0: 996.7. Samples: 91636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 09:53:39,700][00722] Avg episode reward: [(0, '4.524')] [2023-02-27 09:53:44,692][00722] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3383.7). Total num frames: 389120. Throughput: 0: 973.5. Samples: 97662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:53:44,698][00722] Avg episode reward: [(0, '4.666')] [2023-02-27 09:53:49,692][00722] Fps is (10 sec: 3277.2, 60 sec: 3891.2, 300 sec: 3379.2). Total num frames: 405504. Throughput: 0: 943.7. Samples: 99970. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:53:49,699][00722] Avg episode reward: [(0, '4.659')] [2023-02-27 09:53:50,929][10624] Updated weights for policy 0, policy_version 100 (0.0014) [2023-02-27 09:53:54,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3407.9). Total num frames: 425984. Throughput: 0: 953.3. Samples: 105034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:53:54,701][00722] Avg episode reward: [(0, '4.733')] [2023-02-27 09:53:59,693][00722] Fps is (10 sec: 4095.6, 60 sec: 3822.9, 300 sec: 3434.3). Total num frames: 446464. Throughput: 0: 983.2. Samples: 111998. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-27 09:53:59,700][00722] Avg episode reward: [(0, '4.623')] [2023-02-27 09:54:00,082][10624] Updated weights for policy 0, policy_version 110 (0.0015) [2023-02-27 09:54:04,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3398.2). Total num frames: 458752. Throughput: 0: 956.4. Samples: 114392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:04,695][00722] Avg episode reward: [(0, '4.584')] [2023-02-27 09:54:09,695][00722] Fps is (10 sec: 2457.2, 60 sec: 3686.5, 300 sec: 3364.5). Total num frames: 471040. Throughput: 0: 887.6. Samples: 117916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:09,697][00722] Avg episode reward: [(0, '4.479')] [2023-02-27 09:54:14,692][00722] Fps is (10 sec: 2457.5, 60 sec: 3549.9, 300 sec: 3333.3). Total num frames: 483328. Throughput: 0: 874.5. Samples: 121626. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:14,695][00722] Avg episode reward: [(0, '4.507')] [2023-02-27 09:54:15,685][10624] Updated weights for policy 0, policy_version 120 (0.0024) [2023-02-27 09:54:19,694][00722] Fps is (10 sec: 3686.5, 60 sec: 3549.7, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 887.5. Samples: 124778. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:19,702][00722] Avg episode reward: [(0, '4.685')] [2023-02-27 09:54:24,391][10624] Updated weights for policy 0, policy_version 130 (0.0013) [2023-02-27 09:54:24,694][00722] Fps is (10 sec: 4914.5, 60 sec: 3618.2, 300 sec: 3435.3). Total num frames: 532480. Throughput: 0: 893.2. Samples: 131828. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:24,696][00722] Avg episode reward: [(0, '4.756')] [2023-02-27 09:54:29,692][00722] Fps is (10 sec: 4096.9, 60 sec: 3686.5, 300 sec: 3430.4). Total num frames: 548864. Throughput: 0: 877.9. Samples: 137168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:29,695][00722] Avg episode reward: [(0, '4.785')] [2023-02-27 09:54:29,707][10610] Saving new best policy, reward=4.785! [2023-02-27 09:54:34,695][00722] Fps is (10 sec: 2866.9, 60 sec: 3549.7, 300 sec: 3400.9). Total num frames: 561152. Throughput: 0: 874.4. Samples: 139320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:34,699][00722] Avg episode reward: [(0, '4.713')] [2023-02-27 09:54:36,751][10624] Updated weights for policy 0, policy_version 140 (0.0021) [2023-02-27 09:54:39,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3445.5). Total num frames: 585728. Throughput: 0: 890.6. Samples: 145112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:54:39,694][00722] Avg episode reward: [(0, '4.915')] [2023-02-27 09:54:39,706][10610] Saving new best policy, reward=4.915! [2023-02-27 09:54:44,692][00722] Fps is (10 sec: 4916.5, 60 sec: 3686.4, 300 sec: 3487.4). Total num frames: 610304. Throughput: 0: 889.5. Samples: 152024. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 09:54:44,695][00722] Avg episode reward: [(0, '4.938')] [2023-02-27 09:54:44,698][10610] Saving new best policy, reward=4.938! 
[2023-02-27 09:54:45,964][10624] Updated weights for policy 0, policy_version 150 (0.0012) [2023-02-27 09:54:49,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3458.8). Total num frames: 622592. Throughput: 0: 892.9. Samples: 154574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:54:49,694][00722] Avg episode reward: [(0, '4.995')] [2023-02-27 09:54:49,705][10610] Saving new best policy, reward=4.995! [2023-02-27 09:54:54,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3453.9). Total num frames: 638976. Throughput: 0: 911.6. Samples: 158934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 09:54:54,697][00722] Avg episode reward: [(0, '5.039')] [2023-02-27 09:54:54,702][10610] Saving new best policy, reward=5.039! [2023-02-27 09:54:58,044][10624] Updated weights for policy 0, policy_version 160 (0.0014) [2023-02-27 09:54:59,692][00722] Fps is (10 sec: 4095.9, 60 sec: 3618.2, 300 sec: 3492.4). Total num frames: 663552. Throughput: 0: 969.4. Samples: 165248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:54:59,694][00722] Avg episode reward: [(0, '4.843')] [2023-02-27 09:55:04,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3528.9). Total num frames: 688128. Throughput: 0: 979.8. Samples: 168866. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:55:04,699][00722] Avg episode reward: [(0, '5.240')] [2023-02-27 09:55:04,704][10610] Saving new best policy, reward=5.240! [2023-02-27 09:55:07,622][10624] Updated weights for policy 0, policy_version 170 (0.0012) [2023-02-27 09:55:09,692][00722] Fps is (10 sec: 3686.5, 60 sec: 3823.1, 300 sec: 3502.1). Total num frames: 700416. Throughput: 0: 953.4. Samples: 174730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:55:09,698][00722] Avg episode reward: [(0, '5.479')] [2023-02-27 09:55:09,714][10610] Saving new best policy, reward=5.479! [2023-02-27 09:55:14,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3496.6). 
Total num frames: 716800. Throughput: 0: 931.9. Samples: 179104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:55:14,697][00722] Avg episode reward: [(0, '5.244')] [2023-02-27 09:55:19,031][10624] Updated weights for policy 0, policy_version 180 (0.0020) [2023-02-27 09:55:19,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3510.9). Total num frames: 737280. Throughput: 0: 953.7. Samples: 182236. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:55:19,699][00722] Avg episode reward: [(0, '5.476')] [2023-02-27 09:55:19,712][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth... [2023-02-27 09:55:24,696][00722] Fps is (10 sec: 4503.8, 60 sec: 3822.8, 300 sec: 3543.5). Total num frames: 761856. Throughput: 0: 978.9. Samples: 189166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:55:24,705][00722] Avg episode reward: [(0, '5.703')] [2023-02-27 09:55:24,712][10610] Saving new best policy, reward=5.703! [2023-02-27 09:55:29,695][00722] Fps is (10 sec: 3685.3, 60 sec: 3754.5, 300 sec: 3518.8). Total num frames: 774144. Throughput: 0: 930.2. Samples: 193886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:55:29,697][00722] Avg episode reward: [(0, '5.884')] [2023-02-27 09:55:29,713][10610] Saving new best policy, reward=5.884! [2023-02-27 09:55:30,385][10624] Updated weights for policy 0, policy_version 190 (0.0041) [2023-02-27 09:55:34,692][00722] Fps is (10 sec: 2458.6, 60 sec: 3754.8, 300 sec: 3495.3). Total num frames: 786432. Throughput: 0: 921.1. Samples: 196022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:34,700][00722] Avg episode reward: [(0, '6.194')] [2023-02-27 09:55:34,762][10610] Saving new best policy, reward=6.194! [2023-02-27 09:55:39,692][00722] Fps is (10 sec: 3687.5, 60 sec: 3754.7, 300 sec: 3526.1). Total num frames: 811008. Throughput: 0: 948.3. Samples: 201608. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:39,700][00722] Avg episode reward: [(0, '6.587')] [2023-02-27 09:55:39,711][10610] Saving new best policy, reward=6.587! [2023-02-27 09:55:41,064][10624] Updated weights for policy 0, policy_version 200 (0.0022) [2023-02-27 09:55:44,692][00722] Fps is (10 sec: 4915.1, 60 sec: 3754.7, 300 sec: 3555.7). Total num frames: 835584. Throughput: 0: 960.6. Samples: 208474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:44,706][00722] Avg episode reward: [(0, '6.289')] [2023-02-27 09:55:49,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3532.8). Total num frames: 847872. Throughput: 0: 937.4. Samples: 211050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:49,697][00722] Avg episode reward: [(0, '6.369')] [2023-02-27 09:55:53,049][10624] Updated weights for policy 0, policy_version 210 (0.0024) [2023-02-27 09:55:54,693][00722] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3527.6). Total num frames: 864256. Throughput: 0: 902.9. Samples: 215360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:54,695][00722] Avg episode reward: [(0, '6.508')] [2023-02-27 09:55:59,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3538.9). Total num frames: 884736. Throughput: 0: 941.6. Samples: 221474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:55:59,699][00722] Avg episode reward: [(0, '6.606')] [2023-02-27 09:55:59,779][10610] Saving new best policy, reward=6.606! [2023-02-27 09:56:02,486][10624] Updated weights for policy 0, policy_version 220 (0.0018) [2023-02-27 09:56:04,692][00722] Fps is (10 sec: 4506.0, 60 sec: 3686.4, 300 sec: 3565.9). Total num frames: 909312. Throughput: 0: 948.6. Samples: 224922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:56:04,698][00722] Avg episode reward: [(0, '7.473')] [2023-02-27 09:56:04,704][10610] Saving new best policy, reward=7.473! 
[2023-02-27 09:56:09,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3560.4). Total num frames: 925696. Throughput: 0: 922.0. Samples: 230654. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:56:09,698][00722] Avg episode reward: [(0, '7.657')] [2023-02-27 09:56:09,710][10610] Saving new best policy, reward=7.657! [2023-02-27 09:56:14,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3539.6). Total num frames: 937984. Throughput: 0: 913.8. Samples: 235004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:56:14,695][00722] Avg episode reward: [(0, '7.855')] [2023-02-27 09:56:14,697][10610] Saving new best policy, reward=7.855! [2023-02-27 09:56:15,226][10624] Updated weights for policy 0, policy_version 230 (0.0032) [2023-02-27 09:56:19,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3549.9). Total num frames: 958464. Throughput: 0: 930.0. Samples: 237872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:56:19,695][00722] Avg episode reward: [(0, '7.065')] [2023-02-27 09:56:24,189][10624] Updated weights for policy 0, policy_version 240 (0.0023) [2023-02-27 09:56:24,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3686.6, 300 sec: 3574.7). Total num frames: 983040. Throughput: 0: 956.9. Samples: 244668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:56:24,695][00722] Avg episode reward: [(0, '7.211')] [2023-02-27 09:56:29,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3569.4). Total num frames: 999424. Throughput: 0: 924.2. Samples: 250064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:56:29,695][00722] Avg episode reward: [(0, '6.952')] [2023-02-27 09:56:34,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3549.9). Total num frames: 1011712. Throughput: 0: 915.6. Samples: 252252. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:56:34,695][00722] Avg episode reward: [(0, '6.542')] [2023-02-27 09:56:36,645][10624] Updated weights for policy 0, policy_version 250 (0.0021) [2023-02-27 09:56:39,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3573.4). Total num frames: 1036288. Throughput: 0: 945.2. Samples: 257894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:56:39,699][00722] Avg episode reward: [(0, '6.640')] [2023-02-27 09:56:44,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 1060864. Throughput: 0: 963.7. Samples: 264842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:56:44,698][00722] Avg episode reward: [(0, '6.919')] [2023-02-27 09:56:45,635][10624] Updated weights for policy 0, policy_version 260 (0.0014) [2023-02-27 09:56:49,694][00722] Fps is (10 sec: 3685.7, 60 sec: 3754.6, 300 sec: 3637.8). Total num frames: 1073152. Throughput: 0: 947.3. Samples: 267554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:56:49,702][00722] Avg episode reward: [(0, '7.099')] [2023-02-27 09:56:54,692][00722] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 1089536. Throughput: 0: 916.1. Samples: 271880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:56:54,701][00722] Avg episode reward: [(0, '6.509')] [2023-02-27 09:56:58,118][10624] Updated weights for policy 0, policy_version 270 (0.0020) [2023-02-27 09:56:59,692][00722] Fps is (10 sec: 3687.1, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1110016. Throughput: 0: 948.7. Samples: 277696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:56:59,701][00722] Avg episode reward: [(0, '6.863')] [2023-02-27 09:57:04,692][00722] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1134592. Throughput: 0: 961.0. Samples: 281118. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:57:04,699][00722] Avg episode reward: [(0, '7.424')] [2023-02-27 09:57:07,974][10624] Updated weights for policy 0, policy_version 280 (0.0014) [2023-02-27 09:57:09,693][00722] Fps is (10 sec: 4095.5, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 1150976. Throughput: 0: 942.0. Samples: 287060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:57:09,702][00722] Avg episode reward: [(0, '8.202')] [2023-02-27 09:57:09,714][10610] Saving new best policy, reward=8.202! [2023-02-27 09:57:14,692][00722] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1163264. Throughput: 0: 918.4. Samples: 291394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:57:14,698][00722] Avg episode reward: [(0, '8.384')] [2023-02-27 09:57:14,702][10610] Saving new best policy, reward=8.384! [2023-02-27 09:57:19,526][10624] Updated weights for policy 0, policy_version 290 (0.0029) [2023-02-27 09:57:19,692][00722] Fps is (10 sec: 3686.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1187840. Throughput: 0: 934.4. Samples: 294300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:19,700][00722] Avg episode reward: [(0, '8.197')] [2023-02-27 09:57:19,713][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000290_1187840.pth... [2023-02-27 09:57:19,826][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth [2023-02-27 09:57:24,692][00722] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1208320. Throughput: 0: 960.9. Samples: 301134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:57:24,699][00722] Avg episode reward: [(0, '7.447')] [2023-02-27 09:57:29,694][00722] Fps is (10 sec: 3685.9, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 1224704. Throughput: 0: 923.6. Samples: 306406. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:29,698][00722] Avg episode reward: [(0, '7.491')] [2023-02-27 09:57:30,624][10624] Updated weights for policy 0, policy_version 300 (0.0014) [2023-02-27 09:57:34,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1236992. Throughput: 0: 910.7. Samples: 308532. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:34,698][00722] Avg episode reward: [(0, '7.628')] [2023-02-27 09:57:39,692][00722] Fps is (10 sec: 3686.9, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1261568. Throughput: 0: 941.7. Samples: 314254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:39,695][00722] Avg episode reward: [(0, '7.987')] [2023-02-27 09:57:41,071][10624] Updated weights for policy 0, policy_version 310 (0.0027) [2023-02-27 09:57:44,695][00722] Fps is (10 sec: 4914.0, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 1286144. Throughput: 0: 964.9. Samples: 321120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:44,697][00722] Avg episode reward: [(0, '8.116')] [2023-02-27 09:57:49,693][00722] Fps is (10 sec: 3686.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1298432. Throughput: 0: 946.4. Samples: 323708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 09:57:49,696][00722] Avg episode reward: [(0, '7.695')] [2023-02-27 09:57:52,986][10624] Updated weights for policy 0, policy_version 320 (0.0013) [2023-02-27 09:57:54,692][00722] Fps is (10 sec: 2867.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1314816. Throughput: 0: 911.4. Samples: 328072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:57:54,699][00722] Avg episode reward: [(0, '8.333')] [2023-02-27 09:57:59,692][00722] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1339392. Throughput: 0: 952.4. Samples: 334250. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:57:59,701][00722] Avg episode reward: [(0, '8.349')] [2023-02-27 09:58:03,192][10624] Updated weights for policy 0, policy_version 330 (0.0014) [2023-02-27 09:58:04,692][00722] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 1355776. Throughput: 0: 955.5. Samples: 337296. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:58:04,700][00722] Avg episode reward: [(0, '8.384')] [2023-02-27 09:58:09,692][00722] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 1363968. Throughput: 0: 893.4. Samples: 341338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:58:09,698][00722] Avg episode reward: [(0, '8.930')] [2023-02-27 09:58:09,822][10610] Saving new best policy, reward=8.930! [2023-02-27 09:58:14,692][00722] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 1380352. Throughput: 0: 854.7. Samples: 344868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:58:14,698][00722] Avg episode reward: [(0, '8.695')] [2023-02-27 09:58:18,213][10624] Updated weights for policy 0, policy_version 340 (0.0046) [2023-02-27 09:58:19,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 1396736. Throughput: 0: 858.2. Samples: 347152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:58:19,700][00722] Avg episode reward: [(0, '9.946')] [2023-02-27 09:58:19,712][10610] Saving new best policy, reward=9.946! [2023-02-27 09:58:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 1421312. Throughput: 0: 880.2. Samples: 353862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:58:24,700][00722] Avg episode reward: [(0, '11.371')] [2023-02-27 09:58:24,706][10610] Saving new best policy, reward=11.371! 
[2023-02-27 09:58:27,052][10624] Updated weights for policy 0, policy_version 350 (0.0020) [2023-02-27 09:58:29,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3707.2). Total num frames: 1441792. Throughput: 0: 870.6. Samples: 360296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:58:29,695][00722] Avg episode reward: [(0, '11.967')] [2023-02-27 09:58:29,703][10610] Saving new best policy, reward=11.967! [2023-02-27 09:58:34,693][00722] Fps is (10 sec: 3276.4, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 1454080. Throughput: 0: 861.2. Samples: 362462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:58:34,697][00722] Avg episode reward: [(0, '12.491')] [2023-02-27 09:58:34,703][10610] Saving new best policy, reward=12.491! [2023-02-27 09:58:39,389][10624] Updated weights for policy 0, policy_version 360 (0.0020) [2023-02-27 09:58:39,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 1474560. Throughput: 0: 866.8. Samples: 367078. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:58:39,695][00722] Avg episode reward: [(0, '13.296')] [2023-02-27 09:58:39,707][10610] Saving new best policy, reward=13.296! [2023-02-27 09:58:44,692][00722] Fps is (10 sec: 4096.5, 60 sec: 3481.7, 300 sec: 3693.3). Total num frames: 1495040. Throughput: 0: 884.1. Samples: 374034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:58:44,695][00722] Avg episode reward: [(0, '14.075')] [2023-02-27 09:58:44,784][10610] Saving new best policy, reward=14.075! [2023-02-27 09:58:48,850][10624] Updated weights for policy 0, policy_version 370 (0.0014) [2023-02-27 09:58:49,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3693.3). Total num frames: 1515520. Throughput: 0: 892.2. Samples: 377444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:58:49,701][00722] Avg episode reward: [(0, '14.417')] [2023-02-27 09:58:49,717][10610] Saving new best policy, reward=14.417! 
[2023-02-27 09:58:54,698][00722] Fps is (10 sec: 3684.4, 60 sec: 3617.8, 300 sec: 3679.4). Total num frames: 1531904. Throughput: 0: 905.2. Samples: 382076. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 09:58:54,701][00722] Avg episode reward: [(0, '14.536')] [2023-02-27 09:58:54,708][10610] Saving new best policy, reward=14.536! [2023-02-27 09:58:59,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 1548288. Throughput: 0: 937.2. Samples: 387042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:58:59,694][00722] Avg episode reward: [(0, '13.451')] [2023-02-27 09:59:00,982][10624] Updated weights for policy 0, policy_version 380 (0.0012) [2023-02-27 09:59:04,692][00722] Fps is (10 sec: 4098.2, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 1572864. Throughput: 0: 964.6. Samples: 390560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 09:59:04,700][00722] Avg episode reward: [(0, '12.893')] [2023-02-27 09:59:09,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1589248. Throughput: 0: 969.4. Samples: 397484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:59:09,700][00722] Avg episode reward: [(0, '11.649')] [2023-02-27 09:59:10,948][10624] Updated weights for policy 0, policy_version 390 (0.0013) [2023-02-27 09:59:14,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1605632. Throughput: 0: 924.4. Samples: 401892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 09:59:14,702][00722] Avg episode reward: [(0, '11.586')] [2023-02-27 09:59:19,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 1626112. Throughput: 0: 926.9. Samples: 404172. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:59:19,694][00722] Avg episode reward: [(0, '11.957')] [2023-02-27 09:59:19,704][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth... [2023-02-27 09:59:19,820][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth [2023-02-27 09:59:22,053][10624] Updated weights for policy 0, policy_version 400 (0.0021) [2023-02-27 09:59:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1646592. Throughput: 0: 974.9. Samples: 410950. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 09:59:24,697][00722] Avg episode reward: [(0, '12.892')] [2023-02-27 09:59:29,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1667072. Throughput: 0: 962.9. Samples: 417366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:59:29,697][00722] Avg episode reward: [(0, '13.521')] [2023-02-27 09:59:32,807][10624] Updated weights for policy 0, policy_version 410 (0.0021) [2023-02-27 09:59:34,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3721.1). Total num frames: 1683456. Throughput: 0: 936.0. Samples: 419566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:59:34,696][00722] Avg episode reward: [(0, '13.720')] [2023-02-27 09:59:39,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3707.2). Total num frames: 1703936. Throughput: 0: 939.6. Samples: 424352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:59:39,694][00722] Avg episode reward: [(0, '13.830')] [2023-02-27 09:59:43,098][10624] Updated weights for policy 0, policy_version 420 (0.0012) [2023-02-27 09:59:44,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1724416. Throughput: 0: 986.4. Samples: 431430. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:59:44,694][00722] Avg episode reward: [(0, '14.398')] [2023-02-27 09:59:49,692][00722] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1744896. Throughput: 0: 984.9. Samples: 434880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:59:49,695][00722] Avg episode reward: [(0, '15.687')] [2023-02-27 09:59:49,712][10610] Saving new best policy, reward=15.687! [2023-02-27 09:59:54,694][00722] Fps is (10 sec: 3276.3, 60 sec: 3754.9, 300 sec: 3707.2). Total num frames: 1757184. Throughput: 0: 926.4. Samples: 439172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 09:59:54,699][00722] Avg episode reward: [(0, '14.934')] [2023-02-27 09:59:55,263][10624] Updated weights for policy 0, policy_version 430 (0.0020) [2023-02-27 09:59:59,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 1777664. Throughput: 0: 942.1. Samples: 444288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 09:59:59,699][00722] Avg episode reward: [(0, '15.153')] [2023-02-27 10:00:04,499][10624] Updated weights for policy 0, policy_version 440 (0.0012) [2023-02-27 10:00:04,692][00722] Fps is (10 sec: 4506.3, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1802240. Throughput: 0: 969.8. Samples: 447812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:00:04,698][00722] Avg episode reward: [(0, '16.087')] [2023-02-27 10:00:04,702][10610] Saving new best policy, reward=16.087! [2023-02-27 10:00:09,697][00722] Fps is (10 sec: 4094.2, 60 sec: 3822.7, 300 sec: 3734.9). Total num frames: 1818624. Throughput: 0: 967.8. Samples: 454506. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:00:09,704][00722] Avg episode reward: [(0, '15.108')] [2023-02-27 10:00:14,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 1835008. Throughput: 0: 925.4. Samples: 459010. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:00:14,697][00722] Avg episode reward: [(0, '14.672')] [2023-02-27 10:00:16,942][10624] Updated weights for policy 0, policy_version 450 (0.0013) [2023-02-27 10:00:19,692][00722] Fps is (10 sec: 3688.0, 60 sec: 3822.9, 300 sec: 3707.3). Total num frames: 1855488. Throughput: 0: 925.1. Samples: 461194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:00:19,695][00722] Avg episode reward: [(0, '17.150')] [2023-02-27 10:00:19,713][10610] Saving new best policy, reward=17.150! [2023-02-27 10:00:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1875968. Throughput: 0: 973.6. Samples: 468162. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:00:24,696][00722] Avg episode reward: [(0, '17.743')] [2023-02-27 10:00:24,699][10610] Saving new best policy, reward=17.743! [2023-02-27 10:00:25,852][10624] Updated weights for policy 0, policy_version 460 (0.0014) [2023-02-27 10:00:29,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 1896448. Throughput: 0: 950.1. Samples: 474184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 10:00:29,697][00722] Avg episode reward: [(0, '17.610')] [2023-02-27 10:00:34,692][00722] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1908736. Throughput: 0: 923.6. Samples: 476442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:00:34,698][00722] Avg episode reward: [(0, '17.176')] [2023-02-27 10:00:38,128][10624] Updated weights for policy 0, policy_version 470 (0.0027) [2023-02-27 10:00:39,693][00722] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3707.2). Total num frames: 1929216. Throughput: 0: 936.6. Samples: 481320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:00:39,695][00722] Avg episode reward: [(0, '17.152')] [2023-02-27 10:00:44,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3748.9). 
Total num frames: 1953792. Throughput: 0: 979.6. Samples: 488372. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:00:44,699][00722] Avg episode reward: [(0, '16.793')] [2023-02-27 10:00:47,050][10624] Updated weights for policy 0, policy_version 480 (0.0017) [2023-02-27 10:00:49,692][00722] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1970176. Throughput: 0: 980.2. Samples: 491922. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:00:49,696][00722] Avg episode reward: [(0, '16.694')] [2023-02-27 10:00:54,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3735.0). Total num frames: 1986560. Throughput: 0: 932.5. Samples: 496466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:00:54,697][00722] Avg episode reward: [(0, '16.585')] [2023-02-27 10:00:59,263][10624] Updated weights for policy 0, policy_version 490 (0.0025) [2023-02-27 10:00:59,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 2007040. Throughput: 0: 950.0. Samples: 501760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:00:59,699][00722] Avg episode reward: [(0, '16.928')] [2023-02-27 10:01:04,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2031616. Throughput: 0: 980.4. Samples: 505310. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:01:04,694][00722] Avg episode reward: [(0, '15.765')] [2023-02-27 10:01:08,393][10624] Updated weights for policy 0, policy_version 500 (0.0017) [2023-02-27 10:01:09,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3891.5, 300 sec: 3776.7). Total num frames: 2052096. Throughput: 0: 976.5. Samples: 512106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:01:09,698][00722] Avg episode reward: [(0, '16.430')] [2023-02-27 10:01:14,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 2064384. Throughput: 0: 941.2. Samples: 516536. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:01:14,700][00722] Avg episode reward: [(0, '17.009')] [2023-02-27 10:01:19,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 2084864. Throughput: 0: 944.0. Samples: 518920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:01:19,699][00722] Avg episode reward: [(0, '19.345')] [2023-02-27 10:01:19,710][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000509_2084864.pth... [2023-02-27 10:01:19,842][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000290_1187840.pth [2023-02-27 10:01:19,857][10610] Saving new best policy, reward=19.345! [2023-02-27 10:01:20,137][10624] Updated weights for policy 0, policy_version 510 (0.0023) [2023-02-27 10:01:24,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2109440. Throughput: 0: 992.7. Samples: 525992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 10:01:24,702][00722] Avg episode reward: [(0, '18.895')] [2023-02-27 10:01:29,693][00722] Fps is (10 sec: 4095.8, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2125824. Throughput: 0: 971.0. Samples: 532066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:01:29,699][00722] Avg episode reward: [(0, '19.060')] [2023-02-27 10:01:30,001][10624] Updated weights for policy 0, policy_version 520 (0.0017) [2023-02-27 10:01:34,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 2142208. Throughput: 0: 940.1. Samples: 534226. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 10:01:34,695][00722] Avg episode reward: [(0, '18.120')] [2023-02-27 10:01:39,692][00722] Fps is (10 sec: 3686.6, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 2162688. Throughput: 0: 952.9. Samples: 539346. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 10:01:39,698][00722] Avg episode reward: [(0, '17.378')] [2023-02-27 10:01:41,168][10624] Updated weights for policy 0, policy_version 530 (0.0020) [2023-02-27 10:01:44,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 2187264. Throughput: 0: 995.3. Samples: 546548. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:01:44,695][00722] Avg episode reward: [(0, '17.278')] [2023-02-27 10:01:49,698][00722] Fps is (10 sec: 4093.8, 60 sec: 3890.9, 300 sec: 3776.6). Total num frames: 2203648. Throughput: 0: 989.5. Samples: 549844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:01:49,701][00722] Avg episode reward: [(0, '16.996')] [2023-02-27 10:01:51,666][10624] Updated weights for policy 0, policy_version 540 (0.0013) [2023-02-27 10:01:54,693][00722] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2220032. Throughput: 0: 937.5. Samples: 554294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:01:54,697][00722] Avg episode reward: [(0, '17.608')] [2023-02-27 10:01:59,692][00722] Fps is (10 sec: 3688.4, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 2240512. Throughput: 0: 961.4. Samples: 559798. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:01:59,702][00722] Avg episode reward: [(0, '17.765')] [2023-02-27 10:02:02,154][10624] Updated weights for policy 0, policy_version 550 (0.0034) [2023-02-27 10:02:04,693][00722] Fps is (10 sec: 3686.1, 60 sec: 3754.6, 300 sec: 3748.9). Total num frames: 2256896. Throughput: 0: 985.4. Samples: 563264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:02:04,695][00722] Avg episode reward: [(0, '18.004')] [2023-02-27 10:02:09,695][00722] Fps is (10 sec: 2866.5, 60 sec: 3618.0, 300 sec: 3748.9). Total num frames: 2269184. Throughput: 0: 922.9. Samples: 567526. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:02:09,697][00722] Avg episode reward: [(0, '18.162')] [2023-02-27 10:02:14,696][00722] Fps is (10 sec: 2457.0, 60 sec: 3617.9, 300 sec: 3707.2). Total num frames: 2281472. Throughput: 0: 866.7. Samples: 571070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 10:02:14,703][00722] Avg episode reward: [(0, '18.717')] [2023-02-27 10:02:18,063][10624] Updated weights for policy 0, policy_version 560 (0.0028) [2023-02-27 10:02:19,692][00722] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 2297856. Throughput: 0: 866.7. Samples: 573226. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:02:19,700][00722] Avg episode reward: [(0, '19.522')] [2023-02-27 10:02:19,714][10610] Saving new best policy, reward=19.522! [2023-02-27 10:02:24,692][00722] Fps is (10 sec: 4097.3, 60 sec: 3549.9, 300 sec: 3721.1). Total num frames: 2322432. Throughput: 0: 888.3. Samples: 579318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:02:24,695][00722] Avg episode reward: [(0, '20.759')] [2023-02-27 10:02:24,701][10610] Saving new best policy, reward=20.759! [2023-02-27 10:02:26,987][10624] Updated weights for policy 0, policy_version 570 (0.0022) [2023-02-27 10:02:29,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3748.9). Total num frames: 2342912. Throughput: 0: 882.0. Samples: 586240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:02:29,699][00722] Avg episode reward: [(0, '20.650')] [2023-02-27 10:02:34,696][00722] Fps is (10 sec: 3685.2, 60 sec: 3617.9, 300 sec: 3721.1). Total num frames: 2359296. Throughput: 0: 855.7. Samples: 588348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:02:34,698][00722] Avg episode reward: [(0, '20.368')] [2023-02-27 10:02:39,591][10624] Updated weights for policy 0, policy_version 580 (0.0022) [2023-02-27 10:02:39,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3693.4). 
Total num frames: 2375680. Throughput: 0: 855.7. Samples: 592802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:02:39,699][00722] Avg episode reward: [(0, '19.378')] [2023-02-27 10:02:44,692][00722] Fps is (10 sec: 3687.6, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 2396160. Throughput: 0: 883.7. Samples: 599566. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:02:44,695][00722] Avg episode reward: [(0, '17.418')] [2023-02-27 10:02:48,185][10624] Updated weights for policy 0, policy_version 590 (0.0016) [2023-02-27 10:02:49,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3618.5, 300 sec: 3748.9). Total num frames: 2420736. Throughput: 0: 884.2. Samples: 603054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-27 10:02:49,697][00722] Avg episode reward: [(0, '18.703')] [2023-02-27 10:02:54,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 2433024. Throughput: 0: 908.2. Samples: 608394. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:02:54,694][00722] Avg episode reward: [(0, '19.378')] [2023-02-27 10:02:59,695][00722] Fps is (10 sec: 3276.0, 60 sec: 3549.7, 300 sec: 3721.1). Total num frames: 2453504. Throughput: 0: 932.3. Samples: 613022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:02:59,699][00722] Avg episode reward: [(0, '18.903')] [2023-02-27 10:03:00,401][10624] Updated weights for policy 0, policy_version 600 (0.0020) [2023-02-27 10:03:04,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3686.5, 300 sec: 3776.7). Total num frames: 2478080. Throughput: 0: 962.9. Samples: 616556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:03:04,695][00722] Avg episode reward: [(0, '18.362')] [2023-02-27 10:03:09,217][10624] Updated weights for policy 0, policy_version 610 (0.0022) [2023-02-27 10:03:09,692][00722] Fps is (10 sec: 4506.7, 60 sec: 3823.1, 300 sec: 3790.5). Total num frames: 2498560. Throughput: 0: 988.6. Samples: 623806. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:03:09,698][00722] Avg episode reward: [(0, '18.871')] [2023-02-27 10:03:14,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3891.4, 300 sec: 3790.5). Total num frames: 2514944. Throughput: 0: 941.2. Samples: 628594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:03:14,696][00722] Avg episode reward: [(0, '18.707')] [2023-02-27 10:03:19,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2531328. Throughput: 0: 943.8. Samples: 630818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:03:19,694][00722] Avg episode reward: [(0, '18.782')] [2023-02-27 10:03:19,705][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000618_2531328.pth... [2023-02-27 10:03:19,820][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth [2023-02-27 10:03:21,215][10624] Updated weights for policy 0, policy_version 620 (0.0012) [2023-02-27 10:03:24,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 2551808. Throughput: 0: 983.0. Samples: 637038. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:03:24,695][00722] Avg episode reward: [(0, '19.052')] [2023-02-27 10:03:29,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2576384. Throughput: 0: 985.4. Samples: 643908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:03:29,694][00722] Avg episode reward: [(0, '20.250')] [2023-02-27 10:03:31,129][10624] Updated weights for policy 0, policy_version 630 (0.0013) [2023-02-27 10:03:34,692][00722] Fps is (10 sec: 3686.3, 60 sec: 3823.1, 300 sec: 3776.6). Total num frames: 2588672. Throughput: 0: 957.2. Samples: 646130. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:03:34,695][00722] Avg episode reward: [(0, '19.852')] [2023-02-27 10:03:39,693][00722] Fps is (10 sec: 2867.1, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 2605056. Throughput: 0: 939.8. Samples: 650684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:03:39,695][00722] Avg episode reward: [(0, '20.894')] [2023-02-27 10:03:39,714][10610] Saving new best policy, reward=20.894! [2023-02-27 10:03:42,375][10624] Updated weights for policy 0, policy_version 640 (0.0017) [2023-02-27 10:03:44,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 2629632. Throughput: 0: 988.2. Samples: 657488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:03:44,695][00722] Avg episode reward: [(0, '20.303')] [2023-02-27 10:03:49,693][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 2650112. Throughput: 0: 986.3. Samples: 660942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:03:49,696][00722] Avg episode reward: [(0, '22.249')] [2023-02-27 10:03:49,716][10610] Saving new best policy, reward=22.249! [2023-02-27 10:03:52,841][10624] Updated weights for policy 0, policy_version 650 (0.0014) [2023-02-27 10:03:54,693][00722] Fps is (10 sec: 3686.1, 60 sec: 3891.1, 300 sec: 3790.5). Total num frames: 2666496. Throughput: 0: 940.3. Samples: 666120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:03:54,699][00722] Avg episode reward: [(0, '22.392')] [2023-02-27 10:03:54,701][10610] Saving new best policy, reward=22.392! [2023-02-27 10:03:59,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3823.1, 300 sec: 3762.8). Total num frames: 2682880. Throughput: 0: 934.7. Samples: 670656. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:03:59,694][00722] Avg episode reward: [(0, '22.198')] [2023-02-27 10:04:03,661][10624] Updated weights for policy 0, policy_version 660 (0.0017) [2023-02-27 10:04:04,692][00722] Fps is (10 sec: 4096.5, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2707456. Throughput: 0: 963.3. Samples: 674166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:04:04,695][00722] Avg episode reward: [(0, '21.066')] [2023-02-27 10:04:09,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2727936. Throughput: 0: 981.1. Samples: 681188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:04:09,695][00722] Avg episode reward: [(0, '19.282')] [2023-02-27 10:04:14,592][10624] Updated weights for policy 0, policy_version 670 (0.0013) [2023-02-27 10:04:14,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2744320. Throughput: 0: 933.4. Samples: 685910. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 10:04:14,705][00722] Avg episode reward: [(0, '18.793')] [2023-02-27 10:04:19,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2760704. Throughput: 0: 932.1. Samples: 688076. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:04:19,695][00722] Avg episode reward: [(0, '19.084')] [2023-02-27 10:04:24,596][10624] Updated weights for policy 0, policy_version 680 (0.0020) [2023-02-27 10:04:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 2785280. Throughput: 0: 977.6. Samples: 694676. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 10:04:24,699][00722] Avg episode reward: [(0, '18.464')] [2023-02-27 10:04:29,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2805760. Throughput: 0: 975.8. Samples: 701398. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:04:29,697][00722] Avg episode reward: [(0, '19.946')] [2023-02-27 10:04:34,693][00722] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2818048. Throughput: 0: 947.7. Samples: 703588. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:04:34,696][00722] Avg episode reward: [(0, '19.827')] [2023-02-27 10:04:36,611][10624] Updated weights for policy 0, policy_version 690 (0.0014) [2023-02-27 10:04:39,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 2838528. Throughput: 0: 932.2. Samples: 708066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:04:39,695][00722] Avg episode reward: [(0, '19.592')] [2023-02-27 10:04:44,692][00722] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2859008. Throughput: 0: 984.9. Samples: 714976. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:04:44,695][00722] Avg episode reward: [(0, '19.857')] [2023-02-27 10:04:45,908][10624] Updated weights for policy 0, policy_version 700 (0.0017) [2023-02-27 10:04:49,692][00722] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2879488. Throughput: 0: 983.5. Samples: 718424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:04:49,702][00722] Avg episode reward: [(0, '22.445')] [2023-02-27 10:04:49,722][10610] Saving new best policy, reward=22.445! [2023-02-27 10:04:54,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 2895872. Throughput: 0: 937.2. Samples: 723360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:04:54,696][00722] Avg episode reward: [(0, '23.526')] [2023-02-27 10:04:54,706][10610] Saving new best policy, reward=23.526! [2023-02-27 10:04:58,205][10624] Updated weights for policy 0, policy_version 710 (0.0027) [2023-02-27 10:04:59,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3762.8). 
Total num frames: 2912256. Throughput: 0: 937.0. Samples: 728076. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:04:59,694][00722] Avg episode reward: [(0, '22.251')] [2023-02-27 10:05:04,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 2936832. Throughput: 0: 966.1. Samples: 731550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:05:04,695][00722] Avg episode reward: [(0, '23.794')] [2023-02-27 10:05:04,698][10610] Saving new best policy, reward=23.794! [2023-02-27 10:05:07,103][10624] Updated weights for policy 0, policy_version 720 (0.0011) [2023-02-27 10:05:09,692][00722] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2957312. Throughput: 0: 976.6. Samples: 738624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:09,696][00722] Avg episode reward: [(0, '24.336')] [2023-02-27 10:05:09,711][10610] Saving new best policy, reward=24.336! [2023-02-27 10:05:14,697][00722] Fps is (10 sec: 3275.4, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 2969600. Throughput: 0: 927.2. Samples: 743128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:14,703][00722] Avg episode reward: [(0, '23.435')] [2023-02-27 10:05:19,430][10624] Updated weights for policy 0, policy_version 730 (0.0049) [2023-02-27 10:05:19,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 2990080. Throughput: 0: 929.3. Samples: 745408. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:05:19,700][00722] Avg episode reward: [(0, '20.883')] [2023-02-27 10:05:19,713][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000730_2990080.pth... [2023-02-27 10:05:19,862][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000509_2084864.pth [2023-02-27 10:05:24,693][00722] Fps is (10 sec: 4507.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3014656. Throughput: 0: 976.9. 
Samples: 752028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:24,700][00722] Avg episode reward: [(0, '21.872')] [2023-02-27 10:05:28,082][10624] Updated weights for policy 0, policy_version 740 (0.0021) [2023-02-27 10:05:29,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3035136. Throughput: 0: 972.6. Samples: 758744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:29,698][00722] Avg episode reward: [(0, '22.012')] [2023-02-27 10:05:34,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 3047424. Throughput: 0: 945.1. Samples: 760954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:34,698][00722] Avg episode reward: [(0, '23.669')] [2023-02-27 10:05:39,692][00722] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3067904. Throughput: 0: 933.9. Samples: 765386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:39,694][00722] Avg episode reward: [(0, '23.939')] [2023-02-27 10:05:40,548][10624] Updated weights for policy 0, policy_version 750 (0.0022) [2023-02-27 10:05:44,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3088384. Throughput: 0: 984.7. Samples: 772386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:05:44,698][00722] Avg episode reward: [(0, '23.545')] [2023-02-27 10:05:49,692][00722] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3108864. Throughput: 0: 987.0. Samples: 775964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:05:49,695][00722] Avg episode reward: [(0, '22.774')] [2023-02-27 10:05:49,851][10624] Updated weights for policy 0, policy_version 760 (0.0018) [2023-02-27 10:05:54,697][00722] Fps is (10 sec: 3684.8, 60 sec: 3822.6, 300 sec: 3790.5). Total num frames: 3125248. Throughput: 0: 938.9. Samples: 780880. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:05:54,699][00722] Avg episode reward: [(0, '21.078')] [2023-02-27 10:05:59,692][00722] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3141632. Throughput: 0: 948.0. Samples: 785786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:05:59,694][00722] Avg episode reward: [(0, '20.514')] [2023-02-27 10:06:01,492][10624] Updated weights for policy 0, policy_version 770 (0.0019) [2023-02-27 10:06:04,692][00722] Fps is (10 sec: 4097.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3166208. Throughput: 0: 973.1. Samples: 789198. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-27 10:06:04,701][00722] Avg episode reward: [(0, '19.730')] [2023-02-27 10:06:09,695][00722] Fps is (10 sec: 3685.5, 60 sec: 3686.3, 300 sec: 3776.6). Total num frames: 3178496. Throughput: 0: 950.0. Samples: 794778. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:06:09,697][00722] Avg episode reward: [(0, '20.404')] [2023-02-27 10:06:14,692][00722] Fps is (10 sec: 2457.6, 60 sec: 3686.7, 300 sec: 3748.9). Total num frames: 3190784. Throughput: 0: 877.9. Samples: 798248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:06:14,698][00722] Avg episode reward: [(0, '21.587')] [2023-02-27 10:06:15,211][10624] Updated weights for policy 0, policy_version 780 (0.0030) [2023-02-27 10:06:19,692][00722] Fps is (10 sec: 2458.2, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 3203072. Throughput: 0: 869.7. Samples: 800090. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:06:19,696][00722] Avg episode reward: [(0, '22.512')] [2023-02-27 10:06:24,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 3227648. Throughput: 0: 889.3. Samples: 805404. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:06:24,700][00722] Avg episode reward: [(0, '21.883')] [2023-02-27 10:06:26,218][10624] Updated weights for policy 0, policy_version 790 (0.0028) [2023-02-27 10:06:29,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3252224. Throughput: 0: 892.0. Samples: 812524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:06:29,698][00722] Avg episode reward: [(0, '21.163')] [2023-02-27 10:06:34,692][00722] Fps is (10 sec: 4095.8, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 3268608. Throughput: 0: 881.1. Samples: 815614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:06:34,696][00722] Avg episode reward: [(0, '21.828')] [2023-02-27 10:06:37,062][10624] Updated weights for policy 0, policy_version 800 (0.0018) [2023-02-27 10:06:39,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 3280896. Throughput: 0: 871.8. Samples: 820108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:06:39,700][00722] Avg episode reward: [(0, '22.402')] [2023-02-27 10:06:44,692][00722] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3735.1). Total num frames: 3305472. Throughput: 0: 893.5. Samples: 825992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:06:44,700][00722] Avg episode reward: [(0, '22.594')] [2023-02-27 10:06:47,058][10624] Updated weights for policy 0, policy_version 810 (0.0046) [2023-02-27 10:06:49,692][00722] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3325952. Throughput: 0: 895.5. Samples: 829494. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:06:49,700][00722] Avg episode reward: [(0, '23.550')] [2023-02-27 10:06:54,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3686.7, 300 sec: 3748.9). Total num frames: 3346432. Throughput: 0: 910.7. Samples: 835756. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:06:54,699][00722] Avg episode reward: [(0, '24.028')] [2023-02-27 10:06:58,674][10624] Updated weights for policy 0, policy_version 820 (0.0011) [2023-02-27 10:06:59,693][00722] Fps is (10 sec: 3276.6, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3358720. Throughput: 0: 934.6. Samples: 840306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-27 10:06:59,696][00722] Avg episode reward: [(0, '24.570')] [2023-02-27 10:06:59,707][10610] Saving new best policy, reward=24.570! [2023-02-27 10:07:04,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 3383296. Throughput: 0: 952.9. Samples: 842972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:07:04,698][00722] Avg episode reward: [(0, '22.447')] [2023-02-27 10:07:08,000][10624] Updated weights for policy 0, policy_version 830 (0.0025) [2023-02-27 10:07:09,692][00722] Fps is (10 sec: 4505.9, 60 sec: 3754.8, 300 sec: 3804.5). Total num frames: 3403776. Throughput: 0: 995.7. Samples: 850212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:07:09,695][00722] Avg episode reward: [(0, '21.608')] [2023-02-27 10:07:14,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3424256. Throughput: 0: 965.2. Samples: 855956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-27 10:07:14,695][00722] Avg episode reward: [(0, '20.235')] [2023-02-27 10:07:19,692][00722] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3436544. Throughput: 0: 948.4. Samples: 858294. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:07:19,701][00722] Avg episode reward: [(0, '20.827')] [2023-02-27 10:07:19,714][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000839_3436544.pth... 
[2023-02-27 10:07:19,882][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000618_2531328.pth [2023-02-27 10:07:20,025][10624] Updated weights for policy 0, policy_version 840 (0.0026) [2023-02-27 10:07:24,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3461120. Throughput: 0: 970.8. Samples: 863794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:07:24,701][00722] Avg episode reward: [(0, '20.581')] [2023-02-27 10:07:28,840][10624] Updated weights for policy 0, policy_version 850 (0.0017) [2023-02-27 10:07:29,692][00722] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 3481600. Throughput: 0: 1000.0. Samples: 870992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:07:29,694][00722] Avg episode reward: [(0, '22.396')] [2023-02-27 10:07:34,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3502080. Throughput: 0: 986.1. Samples: 873868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:07:34,698][00722] Avg episode reward: [(0, '22.570')] [2023-02-27 10:07:39,694][00722] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3514368. Throughput: 0: 947.5. Samples: 878392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:07:39,697][00722] Avg episode reward: [(0, '22.718')] [2023-02-27 10:07:40,795][10624] Updated weights for policy 0, policy_version 860 (0.0021) [2023-02-27 10:07:44,692][00722] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3538944. Throughput: 0: 980.0. Samples: 884404. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:07:44,698][00722] Avg episode reward: [(0, '23.216')] [2023-02-27 10:07:49,545][10624] Updated weights for policy 0, policy_version 870 (0.0018) [2023-02-27 10:07:49,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3563520. Throughput: 0: 1000.0. 
Samples: 887972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:07:49,697][00722] Avg episode reward: [(0, '24.131')] [2023-02-27 10:07:54,693][00722] Fps is (10 sec: 3686.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3575808. Throughput: 0: 971.5. Samples: 893930. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:07:54,697][00722] Avg episode reward: [(0, '23.466')] [2023-02-27 10:07:59,693][00722] Fps is (10 sec: 2867.1, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 3592192. Throughput: 0: 944.3. Samples: 898450. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:07:59,697][00722] Avg episode reward: [(0, '24.317')] [2023-02-27 10:08:01,906][10624] Updated weights for policy 0, policy_version 880 (0.0035) [2023-02-27 10:08:04,692][00722] Fps is (10 sec: 4096.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3616768. Throughput: 0: 957.9. Samples: 901398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:08:04,699][00722] Avg episode reward: [(0, '23.449')] [2023-02-27 10:08:09,692][00722] Fps is (10 sec: 4505.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3637248. Throughput: 0: 994.0. Samples: 908522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:08:09,699][00722] Avg episode reward: [(0, '22.946')] [2023-02-27 10:08:10,919][10624] Updated weights for policy 0, policy_version 890 (0.0012) [2023-02-27 10:08:14,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3653632. Throughput: 0: 959.1. Samples: 914150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:08:14,694][00722] Avg episode reward: [(0, '21.790')] [2023-02-27 10:08:19,693][00722] Fps is (10 sec: 3276.5, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3670016. Throughput: 0: 945.4. Samples: 916410. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:08:19,697][00722] Avg episode reward: [(0, '22.489')] [2023-02-27 10:08:22,577][10624] Updated weights for policy 0, policy_version 900 (0.0013) [2023-02-27 10:08:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3694592. Throughput: 0: 973.9. Samples: 922218. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:08:24,695][00722] Avg episode reward: [(0, '23.155')] [2023-02-27 10:08:29,692][00722] Fps is (10 sec: 4915.7, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3719168. Throughput: 0: 1001.7. Samples: 929480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-27 10:08:29,695][00722] Avg episode reward: [(0, '23.773')] [2023-02-27 10:08:31,693][10624] Updated weights for policy 0, policy_version 910 (0.0018) [2023-02-27 10:08:34,693][00722] Fps is (10 sec: 4095.7, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 3735552. Throughput: 0: 983.0. Samples: 932210. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:08:34,697][00722] Avg episode reward: [(0, '23.885')] [2023-02-27 10:08:39,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3747840. Throughput: 0: 950.9. Samples: 936720. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-27 10:08:39,698][00722] Avg episode reward: [(0, '23.953')] [2023-02-27 10:08:43,473][10624] Updated weights for policy 0, policy_version 920 (0.0016) [2023-02-27 10:08:44,692][00722] Fps is (10 sec: 3686.7, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3772416. Throughput: 0: 989.3. Samples: 942968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:08:44,700][00722] Avg episode reward: [(0, '23.252')] [2023-02-27 10:08:49,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3796992. Throughput: 0: 1002.0. Samples: 946490. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:08:49,698][00722] Avg episode reward: [(0, '21.861')] [2023-02-27 10:08:53,448][10624] Updated weights for policy 0, policy_version 930 (0.0026) [2023-02-27 10:08:54,695][00722] Fps is (10 sec: 3685.4, 60 sec: 3891.1, 300 sec: 3818.3). Total num frames: 3809280. Throughput: 0: 972.0. Samples: 952264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:08:54,699][00722] Avg episode reward: [(0, '21.056')] [2023-02-27 10:08:59,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3825664. Throughput: 0: 948.1. Samples: 956814. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:08:59,695][00722] Avg episode reward: [(0, '22.366')] [2023-02-27 10:09:04,144][10624] Updated weights for policy 0, policy_version 940 (0.0012) [2023-02-27 10:09:04,692][00722] Fps is (10 sec: 4097.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3850240. Throughput: 0: 970.2. Samples: 960068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:09:04,695][00722] Avg episode reward: [(0, '23.928')] [2023-02-27 10:09:09,697][00722] Fps is (10 sec: 4912.8, 60 sec: 3959.1, 300 sec: 3832.1). Total num frames: 3874816. Throughput: 0: 1000.3. Samples: 967236. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:09:09,700][00722] Avg episode reward: [(0, '25.177')] [2023-02-27 10:09:09,716][10610] Saving new best policy, reward=25.177! [2023-02-27 10:09:14,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3887104. Throughput: 0: 955.3. Samples: 972470. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:09:14,694][00722] Avg episode reward: [(0, '25.169')] [2023-02-27 10:09:14,763][10624] Updated weights for policy 0, policy_version 950 (0.0013) [2023-02-27 10:09:19,692][00722] Fps is (10 sec: 2868.6, 60 sec: 3891.3, 300 sec: 3790.5). Total num frames: 3903488. Throughput: 0: 944.9. Samples: 974728. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-27 10:09:19,702][00722] Avg episode reward: [(0, '26.189')] [2023-02-27 10:09:19,720][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000953_3903488.pth... [2023-02-27 10:09:19,871][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000730_2990080.pth [2023-02-27 10:09:19,882][10610] Saving new best policy, reward=26.189! [2023-02-27 10:09:24,692][00722] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3928064. Throughput: 0: 975.1. Samples: 980598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:09:24,700][00722] Avg episode reward: [(0, '26.704')] [2023-02-27 10:09:24,703][10610] Saving new best policy, reward=26.704! [2023-02-27 10:09:25,208][10624] Updated weights for policy 0, policy_version 960 (0.0018) [2023-02-27 10:09:29,692][00722] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3952640. Throughput: 0: 994.8. Samples: 987734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-27 10:09:29,695][00722] Avg episode reward: [(0, '25.390')] [2023-02-27 10:09:34,692][00722] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 3964928. Throughput: 0: 972.9. Samples: 990272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:09:34,696][00722] Avg episode reward: [(0, '24.563')] [2023-02-27 10:09:36,759][10624] Updated weights for policy 0, policy_version 970 (0.0027) [2023-02-27 10:09:39,692][00722] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3981312. Throughput: 0: 943.4. Samples: 994716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-27 10:09:39,695][00722] Avg episode reward: [(0, '23.622')] [2023-02-27 10:09:44,311][10610] Stopping Batcher_0... [2023-02-27 10:09:44,312][10610] Loop batcher_evt_loop terminating... [2023-02-27 10:09:44,313][00722] Component Batcher_0 stopped! 
[2023-02-27 10:09:44,323][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:09:44,339][10628] Stopping RolloutWorker_w3... [2023-02-27 10:09:44,340][00722] Component RolloutWorker_w3 stopped! [2023-02-27 10:09:44,349][10628] Loop rollout_proc3_evt_loop terminating... [2023-02-27 10:09:44,361][00722] Component RolloutWorker_w5 stopped! [2023-02-27 10:09:44,367][10630] Stopping RolloutWorker_w5... [2023-02-27 10:09:44,372][00722] Component RolloutWorker_w7 stopped! [2023-02-27 10:09:44,377][10632] Stopping RolloutWorker_w7... [2023-02-27 10:09:44,371][10630] Loop rollout_proc5_evt_loop terminating... [2023-02-27 10:09:44,379][10632] Loop rollout_proc7_evt_loop terminating... [2023-02-27 10:09:44,383][00722] Component RolloutWorker_w1 stopped! [2023-02-27 10:09:44,387][10626] Stopping RolloutWorker_w1... [2023-02-27 10:09:44,389][00722] Component RolloutWorker_w2 stopped! [2023-02-27 10:09:44,389][10624] Weights refcount: 2 0 [2023-02-27 10:09:44,388][10627] Stopping RolloutWorker_w2... [2023-02-27 10:09:44,394][10624] Stopping InferenceWorker_p0-w0... [2023-02-27 10:09:44,394][00722] Component InferenceWorker_p0-w0 stopped! [2023-02-27 10:09:44,388][10626] Loop rollout_proc1_evt_loop terminating... [2023-02-27 10:09:44,396][10624] Loop inference_proc0-0_evt_loop terminating... [2023-02-27 10:09:44,404][10629] Stopping RolloutWorker_w4... [2023-02-27 10:09:44,404][00722] Component RolloutWorker_w4 stopped! [2023-02-27 10:09:44,395][10627] Loop rollout_proc2_evt_loop terminating... [2023-02-27 10:09:44,412][10629] Loop rollout_proc4_evt_loop terminating... [2023-02-27 10:09:44,438][10631] Stopping RolloutWorker_w6... [2023-02-27 10:09:44,439][10625] Stopping RolloutWorker_w0... [2023-02-27 10:09:44,438][00722] Component RolloutWorker_w6 stopped! [2023-02-27 10:09:44,443][00722] Component RolloutWorker_w0 stopped! [2023-02-27 10:09:44,442][10631] Loop rollout_proc6_evt_loop terminating... 
[2023-02-27 10:09:44,446][10625] Loop rollout_proc0_evt_loop terminating... [2023-02-27 10:09:44,504][10610] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000839_3436544.pth [2023-02-27 10:09:44,516][10610] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:09:44,684][00722] Component LearnerWorker_p0 stopped! [2023-02-27 10:09:44,690][00722] Waiting for process learner_proc0 to stop... [2023-02-27 10:09:44,695][10610] Stopping LearnerWorker_p0... [2023-02-27 10:09:44,696][10610] Loop learner_proc0_evt_loop terminating... [2023-02-27 10:09:46,520][00722] Waiting for process inference_proc0-0 to join... [2023-02-27 10:09:46,823][00722] Waiting for process rollout_proc0 to join... [2023-02-27 10:09:46,825][00722] Waiting for process rollout_proc1 to join... [2023-02-27 10:09:47,236][00722] Waiting for process rollout_proc2 to join... [2023-02-27 10:09:47,237][00722] Waiting for process rollout_proc3 to join... [2023-02-27 10:09:47,238][00722] Waiting for process rollout_proc4 to join... [2023-02-27 10:09:47,239][00722] Waiting for process rollout_proc5 to join... [2023-02-27 10:09:47,255][00722] Waiting for process rollout_proc6 to join... [2023-02-27 10:09:47,256][00722] Waiting for process rollout_proc7 to join... 
[2023-02-27 10:09:47,257][00722] Batcher 0 profile tree view: batching: 25.7283, releasing_batches: 0.0243 [2023-02-27 10:09:47,258][00722] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0011 wait_policy_total: 521.7631 update_model: 7.7234 weight_update: 0.0038 one_step: 0.0034 handle_policy_step: 503.4002 deserialize: 14.3770, stack: 2.8739, obs_to_device_normalize: 112.0697, forward: 240.9010, send_messages: 26.3236 prepare_outputs: 81.2488 to_cpu: 50.3257 [2023-02-27 10:09:47,260][00722] Learner 0 profile tree view: misc: 0.0054, prepare_batch: 16.2398 train: 75.4581 epoch_init: 0.0080, minibatch_init: 0.0060, losses_postprocess: 0.6149, kl_divergence: 0.5710, after_optimizer: 33.0008 calculate_losses: 26.7244 losses_init: 0.0062, forward_head: 1.8145, bptt_initial: 17.5397, tail: 1.0086, advantages_returns: 0.3281, losses: 3.6993 bptt: 2.0449 bptt_forward_core: 1.9538 update: 13.9170 clip: 1.3814 [2023-02-27 10:09:47,266][00722] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.3092, enqueue_policy_requests: 138.8888, env_step: 810.4204, overhead: 20.0656, complete_rollouts: 6.8862 save_policy_outputs: 19.4879 split_output_tensors: 9.8395 [2023-02-27 10:09:47,268][00722] RolloutWorker_w7 profile tree view: wait_for_trajectories: 0.2851, enqueue_policy_requests: 140.8819, env_step: 809.8829, overhead: 20.2445, complete_rollouts: 6.8378 save_policy_outputs: 19.2089 split_output_tensors: 9.4714 [2023-02-27 10:09:47,270][00722] Loop Runner_EvtLoop terminating... [2023-02-27 10:09:47,272][00722] Runner profile tree view: main_loop: 1102.4991 [2023-02-27 10:09:47,274][00722] Collected {0: 4005888}, FPS: 3633.5 [2023-02-27 10:19:30,485][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:19:30,489][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:19:30,491][00722] Adding new argument 'no_render'=True that is not in the saved config file! 
[2023-02-27 10:19:30,494][00722] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-27 10:19:30,496][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:19:30,499][00722] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-27 10:19:30,501][00722] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:19:30,502][00722] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-27 10:19:30,506][00722] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-27 10:19:30,507][00722] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-27 10:19:30,509][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:19:30,512][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:19:30,513][00722] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-27 10:19:30,515][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-27 10:19:30,518][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:19:30,562][00722] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-27 10:19:30,569][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:19:30,574][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:19:30,598][00722] ConvEncoder: input_channels=3 [2023-02-27 10:19:31,370][00722] Conv encoder output size: 512 [2023-02-27 10:19:31,372][00722] Policy head output size: 512 [2023-02-27 10:19:34,260][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:19:35,526][00722] Num frames 100... 
[2023-02-27 10:19:35,638][00722] Num frames 200... [2023-02-27 10:19:35,752][00722] Num frames 300... [2023-02-27 10:19:35,883][00722] Num frames 400... [2023-02-27 10:19:36,027][00722] Avg episode rewards: #0: 7.800, true rewards: #0: 4.800 [2023-02-27 10:19:36,029][00722] Avg episode reward: 7.800, avg true_objective: 4.800 [2023-02-27 10:19:36,060][00722] Num frames 500... [2023-02-27 10:19:36,170][00722] Num frames 600... [2023-02-27 10:19:36,287][00722] Num frames 700... [2023-02-27 10:19:36,398][00722] Num frames 800... [2023-02-27 10:19:36,508][00722] Num frames 900... [2023-02-27 10:19:36,618][00722] Num frames 1000... [2023-02-27 10:19:36,731][00722] Num frames 1100... [2023-02-27 10:19:36,895][00722] Avg episode rewards: #0: 10.430, true rewards: #0: 5.930 [2023-02-27 10:19:36,897][00722] Avg episode reward: 10.430, avg true_objective: 5.930 [2023-02-27 10:19:36,918][00722] Num frames 1200... [2023-02-27 10:19:37,033][00722] Num frames 1300... [2023-02-27 10:19:37,145][00722] Num frames 1400... [2023-02-27 10:19:37,270][00722] Num frames 1500... [2023-02-27 10:19:37,391][00722] Num frames 1600... [2023-02-27 10:19:37,503][00722] Num frames 1700... [2023-02-27 10:19:37,620][00722] Num frames 1800... [2023-02-27 10:19:37,733][00722] Num frames 1900... [2023-02-27 10:19:37,853][00722] Num frames 2000... [2023-02-27 10:19:37,974][00722] Num frames 2100... [2023-02-27 10:19:38,089][00722] Num frames 2200... [2023-02-27 10:19:38,202][00722] Num frames 2300... [2023-02-27 10:19:38,322][00722] Num frames 2400... [2023-02-27 10:19:38,434][00722] Num frames 2500... [2023-02-27 10:19:38,546][00722] Num frames 2600... [2023-02-27 10:19:38,659][00722] Num frames 2700... [2023-02-27 10:19:38,771][00722] Num frames 2800... [2023-02-27 10:19:38,898][00722] Num frames 2900... [2023-02-27 10:19:39,014][00722] Num frames 3000... [2023-02-27 10:19:39,128][00722] Num frames 3100... 
[2023-02-27 10:19:39,241][00722] Avg episode rewards: #0: 26.173, true rewards: #0: 10.507 [2023-02-27 10:19:39,243][00722] Avg episode reward: 26.173, avg true_objective: 10.507 [2023-02-27 10:19:39,301][00722] Num frames 3200... [2023-02-27 10:19:39,416][00722] Num frames 3300... [2023-02-27 10:19:39,527][00722] Num frames 3400... [2023-02-27 10:19:39,637][00722] Num frames 3500... [2023-02-27 10:19:39,755][00722] Num frames 3600... [2023-02-27 10:19:39,877][00722] Num frames 3700... [2023-02-27 10:19:39,991][00722] Num frames 3800... [2023-02-27 10:19:40,078][00722] Avg episode rewards: #0: 23.060, true rewards: #0: 9.560 [2023-02-27 10:19:40,080][00722] Avg episode reward: 23.060, avg true_objective: 9.560 [2023-02-27 10:19:40,170][00722] Num frames 3900... [2023-02-27 10:19:40,281][00722] Num frames 4000... [2023-02-27 10:19:40,400][00722] Num frames 4100... [2023-02-27 10:19:40,511][00722] Num frames 4200... [2023-02-27 10:19:40,623][00722] Num frames 4300... [2023-02-27 10:19:40,739][00722] Num frames 4400... [2023-02-27 10:19:40,854][00722] Num frames 4500... [2023-02-27 10:19:40,971][00722] Num frames 4600... [2023-02-27 10:19:41,082][00722] Num frames 4700... [2023-02-27 10:19:41,199][00722] Num frames 4800... [2023-02-27 10:19:41,312][00722] Num frames 4900... [2023-02-27 10:19:41,425][00722] Num frames 5000... [2023-02-27 10:19:41,522][00722] Avg episode rewards: #0: 24.672, true rewards: #0: 10.072 [2023-02-27 10:19:41,524][00722] Avg episode reward: 24.672, avg true_objective: 10.072 [2023-02-27 10:19:41,599][00722] Num frames 5100... [2023-02-27 10:19:41,712][00722] Num frames 5200... [2023-02-27 10:19:41,826][00722] Num frames 5300... [2023-02-27 10:19:41,943][00722] Num frames 5400... [2023-02-27 10:19:42,057][00722] Num frames 5500... [2023-02-27 10:19:42,170][00722] Num frames 5600... [2023-02-27 10:19:42,285][00722] Num frames 5700... [2023-02-27 10:19:42,398][00722] Num frames 5800... [2023-02-27 10:19:42,509][00722] Num frames 5900... 
[2023-02-27 10:19:42,619][00722] Num frames 6000... [2023-02-27 10:19:42,734][00722] Num frames 6100... [2023-02-27 10:19:42,846][00722] Num frames 6200... [2023-02-27 10:19:42,967][00722] Num frames 6300... [2023-02-27 10:19:43,090][00722] Num frames 6400... [2023-02-27 10:19:43,185][00722] Avg episode rewards: #0: 25.888, true rewards: #0: 10.722 [2023-02-27 10:19:43,187][00722] Avg episode reward: 25.888, avg true_objective: 10.722 [2023-02-27 10:19:43,267][00722] Num frames 6500... [2023-02-27 10:19:43,432][00722] Num frames 6600... [2023-02-27 10:19:43,612][00722] Num frames 6700... [2023-02-27 10:19:43,783][00722] Num frames 6800... [2023-02-27 10:19:43,962][00722] Num frames 6900... [2023-02-27 10:19:44,122][00722] Num frames 7000... [2023-02-27 10:19:44,247][00722] Avg episode rewards: #0: 23.916, true rewards: #0: 10.059 [2023-02-27 10:19:44,249][00722] Avg episode reward: 23.916, avg true_objective: 10.059 [2023-02-27 10:19:44,343][00722] Num frames 7100... [2023-02-27 10:19:44,502][00722] Num frames 7200... [2023-02-27 10:19:44,659][00722] Num frames 7300... [2023-02-27 10:19:44,815][00722] Num frames 7400... [2023-02-27 10:19:44,988][00722] Num frames 7500... [2023-02-27 10:19:45,144][00722] Num frames 7600... [2023-02-27 10:19:45,313][00722] Num frames 7700... [2023-02-27 10:19:45,473][00722] Num frames 7800... [2023-02-27 10:19:45,634][00722] Num frames 7900... [2023-02-27 10:19:45,799][00722] Num frames 8000... [2023-02-27 10:19:46,014][00722] Avg episode rewards: #0: 23.496, true rewards: #0: 10.121 [2023-02-27 10:19:46,018][00722] Avg episode reward: 23.496, avg true_objective: 10.121 [2023-02-27 10:19:46,027][00722] Num frames 8100... [2023-02-27 10:19:46,197][00722] Num frames 8200... [2023-02-27 10:19:46,356][00722] Num frames 8300... [2023-02-27 10:19:46,518][00722] Num frames 8400... [2023-02-27 10:19:46,679][00722] Num frames 8500... [2023-02-27 10:19:46,839][00722] Num frames 8600... [2023-02-27 10:19:46,959][00722] Num frames 8700... 
[2023-02-27 10:19:47,039][00722] Avg episode rewards: #0: 22.244, true rewards: #0: 9.689 [2023-02-27 10:19:47,041][00722] Avg episode reward: 22.244, avg true_objective: 9.689 [2023-02-27 10:19:47,137][00722] Num frames 8800... [2023-02-27 10:19:47,258][00722] Num frames 8900... [2023-02-27 10:19:47,370][00722] Num frames 9000... [2023-02-27 10:19:47,485][00722] Num frames 9100... [2023-02-27 10:19:47,603][00722] Num frames 9200... [2023-02-27 10:19:47,716][00722] Num frames 9300... [2023-02-27 10:19:47,829][00722] Num frames 9400... [2023-02-27 10:19:47,945][00722] Num frames 9500... [2023-02-27 10:19:48,061][00722] Num frames 9600... [2023-02-27 10:19:48,173][00722] Num frames 9700... [2023-02-27 10:19:48,287][00722] Num frames 9800... [2023-02-27 10:19:48,400][00722] Num frames 9900... [2023-02-27 10:19:48,511][00722] Num frames 10000... [2023-02-27 10:19:48,624][00722] Num frames 10100... [2023-02-27 10:19:48,741][00722] Num frames 10200... [2023-02-27 10:19:48,856][00722] Num frames 10300... [2023-02-27 10:19:48,981][00722] Avg episode rewards: #0: 24.460, true rewards: #0: 10.360 [2023-02-27 10:19:48,982][00722] Avg episode reward: 24.460, avg true_objective: 10.360 [2023-02-27 10:20:50,978][00722] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-27 10:22:28,608][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:22:28,610][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:22:28,613][00722] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-27 10:22:28,615][00722] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-27 10:22:28,617][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:22:28,619][00722] Adding new argument 'video_name'=None that is not in the saved config file! 
[2023-02-27 10:22:28,620][00722] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-27 10:22:28,621][00722] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-27 10:22:28,622][00722] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-27 10:22:28,624][00722] Adding new argument 'hf_repository'='sryu1/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-27 10:22:28,625][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:22:28,626][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:22:28,627][00722] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-27 10:22:28,628][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-27 10:22:28,630][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:22:28,658][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:22:28,661][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:22:28,673][00722] ConvEncoder: input_channels=3 [2023-02-27 10:22:28,711][00722] Conv encoder output size: 512 [2023-02-27 10:22:28,712][00722] Policy head output size: 512 [2023-02-27 10:22:28,733][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:22:29,179][00722] Num frames 100... [2023-02-27 10:22:29,298][00722] Num frames 200... [2023-02-27 10:22:29,410][00722] Num frames 300... [2023-02-27 10:22:29,532][00722] Num frames 400... [2023-02-27 10:22:29,642][00722] Num frames 500... [2023-02-27 10:22:29,754][00722] Num frames 600... [2023-02-27 10:22:29,865][00722] Num frames 700... [2023-02-27 10:22:29,976][00722] Num frames 800... [2023-02-27 10:22:30,095][00722] Num frames 900... 
[2023-02-27 10:22:30,209][00722] Num frames 1000... [2023-02-27 10:22:30,315][00722] Avg episode rewards: #0: 21.380, true rewards: #0: 10.380 [2023-02-27 10:22:30,318][00722] Avg episode reward: 21.380, avg true_objective: 10.380 [2023-02-27 10:22:30,388][00722] Num frames 1100... [2023-02-27 10:22:30,520][00722] Num frames 1200... [2023-02-27 10:22:30,634][00722] Num frames 1300... [2023-02-27 10:22:30,750][00722] Num frames 1400... [2023-02-27 10:22:30,864][00722] Num frames 1500... [2023-02-27 10:22:30,979][00722] Num frames 1600... [2023-02-27 10:22:31,093][00722] Num frames 1700... [2023-02-27 10:22:31,207][00722] Num frames 1800... [2023-02-27 10:22:31,329][00722] Num frames 1900... [2023-02-27 10:22:31,447][00722] Num frames 2000... [2023-02-27 10:22:31,545][00722] Avg episode rewards: #0: 22.680, true rewards: #0: 10.180 [2023-02-27 10:22:31,548][00722] Avg episode reward: 22.680, avg true_objective: 10.180 [2023-02-27 10:22:31,621][00722] Num frames 2100... [2023-02-27 10:22:31,739][00722] Num frames 2200... [2023-02-27 10:22:31,852][00722] Num frames 2300... [2023-02-27 10:22:31,968][00722] Num frames 2400... [2023-02-27 10:22:32,090][00722] Num frames 2500... [2023-02-27 10:22:32,209][00722] Num frames 2600... [2023-02-27 10:22:32,329][00722] Num frames 2700... [2023-02-27 10:22:32,442][00722] Num frames 2800... [2023-02-27 10:22:32,555][00722] Num frames 2900... [2023-02-27 10:22:32,666][00722] Num frames 3000... [2023-02-27 10:22:32,782][00722] Num frames 3100... [2023-02-27 10:22:32,943][00722] Num frames 3200... [2023-02-27 10:22:33,109][00722] Num frames 3300... [2023-02-27 10:22:33,181][00722] Avg episode rewards: #0: 25.357, true rewards: #0: 11.023 [2023-02-27 10:22:33,183][00722] Avg episode reward: 25.357, avg true_objective: 11.023 [2023-02-27 10:22:33,337][00722] Num frames 3400... [2023-02-27 10:22:33,492][00722] Num frames 3500... [2023-02-27 10:22:33,643][00722] Num frames 3600... [2023-02-27 10:22:33,797][00722] Num frames 3700... 
[2023-02-27 10:22:33,969][00722] Num frames 3800... [2023-02-27 10:22:34,123][00722] Num frames 3900... [2023-02-27 10:22:34,282][00722] Num frames 4000... [2023-02-27 10:22:34,440][00722] Num frames 4100... [2023-02-27 10:22:34,594][00722] Num frames 4200... [2023-02-27 10:22:34,751][00722] Num frames 4300... [2023-02-27 10:22:34,913][00722] Num frames 4400... [2023-02-27 10:22:35,074][00722] Num frames 4500... [2023-02-27 10:22:35,238][00722] Num frames 4600... [2023-02-27 10:22:35,403][00722] Num frames 4700... [2023-02-27 10:22:35,566][00722] Num frames 4800... [2023-02-27 10:22:35,725][00722] Num frames 4900... [2023-02-27 10:22:35,884][00722] Num frames 5000... [2023-02-27 10:22:36,045][00722] Num frames 5100... [2023-02-27 10:22:36,208][00722] Num frames 5200... [2023-02-27 10:22:36,344][00722] Avg episode rewards: #0: 30.870, true rewards: #0: 13.120 [2023-02-27 10:22:36,347][00722] Avg episode reward: 30.870, avg true_objective: 13.120 [2023-02-27 10:22:36,412][00722] Num frames 5300... [2023-02-27 10:22:36,527][00722] Num frames 5400... [2023-02-27 10:22:36,642][00722] Num frames 5500... [2023-02-27 10:22:36,753][00722] Num frames 5600... [2023-02-27 10:22:36,872][00722] Num frames 5700... [2023-02-27 10:22:36,987][00722] Num frames 5800... [2023-02-27 10:22:37,099][00722] Num frames 5900... [2023-02-27 10:22:37,217][00722] Num frames 6000... [2023-02-27 10:22:37,331][00722] Num frames 6100... [2023-02-27 10:22:37,452][00722] Num frames 6200... [2023-02-27 10:22:37,575][00722] Num frames 6300... [2023-02-27 10:22:37,696][00722] Num frames 6400... [2023-02-27 10:22:37,808][00722] Num frames 6500... [2023-02-27 10:22:37,923][00722] Num frames 6600... [2023-02-27 10:22:38,007][00722] Avg episode rewards: #0: 31.248, true rewards: #0: 13.248 [2023-02-27 10:22:38,009][00722] Avg episode reward: 31.248, avg true_objective: 13.248 [2023-02-27 10:22:38,096][00722] Num frames 6700... [2023-02-27 10:22:38,207][00722] Num frames 6800... 
[2023-02-27 10:22:38,320][00722] Num frames 6900... [2023-02-27 10:22:38,438][00722] Num frames 7000... [2023-02-27 10:22:38,549][00722] Num frames 7100... [2023-02-27 10:22:38,666][00722] Num frames 7200... [2023-02-27 10:22:38,790][00722] Avg episode rewards: #0: 27.940, true rewards: #0: 12.107 [2023-02-27 10:22:38,792][00722] Avg episode reward: 27.940, avg true_objective: 12.107 [2023-02-27 10:22:38,836][00722] Num frames 7300... [2023-02-27 10:22:38,951][00722] Num frames 7400... [2023-02-27 10:22:39,065][00722] Num frames 7500... [2023-02-27 10:22:39,178][00722] Num frames 7600... [2023-02-27 10:22:39,300][00722] Num frames 7700... [2023-02-27 10:22:39,415][00722] Num frames 7800... [2023-02-27 10:22:39,534][00722] Num frames 7900... [2023-02-27 10:22:39,652][00722] Num frames 8000... [2023-02-27 10:22:39,772][00722] Num frames 8100... [2023-02-27 10:22:39,891][00722] Num frames 8200... [2023-02-27 10:22:40,053][00722] Avg episode rewards: #0: 27.269, true rewards: #0: 11.840 [2023-02-27 10:22:40,055][00722] Avg episode reward: 27.269, avg true_objective: 11.840 [2023-02-27 10:22:40,074][00722] Num frames 8300... [2023-02-27 10:22:40,192][00722] Num frames 8400... [2023-02-27 10:22:40,307][00722] Num frames 8500... [2023-02-27 10:22:40,421][00722] Num frames 8600... [2023-02-27 10:22:40,547][00722] Num frames 8700... [2023-02-27 10:22:40,658][00722] Num frames 8800... [2023-02-27 10:22:40,770][00722] Num frames 8900... [2023-02-27 10:22:40,884][00722] Num frames 9000... [2023-02-27 10:22:41,005][00722] Num frames 9100... [2023-02-27 10:22:41,118][00722] Num frames 9200... [2023-02-27 10:22:41,231][00722] Num frames 9300... [2023-02-27 10:22:41,352][00722] Num frames 9400... [2023-02-27 10:22:41,472][00722] Num frames 9500... [2023-02-27 10:22:41,587][00722] Num frames 9600... [2023-02-27 10:22:41,706][00722] Num frames 9700... [2023-02-27 10:22:41,819][00722] Num frames 9800... [2023-02-27 10:22:41,934][00722] Num frames 9900... 
[2023-02-27 10:22:42,051][00722] Num frames 10000... [2023-02-27 10:22:42,168][00722] Num frames 10100... [2023-02-27 10:22:42,284][00722] Num frames 10200... [2023-02-27 10:22:42,404][00722] Num frames 10300... [2023-02-27 10:22:42,563][00722] Avg episode rewards: #0: 31.485, true rewards: #0: 12.985 [2023-02-27 10:22:42,568][00722] Avg episode reward: 31.485, avg true_objective: 12.985 [2023-02-27 10:22:42,583][00722] Num frames 10400... [2023-02-27 10:22:42,694][00722] Num frames 10500... [2023-02-27 10:22:42,807][00722] Num frames 10600... [2023-02-27 10:22:42,929][00722] Num frames 10700... [2023-02-27 10:22:43,044][00722] Num frames 10800... [2023-02-27 10:22:43,157][00722] Num frames 10900... [2023-02-27 10:22:43,273][00722] Num frames 11000... [2023-02-27 10:22:43,390][00722] Num frames 11100... [2023-02-27 10:22:43,510][00722] Num frames 11200... [2023-02-27 10:22:43,630][00722] Num frames 11300... [2023-02-27 10:22:43,748][00722] Avg episode rewards: #0: 30.387, true rewards: #0: 12.609 [2023-02-27 10:22:43,750][00722] Avg episode reward: 30.387, avg true_objective: 12.609 [2023-02-27 10:22:43,816][00722] Num frames 11400... [2023-02-27 10:22:43,931][00722] Num frames 11500... [2023-02-27 10:22:44,053][00722] Num frames 11600... [2023-02-27 10:22:44,163][00722] Num frames 11700... [2023-02-27 10:22:44,276][00722] Num frames 11800... [2023-02-27 10:22:44,390][00722] Num frames 11900... [2023-02-27 10:22:44,505][00722] Num frames 12000... [2023-02-27 10:22:44,622][00722] Num frames 12100... [2023-02-27 10:22:44,697][00722] Avg episode rewards: #0: 28.816, true rewards: #0: 12.116 [2023-02-27 10:22:44,699][00722] Avg episode reward: 28.816, avg true_objective: 12.116 [2023-02-27 10:23:55,590][00722] Replay video saved to /content/train_dir/default_experiment/replay.mp4! 
[2023-02-27 10:25:20,824][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:25:20,825][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:25:20,829][00722] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-27 10:25:20,832][00722] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-27 10:25:20,834][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:25:20,835][00722] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-27 10:25:20,839][00722] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-27 10:25:20,842][00722] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-27 10:25:20,845][00722] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-27 10:25:20,847][00722] Adding new argument 'hf_repository'='sryu1/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-27 10:25:20,850][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:25:20,851][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:25:20,856][00722] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-27 10:25:20,857][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-27 10:25:20,859][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:25:20,885][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:25:20,888][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:25:20,903][00722] ConvEncoder: input_channels=3 [2023-02-27 10:25:20,940][00722] Conv encoder output size: 512 [2023-02-27 10:25:20,943][00722] Policy head output size: 512 [2023-02-27 10:25:20,962][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:25:21,408][00722] Num frames 100... [2023-02-27 10:25:21,531][00722] Num frames 200... [2023-02-27 10:25:21,648][00722] Num frames 300... [2023-02-27 10:25:21,760][00722] Num frames 400... [2023-02-27 10:25:21,871][00722] Num frames 500... [2023-02-27 10:25:21,942][00722] Avg episode rewards: #0: 8.120, true rewards: #0: 5.120 [2023-02-27 10:25:21,944][00722] Avg episode reward: 8.120, avg true_objective: 5.120 [2023-02-27 10:25:22,047][00722] Num frames 600... [2023-02-27 10:25:22,162][00722] Num frames 700... [2023-02-27 10:25:22,278][00722] Num frames 800... [2023-02-27 10:25:22,390][00722] Num frames 900... [2023-02-27 10:25:22,502][00722] Num frames 1000... [2023-02-27 10:25:22,623][00722] Num frames 1100... [2023-02-27 10:25:22,768][00722] Avg episode rewards: #0: 10.880, true rewards: #0: 5.880 [2023-02-27 10:25:22,771][00722] Avg episode reward: 10.880, avg true_objective: 5.880 [2023-02-27 10:25:22,803][00722] Num frames 1200... [2023-02-27 10:25:22,918][00722] Num frames 1300... [2023-02-27 10:25:23,031][00722] Num frames 1400... [2023-02-27 10:25:23,144][00722] Num frames 1500... [2023-02-27 10:25:23,260][00722] Num frames 1600... [2023-02-27 10:25:23,416][00722] Avg episode rewards: #0: 9.960, true rewards: #0: 5.627 [2023-02-27 10:25:23,418][00722] Avg episode reward: 9.960, avg true_objective: 5.627 [2023-02-27 10:25:23,435][00722] Num frames 1700... 
[2023-02-27 10:25:23,554][00722] Num frames 1800... [2023-02-27 10:25:23,669][00722] Num frames 1900... [2023-02-27 10:25:23,803][00722] Num frames 2000... [2023-02-27 10:25:23,970][00722] Num frames 2100... [2023-02-27 10:25:24,144][00722] Num frames 2200... [2023-02-27 10:25:24,303][00722] Num frames 2300... [2023-02-27 10:25:24,468][00722] Num frames 2400... [2023-02-27 10:25:24,627][00722] Num frames 2500... [2023-02-27 10:25:24,791][00722] Num frames 2600... [2023-02-27 10:25:24,966][00722] Num frames 2700... [2023-02-27 10:25:25,128][00722] Avg episode rewards: #0: 13.918, true rewards: #0: 6.917 [2023-02-27 10:25:25,131][00722] Avg episode reward: 13.918, avg true_objective: 6.917 [2023-02-27 10:25:25,190][00722] Num frames 2800... [2023-02-27 10:25:25,346][00722] Num frames 2900... [2023-02-27 10:25:25,502][00722] Num frames 3000... [2023-02-27 10:25:25,673][00722] Num frames 3100... [2023-02-27 10:25:25,833][00722] Num frames 3200... [2023-02-27 10:25:25,997][00722] Num frames 3300... [2023-02-27 10:25:26,162][00722] Num frames 3400... [2023-02-27 10:25:26,330][00722] Num frames 3500... [2023-02-27 10:25:26,495][00722] Num frames 3600... [2023-02-27 10:25:26,602][00722] Avg episode rewards: #0: 15.262, true rewards: #0: 7.262 [2023-02-27 10:25:26,605][00722] Avg episode reward: 15.262, avg true_objective: 7.262 [2023-02-27 10:25:26,726][00722] Num frames 3700... [2023-02-27 10:25:26,888][00722] Num frames 3800... [2023-02-27 10:25:27,052][00722] Num frames 3900... [2023-02-27 10:25:27,219][00722] Num frames 4000... [2023-02-27 10:25:27,358][00722] Num frames 4100... [2023-02-27 10:25:27,476][00722] Num frames 4200... [2023-02-27 10:25:27,587][00722] Num frames 4300... [2023-02-27 10:25:27,709][00722] Num frames 4400... [2023-02-27 10:25:27,761][00722] Avg episode rewards: #0: 15.333, true rewards: #0: 7.333 [2023-02-27 10:25:27,763][00722] Avg episode reward: 15.333, avg true_objective: 7.333 [2023-02-27 10:25:27,882][00722] Num frames 4500... 
[2023-02-27 10:25:27,995][00722] Num frames 4600... [2023-02-27 10:25:28,108][00722] Num frames 4700... [2023-02-27 10:25:28,225][00722] Num frames 4800... [2023-02-27 10:25:28,343][00722] Num frames 4900... [2023-02-27 10:25:28,454][00722] Num frames 5000... [2023-02-27 10:25:28,568][00722] Num frames 5100... [2023-02-27 10:25:28,680][00722] Num frames 5200... [2023-02-27 10:25:28,798][00722] Num frames 5300... [2023-02-27 10:25:28,916][00722] Num frames 5400... [2023-02-27 10:25:29,036][00722] Num frames 5500... [2023-02-27 10:25:29,153][00722] Num frames 5600... [2023-02-27 10:25:29,269][00722] Num frames 5700... [2023-02-27 10:25:29,396][00722] Num frames 5800... [2023-02-27 10:25:29,509][00722] Num frames 5900... [2023-02-27 10:25:29,583][00722] Avg episode rewards: #0: 17.880, true rewards: #0: 8.451 [2023-02-27 10:25:29,585][00722] Avg episode reward: 17.880, avg true_objective: 8.451 [2023-02-27 10:25:29,691][00722] Num frames 6000... [2023-02-27 10:25:29,818][00722] Num frames 6100... [2023-02-27 10:25:29,934][00722] Num frames 6200... [2023-02-27 10:25:30,044][00722] Num frames 6300... [2023-02-27 10:25:30,160][00722] Num frames 6400... [2023-02-27 10:25:30,276][00722] Num frames 6500... [2023-02-27 10:25:30,391][00722] Num frames 6600... [2023-02-27 10:25:30,508][00722] Num frames 6700... [2023-02-27 10:25:30,649][00722] Avg episode rewards: #0: 18.225, true rewards: #0: 8.475 [2023-02-27 10:25:30,651][00722] Avg episode reward: 18.225, avg true_objective: 8.475 [2023-02-27 10:25:30,677][00722] Num frames 6800... [2023-02-27 10:25:30,795][00722] Num frames 6900... [2023-02-27 10:25:30,907][00722] Num frames 7000... [2023-02-27 10:25:31,022][00722] Num frames 7100... [2023-02-27 10:25:31,134][00722] Num frames 7200... [2023-02-27 10:25:31,249][00722] Num frames 7300... [2023-02-27 10:25:31,368][00722] Num frames 7400... [2023-02-27 10:25:31,480][00722] Num frames 7500... [2023-02-27 10:25:31,595][00722] Num frames 7600... 
[2023-02-27 10:25:31,711][00722] Num frames 7700... [2023-02-27 10:25:31,833][00722] Num frames 7800... [2023-02-27 10:25:31,946][00722] Num frames 7900... [2023-02-27 10:25:32,057][00722] Num frames 8000... [2023-02-27 10:25:32,168][00722] Num frames 8100... [2023-02-27 10:25:32,296][00722] Num frames 8200... [2023-02-27 10:25:32,447][00722] Avg episode rewards: #0: 19.871, true rewards: #0: 9.204 [2023-02-27 10:25:32,449][00722] Avg episode reward: 19.871, avg true_objective: 9.204 [2023-02-27 10:25:32,471][00722] Num frames 8300... [2023-02-27 10:25:32,586][00722] Num frames 8400... [2023-02-27 10:25:32,700][00722] Num frames 8500... [2023-02-27 10:25:32,818][00722] Num frames 8600... [2023-02-27 10:25:32,939][00722] Num frames 8700... [2023-02-27 10:25:33,051][00722] Num frames 8800... [2023-02-27 10:25:33,206][00722] Avg episode rewards: #0: 19.092, true rewards: #0: 8.892 [2023-02-27 10:25:33,209][00722] Avg episode reward: 19.092, avg true_objective: 8.892 [2023-02-27 10:26:27,784][00722] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-27 10:33:25,447][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:33:25,452][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:33:25,456][00722] Adding new argument 'no_render'=False that is not in the saved config file! [2023-02-27 10:33:25,458][00722] Adding new argument 'save_video'=False that is not in the saved config file! [2023-02-27 10:33:25,462][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:33:25,463][00722] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-27 10:33:25,467][00722] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! 
[2023-02-27 10:33:25,468][00722] Adding new argument 'max_num_episodes'=1000000000.0 that is not in the saved config file! [2023-02-27 10:33:25,469][00722] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-27 10:33:25,474][00722] Adding new argument 'hf_repository'='sryu1/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-27 10:33:25,475][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:33:25,476][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:33:25,478][00722] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-27 10:33:25,479][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-27 10:33:25,481][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:33:25,521][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:33:25,524][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:33:25,543][00722] ConvEncoder: input_channels=3 [2023-02-27 10:33:25,608][00722] Conv encoder output size: 512 [2023-02-27 10:33:25,613][00722] Policy head output size: 512 [2023-02-27 10:33:25,644][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:33:56,409][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:33:56,410][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:33:56,418][00722] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-27 10:33:56,420][00722] Adding new argument 'save_video'=False that is not in the saved config file! [2023-02-27 10:33:56,422][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! 
[2023-02-27 10:33:56,424][00722] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-27 10:33:56,425][00722] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-27 10:33:56,428][00722] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-27 10:33:56,429][00722] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-27 10:33:56,432][00722] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-27 10:33:56,433][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:33:56,435][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:33:56,437][00722] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-27 10:33:56,439][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-27 10:33:56,440][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:33:56,465][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:33:56,467][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:33:56,482][00722] ConvEncoder: input_channels=3 [2023-02-27 10:33:56,518][00722] Conv encoder output size: 512 [2023-02-27 10:33:56,521][00722] Policy head output size: 512 [2023-02-27 10:33:56,539][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:33:56,973][00722] Num frames 100... [2023-02-27 10:33:57,073][00722] Num frames 200... [2023-02-27 10:33:57,168][00722] Num frames 300... [2023-02-27 10:33:57,274][00722] Num frames 400... [2023-02-27 10:33:57,372][00722] Num frames 500... [2023-02-27 10:33:57,469][00722] Num frames 600... [2023-02-27 10:33:57,581][00722] Num frames 700... 
[2023-02-27 10:33:57,679][00722] Num frames 800... [2023-02-27 10:33:57,779][00722] Num frames 900... [2023-02-27 10:33:57,877][00722] Num frames 1000... [2023-02-27 10:33:57,976][00722] Num frames 1100... [2023-02-27 10:33:58,082][00722] Num frames 1200... [2023-02-27 10:33:58,179][00722] Num frames 1300... [2023-02-27 10:33:58,282][00722] Num frames 1400... [2023-02-27 10:33:58,383][00722] Num frames 1500... [2023-02-27 10:33:58,488][00722] Avg episode rewards: #0: 36.530, true rewards: #0: 15.530 [2023-02-27 10:33:58,492][00722] Avg episode reward: 36.530, avg true_objective: 15.530 [2023-02-27 10:33:58,538][00722] Num frames 1600... [2023-02-27 10:33:58,637][00722] Num frames 1700... [2023-02-27 10:33:58,732][00722] Num frames 1800... [2023-02-27 10:33:58,826][00722] Num frames 1900... [2023-02-27 10:33:58,925][00722] Num frames 2000... [2023-02-27 10:33:59,012][00722] Avg episode rewards: #0: 21.665, true rewards: #0: 10.165 [2023-02-27 10:33:59,014][00722] Avg episode reward: 21.665, avg true_objective: 10.165 [2023-02-27 10:33:59,086][00722] Num frames 2100... [2023-02-27 10:33:59,182][00722] Num frames 2200... [2023-02-27 10:33:59,288][00722] Num frames 2300... [2023-02-27 10:33:59,382][00722] Num frames 2400... [2023-02-27 10:33:59,476][00722] Num frames 2500... [2023-02-27 10:33:59,574][00722] Num frames 2600... [2023-02-27 10:33:59,670][00722] Num frames 2700... [2023-02-27 10:33:59,770][00722] Num frames 2800... [2023-02-27 10:33:59,870][00722] Num frames 2900... [2023-02-27 10:33:59,931][00722] Avg episode rewards: #0: 20.347, true rewards: #0: 9.680 [2023-02-27 10:33:59,932][00722] Avg episode reward: 20.347, avg true_objective: 9.680 [2023-02-27 10:34:00,026][00722] Num frames 3000... [2023-02-27 10:34:00,122][00722] Num frames 3100... [2023-02-27 10:34:00,223][00722] Num frames 3200... [2023-02-27 10:34:00,323][00722] Num frames 3300... [2023-02-27 10:34:00,421][00722] Num frames 3400... [2023-02-27 10:34:00,524][00722] Num frames 3500... 
[2023-02-27 10:34:00,621][00722] Num frames 3600... [2023-02-27 10:34:00,725][00722] Num frames 3700... [2023-02-27 10:34:00,829][00722] Num frames 3800... [2023-02-27 10:34:00,926][00722] Num frames 3900... [2023-02-27 10:34:01,022][00722] Num frames 4000... [2023-02-27 10:34:01,128][00722] Num frames 4100... [2023-02-27 10:34:01,193][00722] Avg episode rewards: #0: 22.770, true rewards: #0: 10.270 [2023-02-27 10:34:01,195][00722] Avg episode reward: 22.770, avg true_objective: 10.270 [2023-02-27 10:34:01,293][00722] Num frames 4200... [2023-02-27 10:34:01,390][00722] Num frames 4300... [2023-02-27 10:34:01,486][00722] Num frames 4400... [2023-02-27 10:34:01,584][00722] Num frames 4500... [2023-02-27 10:34:01,681][00722] Num frames 4600... [2023-02-27 10:34:01,777][00722] Num frames 4700... [2023-02-27 10:34:01,911][00722] Avg episode rewards: #0: 21.560, true rewards: #0: 9.560 [2023-02-27 10:34:01,913][00722] Avg episode reward: 21.560, avg true_objective: 9.560 [2023-02-27 10:34:01,936][00722] Num frames 4800... [2023-02-27 10:34:02,038][00722] Num frames 4900... [2023-02-27 10:34:02,144][00722] Num frames 5000... [2023-02-27 10:34:02,248][00722] Num frames 5100... [2023-02-27 10:34:02,391][00722] Num frames 5200... [2023-02-27 10:34:02,527][00722] Num frames 5300... [2023-02-27 10:34:02,662][00722] Num frames 5400... [2023-02-27 10:34:02,796][00722] Num frames 5500... [2023-02-27 10:34:02,917][00722] Avg episode rewards: #0: 20.580, true rewards: #0: 9.247 [2023-02-27 10:34:02,921][00722] Avg episode reward: 20.580, avg true_objective: 9.247 [2023-02-27 10:34:02,992][00722] Num frames 5600... [2023-02-27 10:34:03,121][00722] Num frames 5700... [2023-02-27 10:34:03,255][00722] Num frames 5800... [2023-02-27 10:34:03,387][00722] Num frames 5900... [2023-02-27 10:34:03,519][00722] Num frames 6000... [2023-02-27 10:34:03,648][00722] Num frames 6100... [2023-02-27 10:34:03,776][00722] Num frames 6200... [2023-02-27 10:34:03,904][00722] Num frames 6300... 
[2023-02-27 10:34:04,032][00722] Num frames 6400... [2023-02-27 10:34:04,144][00722] Avg episode rewards: #0: 20.206, true rewards: #0: 9.206 [2023-02-27 10:34:04,146][00722] Avg episode reward: 20.206, avg true_objective: 9.206 [2023-02-27 10:34:04,221][00722] Num frames 6500... [2023-02-27 10:34:04,363][00722] Num frames 6600... [2023-02-27 10:34:04,514][00722] Num frames 6700... [2023-02-27 10:34:04,648][00722] Num frames 6800... [2023-02-27 10:34:04,784][00722] Num frames 6900... [2023-02-27 10:34:04,924][00722] Num frames 7000... [2023-02-27 10:34:05,059][00722] Num frames 7100... [2023-02-27 10:34:05,193][00722] Num frames 7200... [2023-02-27 10:34:05,331][00722] Num frames 7300... [2023-02-27 10:34:05,468][00722] Num frames 7400... [2023-02-27 10:34:05,606][00722] Num frames 7500... [2023-02-27 10:34:05,747][00722] Avg episode rewards: #0: 20.455, true rewards: #0: 9.455 [2023-02-27 10:34:05,749][00722] Avg episode reward: 20.455, avg true_objective: 9.455 [2023-02-27 10:34:05,802][00722] Num frames 7600... [2023-02-27 10:34:05,929][00722] Num frames 7700... [2023-02-27 10:34:06,026][00722] Num frames 7800... [2023-02-27 10:34:06,130][00722] Num frames 7900... [2023-02-27 10:34:06,223][00722] Num frames 8000... [2023-02-27 10:34:06,320][00722] Num frames 8100... [2023-02-27 10:34:06,415][00722] Num frames 8200... [2023-02-27 10:34:06,514][00722] Num frames 8300... [2023-02-27 10:34:06,609][00722] Num frames 8400... [2023-02-27 10:34:06,705][00722] Num frames 8500... [2023-02-27 10:34:06,812][00722] Num frames 8600... [2023-02-27 10:34:06,908][00722] Num frames 8700... [2023-02-27 10:34:07,007][00722] Num frames 8800... [2023-02-27 10:34:07,102][00722] Num frames 8900... [2023-02-27 10:34:07,209][00722] Num frames 9000... [2023-02-27 10:34:07,339][00722] Avg episode rewards: #0: 21.973, true rewards: #0: 10.084 [2023-02-27 10:34:07,342][00722] Avg episode reward: 21.973, avg true_objective: 10.084 [2023-02-27 10:34:07,368][00722] Num frames 9100... 
[2023-02-27 10:34:07,467][00722] Num frames 9200... [2023-02-27 10:34:07,571][00722] Num frames 9300... [2023-02-27 10:34:07,665][00722] Num frames 9400... [2023-02-27 10:34:07,766][00722] Num frames 9500... [2023-02-27 10:34:07,867][00722] Num frames 9600... [2023-02-27 10:34:08,016][00722] Avg episode rewards: #0: 20.996, true rewards: #0: 9.696 [2023-02-27 10:34:08,018][00722] Avg episode reward: 20.996, avg true_objective: 9.696 [2023-02-27 10:35:25,388][00722] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-27 10:35:25,390][00722] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-27 10:35:25,392][00722] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-27 10:35:25,396][00722] Adding new argument 'save_video'=False that is not in the saved config file! [2023-02-27 10:35:25,398][00722] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-27 10:35:25,399][00722] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-27 10:35:25,401][00722] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-27 10:35:25,403][00722] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-27 10:35:25,405][00722] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-27 10:35:25,406][00722] Adding new argument 'hf_repository'='sryu1/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-27 10:35:25,408][00722] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-27 10:35:25,410][00722] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-27 10:35:25,412][00722] Adding new argument 'train_script'=None that is not in the saved config file! 
[2023-02-27 10:35:25,413][00722] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-27 10:35:25,415][00722] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-27 10:35:25,441][00722] RunningMeanStd input shape: (3, 72, 128) [2023-02-27 10:35:25,443][00722] RunningMeanStd input shape: (1,) [2023-02-27 10:35:25,457][00722] ConvEncoder: input_channels=3 [2023-02-27 10:35:25,500][00722] Conv encoder output size: 512 [2023-02-27 10:35:25,502][00722] Policy head output size: 512 [2023-02-27 10:35:25,522][00722] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-27 10:35:25,950][00722] Num frames 100... [2023-02-27 10:35:26,050][00722] Num frames 200... [2023-02-27 10:35:26,150][00722] Num frames 300... [2023-02-27 10:35:26,250][00722] Num frames 400... [2023-02-27 10:35:26,343][00722] Num frames 500... [2023-02-27 10:35:26,436][00722] Num frames 600... [2023-02-27 10:35:26,500][00722] Avg episode rewards: #0: 9.080, true rewards: #0: 6.080 [2023-02-27 10:35:26,503][00722] Avg episode reward: 9.080, avg true_objective: 6.080 [2023-02-27 10:35:26,591][00722] Num frames 700... [2023-02-27 10:35:26,692][00722] Num frames 800... [2023-02-27 10:35:26,793][00722] Num frames 900... [2023-02-27 10:35:26,887][00722] Num frames 1000... [2023-02-27 10:35:26,994][00722] Num frames 1100... [2023-02-27 10:35:27,088][00722] Num frames 1200... [2023-02-27 10:35:27,191][00722] Num frames 1300... [2023-02-27 10:35:27,296][00722] Num frames 1400... [2023-02-27 10:35:27,396][00722] Num frames 1500... [2023-02-27 10:35:27,491][00722] Num frames 1600... [2023-02-27 10:35:27,581][00722] Avg episode rewards: #0: 16.160, true rewards: #0: 8.160 [2023-02-27 10:35:27,584][00722] Avg episode reward: 16.160, avg true_objective: 8.160 [2023-02-27 10:35:27,652][00722] Num frames 1700... [2023-02-27 10:35:27,746][00722] Num frames 1800... 
[2023-02-27 10:35:27,841][00722] Num frames 1900... [2023-02-27 10:35:27,935][00722] Num frames 2000... [2023-02-27 10:35:28,029][00722] Num frames 2100... [2023-02-27 10:35:28,125][00722] Num frames 2200... [2023-02-27 10:35:28,219][00722] Num frames 2300... [2023-02-27 10:35:28,330][00722] Num frames 2400... [2023-02-27 10:35:28,424][00722] Num frames 2500... [2023-02-27 10:35:28,520][00722] Num frames 2600... [2023-02-27 10:35:28,629][00722] Num frames 2700... [2023-02-27 10:35:28,724][00722] Num frames 2800... [2023-02-27 10:35:28,826][00722] Num frames 2900... [2023-02-27 10:35:28,930][00722] Num frames 3000... [2023-02-27 10:35:29,033][00722] Num frames 3100... [2023-02-27 10:35:29,097][00722] Avg episode rewards: #0: 23.350, true rewards: #0: 10.350 [2023-02-27 10:35:29,099][00722] Avg episode reward: 23.350, avg true_objective: 10.350 [2023-02-27 10:35:29,193][00722] Num frames 3200... [2023-02-27 10:35:29,296][00722] Num frames 3300... [2023-02-27 10:35:29,391][00722] Num frames 3400... [2023-02-27 10:35:29,485][00722] Num frames 3500... [2023-02-27 10:35:29,592][00722] Num frames 3600... [2023-02-27 10:35:29,691][00722] Num frames 3700... [2023-02-27 10:35:29,788][00722] Num frames 3800... [2023-02-27 10:35:29,885][00722] Num frames 3900... [2023-02-27 10:35:29,989][00722] Num frames 4000... [2023-02-27 10:35:30,047][00722] Avg episode rewards: #0: 22.253, true rewards: #0: 10.002 [2023-02-27 10:35:30,050][00722] Avg episode reward: 22.253, avg true_objective: 10.002 [2023-02-27 10:35:30,154][00722] Num frames 4100... [2023-02-27 10:35:30,251][00722] Num frames 4200... [2023-02-27 10:35:30,346][00722] Num frames 4300... [2023-02-27 10:35:30,444][00722] Num frames 4400... [2023-02-27 10:35:30,540][00722] Num frames 4500... [2023-02-27 10:35:30,664][00722] Num frames 4600... 
[2023-02-27 10:35:30,821][00722] Avg episode rewards: #0: 20.564, true rewards: #0: 9.364 [2023-02-27 10:35:30,823][00722] Avg episode reward: 20.564, avg true_objective: 9.364 [2023-02-27 10:35:30,848][00722] Num frames 4700... [2023-02-27 10:35:30,976][00722] Num frames 4800... [2023-02-27 10:35:31,129][00722] Num frames 4900... [2023-02-27 10:35:31,260][00722] Num frames 5000... [2023-02-27 10:35:31,388][00722] Num frames 5100... [2023-02-27 10:35:31,517][00722] Num frames 5200... [2023-02-27 10:35:31,660][00722] Num frames 5300... [2023-02-27 10:35:31,805][00722] Num frames 5400... [2023-02-27 10:35:31,943][00722] Num frames 5500... [2023-02-27 10:35:32,080][00722] Num frames 5600... [2023-02-27 10:35:32,208][00722] Num frames 5700... [2023-02-27 10:35:32,341][00722] Num frames 5800... [2023-02-27 10:35:32,477][00722] Num frames 5900... [2023-02-27 10:35:32,610][00722] Num frames 6000... [2023-02-27 10:35:32,747][00722] Num frames 6100... [2023-02-27 10:35:32,853][00722] Avg episode rewards: #0: 23.223, true rewards: #0: 10.223 [2023-02-27 10:35:32,855][00722] Avg episode reward: 23.223, avg true_objective: 10.223 [2023-02-27 10:35:32,948][00722] Num frames 6200... [2023-02-27 10:35:33,087][00722] Num frames 6300... [2023-02-27 10:35:33,225][00722] Num frames 6400... [2023-02-27 10:35:33,366][00722] Num frames 6500... [2023-02-27 10:35:33,499][00722] Num frames 6600... [2023-02-27 10:35:33,635][00722] Num frames 6700... [2023-02-27 10:35:33,769][00722] Num frames 6800... [2023-02-27 10:35:33,911][00722] Num frames 6900... [2023-02-27 10:35:34,048][00722] Num frames 7000... [2023-02-27 10:35:34,168][00722] Num frames 7100... [2023-02-27 10:35:34,263][00722] Num frames 7200... [2023-02-27 10:35:34,358][00722] Num frames 7300... [2023-02-27 10:35:34,451][00722] Num frames 7400... [2023-02-27 10:35:34,548][00722] Num frames 7500... [2023-02-27 10:35:34,644][00722] Num frames 7600... [2023-02-27 10:35:34,743][00722] Num frames 7700... 
[2023-02-27 10:35:34,842][00722] Num frames 7800... [2023-02-27 10:35:34,938][00722] Num frames 7900... [2023-02-27 10:35:35,033][00722] Num frames 8000... [2023-02-27 10:35:35,129][00722] Num frames 8100... [2023-02-27 10:35:35,225][00722] Num frames 8200... [2023-02-27 10:35:35,308][00722] Avg episode rewards: #0: 28.037, true rewards: #0: 11.751 [2023-02-27 10:35:35,309][00722] Avg episode reward: 28.037, avg true_objective: 11.751 [2023-02-27 10:35:35,384][00722] Num frames 8300... [2023-02-27 10:35:35,477][00722] Num frames 8400... [2023-02-27 10:35:35,571][00722] Num frames 8500... [2023-02-27 10:35:35,670][00722] Num frames 8600... [2023-02-27 10:35:35,773][00722] Num frames 8700... [2023-02-27 10:35:35,868][00722] Num frames 8800... [2023-02-27 10:35:35,965][00722] Num frames 8900... [2023-02-27 10:35:36,115][00722] Avg episode rewards: #0: 26.117, true rewards: #0: 11.242 [2023-02-27 10:35:36,117][00722] Avg episode reward: 26.117, avg true_objective: 11.242 [2023-02-27 10:35:36,129][00722] Num frames 9000... [2023-02-27 10:35:36,225][00722] Num frames 9100... [2023-02-27 10:35:36,328][00722] Num frames 9200... [2023-02-27 10:35:36,423][00722] Num frames 9300... [2023-02-27 10:35:36,519][00722] Num frames 9400... [2023-02-27 10:35:36,615][00722] Num frames 9500... [2023-02-27 10:35:36,715][00722] Num frames 9600... [2023-02-27 10:35:36,812][00722] Num frames 9700... [2023-02-27 10:35:36,910][00722] Num frames 9800... [2023-02-27 10:35:37,009][00722] Num frames 9900... [2023-02-27 10:35:37,106][00722] Num frames 10000... [2023-02-27 10:35:37,204][00722] Num frames 10100... [2023-02-27 10:35:37,305][00722] Num frames 10200... [2023-02-27 10:35:37,401][00722] Num frames 10300... [2023-02-27 10:35:37,496][00722] Num frames 10400... [2023-02-27 10:35:37,595][00722] Num frames 10500... [2023-02-27 10:35:37,704][00722] Num frames 10600... [2023-02-27 10:35:37,811][00722] Num frames 10700... [2023-02-27 10:35:37,910][00722] Num frames 10800... 
[2023-02-27 10:35:38,016][00722] Num frames 10900... [2023-02-27 10:35:38,133][00722] Num frames 11000... [2023-02-27 10:35:38,281][00722] Avg episode rewards: #0: 29.549, true rewards: #0: 12.327 [2023-02-27 10:35:38,282][00722] Avg episode reward: 29.549, avg true_objective: 12.327 [2023-02-27 10:35:38,292][00722] Num frames 11100... [2023-02-27 10:35:38,387][00722] Num frames 11200... [2023-02-27 10:35:38,487][00722] Num frames 11300... [2023-02-27 10:35:38,579][00722] Num frames 11400... [2023-02-27 10:35:38,676][00722] Num frames 11500... [2023-02-27 10:35:38,777][00722] Num frames 11600... [2023-02-27 10:35:38,871][00722] Num frames 11700... [2023-02-27 10:35:38,966][00722] Num frames 11800... [2023-02-27 10:35:39,065][00722] Num frames 11900... [2023-02-27 10:35:39,147][00722] Avg episode rewards: #0: 28.226, true rewards: #0: 11.926 [2023-02-27 10:35:39,149][00722] Avg episode reward: 28.226, avg true_objective: 11.926