diff --git "a/sf_log.txt" "b/sf_log.txt" --- "a/sf_log.txt" +++ "b/sf_log.txt" @@ -1,32 +1,39 @@ -[2023-09-25 20:15:19,439][95689] Saving configuration to ./train_atari/atari_beamrider/config.json... -[2023-09-25 20:15:19,756][95689] Rollout worker 0 uses device cpu -[2023-09-25 20:15:19,757][95689] Rollout worker 1 uses device cpu -[2023-09-25 20:15:19,757][95689] Rollout worker 2 uses device cpu -[2023-09-25 20:15:19,758][95689] Rollout worker 3 uses device cpu -[2023-09-25 20:15:19,758][95689] Rollout worker 4 uses device cpu -[2023-09-25 20:15:19,759][95689] Rollout worker 5 uses device cpu -[2023-09-25 20:15:19,759][95689] Rollout worker 6 uses device cpu -[2023-09-25 20:15:19,760][95689] Rollout worker 7 uses device cpu -[2023-09-25 20:15:19,760][95689] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 -[2023-09-25 20:15:19,806][95689] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-25 20:15:19,806][95689] InferenceWorker_p0-w0: min num requests: 1 -[2023-09-25 20:15:19,810][95689] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-25 20:15:19,810][95689] InferenceWorker_p1-w0: min num requests: 1 -[2023-09-25 20:15:19,834][95689] Starting all processes... 
-[2023-09-25 20:15:19,834][95689] Starting process learner_proc0 -[2023-09-25 20:15:21,428][95689] Starting process learner_proc1 -[2023-09-25 20:15:21,431][96647] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-25 20:15:21,431][96647] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 -[2023-09-25 20:15:21,449][96647] Num visible devices: 1 -[2023-09-25 20:15:21,465][96647] Starting seed is not provided -[2023-09-25 20:15:21,465][96647] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-25 20:15:21,465][96647] Initializing actor-critic model on device cuda:0 -[2023-09-25 20:15:21,466][96647] RunningMeanStd input shape: (4, 84, 84) -[2023-09-25 20:15:21,466][96647] RunningMeanStd input shape: (1,) -[2023-09-25 20:15:21,477][96647] ConvEncoder: input_channels=4 -[2023-09-25 20:15:21,635][96647] Conv encoder output size: 512 -[2023-09-25 20:15:21,637][96647] Created Actor Critic model with architecture: -[2023-09-25 20:15:21,637][96647] ActorCriticSharedWeights( +[2023-10-09 04:07:34,846][59242] Saving configuration to ./train_atari/atari_beamrider_APPO/config.json... 
+[2023-10-09 04:07:35,162][59242] Rollout worker 0 uses device cpu +[2023-10-09 04:07:35,163][59242] Rollout worker 1 uses device cpu +[2023-10-09 04:07:35,164][59242] Rollout worker 2 uses device cpu +[2023-10-09 04:07:35,164][59242] Rollout worker 3 uses device cpu +[2023-10-09 04:07:35,165][59242] Rollout worker 4 uses device cpu +[2023-10-09 04:07:35,165][59242] Rollout worker 5 uses device cpu +[2023-10-09 04:07:35,166][59242] Rollout worker 6 uses device cpu +[2023-10-09 04:07:35,166][59242] Rollout worker 7 uses device cpu +[2023-10-09 04:07:35,166][59242] Rollout worker 8 uses device cpu +[2023-10-09 04:07:35,167][59242] Rollout worker 9 uses device cpu +[2023-10-09 04:07:35,167][59242] Rollout worker 10 uses device cpu +[2023-10-09 04:07:35,167][59242] Rollout worker 11 uses device cpu +[2023-10-09 04:07:35,168][59242] Rollout worker 12 uses device cpu +[2023-10-09 04:07:35,168][59242] Rollout worker 13 uses device cpu +[2023-10-09 04:07:35,169][59242] Rollout worker 14 uses device cpu +[2023-10-09 04:07:35,169][59242] Rollout worker 15 uses device cpu +[2023-10-09 04:07:35,468][59242] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-09 04:07:35,469][59242] InferenceWorker_p0-w0: min num requests: 2 +[2023-10-09 04:07:35,472][59242] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-09 04:07:35,473][59242] InferenceWorker_p1-w0: min num requests: 2 +[2023-10-09 04:07:35,521][59242] Starting all processes... 
+[2023-10-09 04:07:35,521][59242] Starting process learner_proc0 +[2023-10-09 04:07:37,206][59242] Starting process learner_proc1 +[2023-10-09 04:07:37,209][59934] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-09 04:07:37,209][59934] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2023-10-09 04:07:37,227][59934] Num visible devices: 1 +[2023-10-09 04:07:37,244][59934] Setting fixed seed 1234 +[2023-10-09 04:07:37,245][59934] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-09 04:07:37,245][59934] Initializing actor-critic model on device cuda:0 +[2023-10-09 04:07:37,245][59934] RunningMeanStd input shape: (4, 84, 84) +[2023-10-09 04:07:37,246][59934] RunningMeanStd input shape: (1,) +[2023-10-09 04:07:37,257][59934] ConvEncoder: input_channels=4 +[2023-10-09 04:07:37,409][59934] Conv encoder output size: 512 +[2023-10-09 04:07:37,411][59934] Created Actor Critic model with architecture: +[2023-10-09 04:07:37,411][59934] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -67,35 +74,41 @@ (distribution_linear): Linear(in_features=512, out_features=9, bias=True) ) ) -[2023-09-25 20:15:22,224][96647] Using optimizer -[2023-09-25 20:15:22,225][96647] No checkpoints found -[2023-09-25 20:15:22,225][96647] Did not load from checkpoint, starting from scratch! -[2023-09-25 20:15:22,225][96647] Initialized policy 0 weights for model version 0 -[2023-09-25 20:15:22,226][96647] LearnerWorker_p0 finished initialization! -[2023-09-25 20:15:22,227][96647] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-25 20:15:23,053][95689] Starting all processes... 
-[2023-09-25 20:15:23,057][96710] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-25 20:15:23,057][96710] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 -[2023-09-25 20:15:23,061][95689] Starting process inference_proc0-0 -[2023-09-25 20:15:23,061][95689] Starting process inference_proc1-0 -[2023-09-25 20:15:23,061][95689] Starting process rollout_proc0 -[2023-09-25 20:15:23,075][96710] Num visible devices: 1 -[2023-09-25 20:15:23,061][95689] Starting process rollout_proc1 -[2023-09-25 20:15:23,062][95689] Starting process rollout_proc2 -[2023-09-25 20:15:23,092][96710] Starting seed is not provided -[2023-09-25 20:15:23,092][96710] Using GPUs [0] for process 1 (actually maps to GPUs [1]) -[2023-09-25 20:15:23,092][96710] Initializing actor-critic model on device cuda:0 -[2023-09-25 20:15:23,093][96710] RunningMeanStd input shape: (4, 84, 84) -[2023-09-25 20:15:23,062][95689] Starting process rollout_proc3 -[2023-09-25 20:15:23,093][96710] RunningMeanStd input shape: (1,) -[2023-09-25 20:15:23,063][95689] Starting process rollout_proc4 -[2023-09-25 20:15:23,066][95689] Starting process rollout_proc5 -[2023-09-25 20:15:23,067][95689] Starting process rollout_proc6 -[2023-09-25 20:15:23,067][95689] Starting process rollout_proc7 -[2023-09-25 20:15:23,105][96710] ConvEncoder: input_channels=4 -[2023-09-25 20:15:23,358][96710] Conv encoder output size: 512 -[2023-09-25 20:15:23,360][96710] Created Actor Critic model with architecture: -[2023-09-25 20:15:23,361][96710] ActorCriticSharedWeights( +[2023-10-09 04:07:37,966][59934] Using optimizer +[2023-10-09 04:07:37,966][59934] No checkpoints found +[2023-10-09 04:07:37,967][59934] Did not load from checkpoint, starting from scratch! +[2023-10-09 04:07:37,967][59934] Initialized policy 0 weights for model version 0 +[2023-10-09 04:07:37,968][59934] LearnerWorker_p0 finished initialization! 
+[2023-10-09 04:07:37,969][59934] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-09 04:07:38,932][59242] Starting all processes... +[2023-10-09 04:07:38,935][60003] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-09 04:07:38,935][60003] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 +[2023-10-09 04:07:38,941][59242] Starting process inference_proc0-0 +[2023-10-09 04:07:38,941][59242] Starting process inference_proc1-0 +[2023-10-09 04:07:38,941][59242] Starting process rollout_proc0 +[2023-10-09 04:07:38,954][60003] Num visible devices: 1 +[2023-10-09 04:07:38,942][59242] Starting process rollout_proc1 +[2023-10-09 04:07:38,973][60003] Setting fixed seed 1234 +[2023-10-09 04:07:38,942][59242] Starting process rollout_proc2 +[2023-10-09 04:07:38,975][60003] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-09 04:07:38,975][60003] Initializing actor-critic model on device cuda:0 +[2023-10-09 04:07:38,942][59242] Starting process rollout_proc3 +[2023-10-09 04:07:38,975][60003] RunningMeanStd input shape: (4, 84, 84) +[2023-10-09 04:07:38,976][60003] RunningMeanStd input shape: (1,) +[2023-10-09 04:07:38,943][59242] Starting process rollout_proc4 +[2023-10-09 04:07:38,945][59242] Starting process rollout_proc5 +[2023-10-09 04:07:38,947][59242] Starting process rollout_proc6 +[2023-10-09 04:07:38,948][59242] Starting process rollout_proc7 +[2023-10-09 04:07:38,988][60003] ConvEncoder: input_channels=4 +[2023-10-09 04:07:38,953][59242] Starting process rollout_proc8 +[2023-10-09 04:07:38,955][59242] Starting process rollout_proc9 +[2023-10-09 04:07:38,957][59242] Starting process rollout_proc10 +[2023-10-09 04:07:38,958][59242] Starting process rollout_proc11 +[2023-10-09 04:07:38,958][59242] Starting process rollout_proc12 +[2023-10-09 04:07:38,958][59242] Starting process rollout_proc13 +[2023-10-09 04:07:39,462][60003] Conv encoder output size: 512 +[2023-10-09 
04:07:39,465][60003] Created Actor Critic model with architecture: +[2023-10-09 04:07:39,465][60003] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -136,139 +149,26461 @@ (distribution_linear): Linear(in_features=512, out_features=9, bias=True) ) ) -[2023-09-25 20:15:23,963][96710] Using optimizer -[2023-09-25 20:15:23,964][96710] No checkpoints found -[2023-09-25 20:15:23,964][96710] Did not load from checkpoint, starting from scratch! -[2023-09-25 20:15:23,964][96710] Initialized policy 1 weights for model version 0 -[2023-09-25 20:15:23,966][96710] LearnerWorker_p1 finished initialization! -[2023-09-25 20:15:23,966][96710] Using GPUs [0] for process 1 (actually maps to GPUs [1]) -[2023-09-25 20:15:24,995][96848] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-25 20:15:24,995][96848] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 -[2023-09-25 20:15:25,013][96848] Num visible devices: 1 -[2023-09-25 20:15:25,039][96885] Worker 3 uses CPU cores [12, 13, 14, 15] -[2023-09-25 20:15:25,063][96887] Worker 5 uses CPU cores [20, 21, 22, 23] -[2023-09-25 20:15:25,085][96882] Worker 1 uses CPU cores [4, 5, 6, 7] -[2023-09-25 20:15:25,089][96849] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-25 20:15:25,089][96849] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 -[2023-09-25 20:15:25,109][96849] Num visible devices: 1 -[2023-09-25 20:15:25,110][96886] Worker 4 uses CPU cores [16, 17, 18, 19] -[2023-09-25 20:15:25,130][96888] Worker 6 uses CPU cores [24, 25, 26, 27] -[2023-09-25 20:15:25,160][96889] Worker 7 uses CPU cores [28, 29, 30, 31] -[2023-09-25 20:15:25,210][96868] Worker 0 uses CPU cores [0, 1, 2, 3] -[2023-09-25 20:15:25,262][96884] Worker 2 uses CPU cores [8, 9, 10, 11] -[2023-09-25 20:15:25,632][96848] RunningMeanStd input shape: (4, 84, 
84) -[2023-09-25 20:15:25,632][96848] RunningMeanStd input shape: (1,) -[2023-09-25 20:15:25,642][96848] ConvEncoder: input_channels=4 -[2023-09-25 20:15:25,710][95689] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-09-25 20:15:25,727][96849] RunningMeanStd input shape: (4, 84, 84) -[2023-09-25 20:15:25,728][96849] RunningMeanStd input shape: (1,) -[2023-09-25 20:15:25,739][96849] ConvEncoder: input_channels=4 -[2023-09-25 20:15:25,742][96848] Conv encoder output size: 512 -[2023-09-25 20:15:25,747][95689] Inference worker 0-0 is ready! -[2023-09-25 20:15:25,838][96849] Conv encoder output size: 512 -[2023-09-25 20:15:25,843][95689] Inference worker 1-0 is ready! -[2023-09-25 20:15:25,844][95689] All inference workers are ready! Signal rollout workers to start! -[2023-09-25 20:15:26,302][96886] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,306][96884] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,306][96889] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,307][96882] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,307][96888] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,309][96868] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,310][96885] Decorrelating experience for 0 frames... -[2023-09-25 20:15:26,312][96887] Decorrelating experience for 0 frames... -[2023-09-25 20:15:30,710][95689] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-25 20:15:35,710][95689] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 32768. Throughput: 0: 398.0, 1: 406.3. Samples: 8043. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:15:35,711][95689] Avg episode reward: [(0, '2.667'), (1, '7.000')] -[2023-09-25 20:15:39,794][95689] Heartbeat connected on Batcher_0 -[2023-09-25 20:15:39,797][95689] Heartbeat connected on LearnerWorker_p0 -[2023-09-25 20:15:39,800][95689] Heartbeat connected on Batcher_1 -[2023-09-25 20:15:39,802][95689] Heartbeat connected on LearnerWorker_p1 -[2023-09-25 20:15:39,809][95689] Heartbeat connected on InferenceWorker_p0-w0 -[2023-09-25 20:15:39,812][95689] Heartbeat connected on InferenceWorker_p1-w0 -[2023-09-25 20:15:39,813][95689] Heartbeat connected on RolloutWorker_w0 -[2023-09-25 20:15:39,818][95689] Heartbeat connected on RolloutWorker_w1 -[2023-09-25 20:15:39,819][95689] Heartbeat connected on RolloutWorker_w2 -[2023-09-25 20:15:39,822][95689] Heartbeat connected on RolloutWorker_w3 -[2023-09-25 20:15:39,826][95689] Heartbeat connected on RolloutWorker_w4 -[2023-09-25 20:15:39,827][95689] Heartbeat connected on RolloutWorker_w5 -[2023-09-25 20:15:39,830][95689] Heartbeat connected on RolloutWorker_w6 -[2023-09-25 20:15:39,835][95689] Heartbeat connected on RolloutWorker_w7 -[2023-09-25 20:15:40,710][95689] Fps is (10 sec: 5734.3, 60 sec: 4369.1, 300 sec: 4369.1). Total num frames: 65536. Throughput: 0: 414.1, 1: 419.9. Samples: 12511. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-25 20:15:40,711][95689] Avg episode reward: [(0, '2.611'), (1, '4.600')] -[2023-09-25 20:15:42,747][96848] Updated weights for policy 0, policy_version 160 (0.0018) -[2023-09-25 20:15:42,748][96849] Updated weights for policy 1, policy_version 160 (0.0017) -[2023-09-25 20:15:45,710][95689] Fps is (10 sec: 6553.6, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 98304. Throughput: 0: 563.0, 1: 563.2. Samples: 22524. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:15:45,711][95689] Avg episode reward: [(0, '2.586'), (1, '4.250')] -[2023-09-25 20:15:50,710][95689] Fps is (10 sec: 6553.6, 60 sec: 5242.9, 300 sec: 5242.9). Total num frames: 131072. Throughput: 0: 638.3, 1: 641.1. Samples: 31985. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-25 20:15:50,711][95689] Avg episode reward: [(0, '2.686'), (1, '4.310')] -[2023-09-25 20:15:55,667][96849] Updated weights for policy 1, policy_version 320 (0.0017) -[2023-09-25 20:15:55,667][96848] Updated weights for policy 0, policy_version 320 (0.0017) -[2023-09-25 20:15:55,710][95689] Fps is (10 sec: 6553.6, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 163840. Throughput: 0: 613.2, 1: 614.4. Samples: 36827. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-25 20:15:55,711][95689] Avg episode reward: [(0, '2.628'), (1, '3.974')] -[2023-09-25 20:16:00,710][95689] Fps is (10 sec: 5734.4, 60 sec: 5383.3, 300 sec: 5383.3). Total num frames: 188416. Throughput: 0: 656.8, 1: 659.3. Samples: 46063. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:16:00,711][95689] Avg episode reward: [(0, '2.648'), (1, '3.857')] -[2023-09-25 20:16:05,710][95689] Fps is (10 sec: 5734.4, 60 sec: 5529.6, 300 sec: 5529.6). Total num frames: 221184. Throughput: 0: 693.5, 1: 695.9. Samples: 55579. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-25 20:16:05,711][95689] Avg episode reward: [(0, '2.633'), (1, '3.614')] -[2023-09-25 20:16:05,869][96647] Saving new best policy, reward=2.633! -[2023-09-25 20:16:05,899][96710] Saving new best policy, reward=3.614! -[2023-09-25 20:16:08,456][96849] Updated weights for policy 1, policy_version 480 (0.0016) -[2023-09-25 20:16:08,457][96848] Updated weights for policy 0, policy_version 480 (0.0018) -[2023-09-25 20:16:10,710][95689] Fps is (10 sec: 6553.7, 60 sec: 5643.4, 300 sec: 5643.4). Total num frames: 253952. Throughput: 0: 671.6, 1: 674.6. Samples: 60581. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-25 20:16:10,711][95689] Avg episode reward: [(0, '2.697'), (1, '3.672')] -[2023-09-25 20:16:10,711][96647] Saving new best policy, reward=2.697! -[2023-09-25 20:16:10,711][96710] Saving new best policy, reward=3.672! -[2023-09-25 20:16:15,710][95689] Fps is (10 sec: 6553.6, 60 sec: 5734.4, 300 sec: 5734.4). Total num frames: 286720. Throughput: 0: 751.0, 1: 752.4. Samples: 69703. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-25 20:16:15,711][95689] Avg episode reward: [(0, '2.747'), (1, '3.533')] -[2023-09-25 20:16:15,712][96647] Saving new best policy, reward=2.747! -[2023-09-25 20:16:20,710][95689] Fps is (10 sec: 6553.4, 60 sec: 5808.9, 300 sec: 5808.9). Total num frames: 319488. Throughput: 0: 790.7, 1: 790.0. Samples: 79176. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:16:20,711][95689] Avg episode reward: [(0, '2.695'), (1, '3.565')] -[2023-09-25 20:16:21,656][96849] Updated weights for policy 1, policy_version 640 (0.0019) -[2023-09-25 20:16:21,656][96848] Updated weights for policy 0, policy_version 640 (0.0017) -[2023-09-25 20:16:25,710][95689] Fps is (10 sec: 6553.6, 60 sec: 5870.9, 300 sec: 5870.9). Total num frames: 352256. Throughput: 0: 794.9, 1: 793.0. Samples: 83968. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:16:25,711][95689] Avg episode reward: [(0, '2.814'), (1, '3.484')] -[2023-09-25 20:16:25,712][96647] Saving new best policy, reward=2.814! -[2023-09-25 20:16:30,710][95689] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 5860.4). Total num frames: 380928. Throughput: 0: 786.3, 1: 789.4. Samples: 93429. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:16:30,711][95689] Avg episode reward: [(0, '2.966'), (1, '3.390')] -[2023-09-25 20:16:30,731][96647] Saving new best policy, reward=2.966! 
-[2023-09-25 20:16:34,554][96849] Updated weights for policy 1, policy_version 800 (0.0014) -[2023-09-25 20:16:34,555][96848] Updated weights for policy 0, policy_version 800 (0.0018) -[2023-09-25 20:16:35,710][95689] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 5851.4). Total num frames: 409600. Throughput: 0: 787.2, 1: 788.1. Samples: 102873. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-25 20:16:35,711][95689] Avg episode reward: [(0, '2.927'), (1, '3.360')] -[2023-09-25 20:16:40,710][95689] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 5898.2). Total num frames: 442368. Throughput: 0: 790.2, 1: 790.4. Samples: 107958. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-25 20:16:40,711][95689] Avg episode reward: [(0, '2.860'), (1, '3.280')] -[2023-09-25 20:16:41,784][95689] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 95689], exiting... -[2023-09-25 20:16:41,784][96885] Stopping RolloutWorker_w3... -[2023-09-25 20:16:41,784][96884] Stopping RolloutWorker_w2... -[2023-09-25 20:16:41,784][96886] Stopping RolloutWorker_w4... -[2023-09-25 20:16:41,784][96882] Stopping RolloutWorker_w1... -[2023-09-25 20:16:41,785][96868] Stopping RolloutWorker_w0... -[2023-09-25 20:16:41,785][96885] Loop rollout_proc3_evt_loop terminating... -[2023-09-25 20:16:41,784][95689] Runner profile tree view: -main_loop: 81.9507 -[2023-09-25 20:16:41,785][96884] Loop rollout_proc2_evt_loop terminating... -[2023-09-25 20:16:41,785][96868] Loop rollout_proc0_evt_loop terminating... -[2023-09-25 20:16:41,785][96886] Loop rollout_proc4_evt_loop terminating... -[2023-09-25 20:16:41,785][96882] Loop rollout_proc1_evt_loop terminating... -[2023-09-25 20:16:41,785][95689] Collected {0: 225280, 1: 225280}, FPS: 5497.9 -[2023-09-25 20:16:41,785][96888] Stopping RolloutWorker_w6... -[2023-09-25 20:16:41,785][96889] Stopping RolloutWorker_w7... -[2023-09-25 20:16:41,785][96888] Loop rollout_proc6_evt_loop terminating... 
-[2023-09-25 20:16:41,785][96710] Stopping Batcher_1... -[2023-09-25 20:16:41,785][96887] Stopping RolloutWorker_w5... -[2023-09-25 20:16:41,785][96889] Loop rollout_proc7_evt_loop terminating... -[2023-09-25 20:16:41,786][96887] Loop rollout_proc5_evt_loop terminating... -[2023-09-25 20:16:41,785][96647] Stopping Batcher_0... -[2023-09-25 20:16:41,786][96710] Loop batcher_evt_loop terminating... -[2023-09-25 20:16:41,786][96647] Loop batcher_evt_loop terminating... -[2023-09-25 20:16:41,787][96710] Saving ./train_atari/atari_beamrider/checkpoint_p1/checkpoint_000000880_225280.pth... -[2023-09-25 20:16:41,787][96647] Saving ./train_atari/atari_beamrider/checkpoint_p0/checkpoint_000000880_225280.pth... -[2023-09-25 20:16:41,800][96849] Weights refcount: 2 0 -[2023-09-25 20:16:41,800][96848] Weights refcount: 2 0 -[2023-09-25 20:16:41,801][96849] Stopping InferenceWorker_p1-w0... -[2023-09-25 20:16:41,801][96848] Stopping InferenceWorker_p0-w0... -[2023-09-25 20:16:41,801][96849] Loop inference_proc1-0_evt_loop terminating... -[2023-09-25 20:16:41,801][96848] Loop inference_proc0-0_evt_loop terminating... -[2023-09-25 20:16:41,824][96710] Stopping LearnerWorker_p1... -[2023-09-25 20:16:41,824][96710] Loop learner_proc1_evt_loop terminating... -[2023-09-25 20:16:41,825][96647] Stopping LearnerWorker_p0... -[2023-09-25 20:16:41,825][96647] Loop learner_proc0_evt_loop terminating... +[2023-10-09 04:07:40,296][60003] Using optimizer +[2023-10-09 04:07:40,297][60003] No checkpoints found +[2023-10-09 04:07:40,297][60003] Did not load from checkpoint, starting from scratch! +[2023-10-09 04:07:40,297][60003] Initialized policy 1 weights for model version 0 +[2023-10-09 04:07:40,299][60003] LearnerWorker_p1 finished initialization! 
+[2023-10-09 04:07:40,299][60003] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-09 04:07:41,192][59242] Starting process rollout_proc14 +[2023-10-09 04:07:41,193][59242] Starting process rollout_proc15 +[2023-10-09 04:07:41,197][60188] Worker 10 uses CPU cores [20, 21] +[2023-10-09 04:07:41,200][60176] Worker 0 uses CPU cores [0, 1] +[2023-10-09 04:07:41,200][60186] Worker 8 uses CPU cores [16, 17] +[2023-10-09 04:07:41,240][60185] Worker 7 uses CPU cores [14, 15] +[2023-10-09 04:07:41,462][60182] Worker 6 uses CPU cores [12, 13] +[2023-10-09 04:07:41,509][60191] Worker 13 uses CPU cores [26, 27] +[2023-10-09 04:07:41,509][60187] Worker 9 uses CPU cores [18, 19] +[2023-10-09 04:07:41,555][60184] Worker 4 uses CPU cores [8, 9] +[2023-10-09 04:07:41,656][60143] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-09 04:07:41,656][60143] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-10-09 04:07:41,664][60189] Worker 11 uses CPU cores [22, 23] +[2023-10-09 04:07:41,674][60143] Num visible devices: 1 +[2023-10-09 04:07:41,732][60179] Worker 2 uses CPU cores [4, 5] +[2023-10-09 04:07:41,777][60144] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-09 04:07:41,777][60144] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 +[2023-10-09 04:07:41,796][60144] Num visible devices: 1 +[2023-10-09 04:07:41,824][60190] Worker 12 uses CPU cores [24, 25] +[2023-10-09 04:07:41,900][60178] Worker 1 uses CPU cores [2, 3] +[2023-10-09 04:07:41,928][60181] Worker 5 uses CPU cores [10, 11] +[2023-10-09 04:07:41,979][60180] Worker 3 uses CPU cores [6, 7] +[2023-10-09 04:07:42,311][60143] RunningMeanStd input shape: (4, 84, 84) +[2023-10-09 04:07:42,312][60143] RunningMeanStd input shape: (1,) +[2023-10-09 04:07:42,323][60143] ConvEncoder: input_channels=4 +[2023-10-09 04:07:42,393][60144] RunningMeanStd input shape: (4, 84, 84) +[2023-10-09 
04:07:42,394][60144] RunningMeanStd input shape: (1,) +[2023-10-09 04:07:42,404][60144] ConvEncoder: input_channels=4 +[2023-10-09 04:07:42,425][60143] Conv encoder output size: 512 +[2023-10-09 04:07:42,503][60144] Conv encoder output size: 512 +[2023-10-09 04:07:43,141][60886] Worker 14 uses CPU cores [28, 29] +[2023-10-09 04:07:43,157][59242] Inference worker 0-0 is ready! +[2023-10-09 04:07:43,158][60919] Worker 15 uses CPU cores [30, 31] +[2023-10-09 04:07:43,158][59242] Inference worker 1-0 is ready! +[2023-10-09 04:07:43,159][59242] All inference workers are ready! Signal rollout workers to start! +[2023-10-09 04:07:43,160][60179] EnvRunner 2-0 uses policy 0 +[2023-10-09 04:07:43,160][60182] EnvRunner 6-0 uses policy 0 +[2023-10-09 04:07:43,160][60190] EnvRunner 12-0 uses policy 0 +[2023-10-09 04:07:43,160][60186] EnvRunner 8-0 uses policy 0 +[2023-10-09 04:07:43,160][60189] EnvRunner 11-0 uses policy 1 +[2023-10-09 04:07:43,160][60188] EnvRunner 10-0 uses policy 0 +[2023-10-09 04:07:43,160][60191] EnvRunner 13-0 uses policy 1 +[2023-10-09 04:07:43,160][60184] EnvRunner 4-0 uses policy 0 +[2023-10-09 04:07:43,160][60180] EnvRunner 3-0 uses policy 1 +[2023-10-09 04:07:43,160][60176] EnvRunner 0-0 uses policy 0 +[2023-10-09 04:07:43,160][59242] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-09 04:07:43,160][60178] EnvRunner 1-0 uses policy 1 +[2023-10-09 04:07:43,160][60187] EnvRunner 9-0 uses policy 1 +[2023-10-09 04:07:43,160][60185] EnvRunner 7-0 uses policy 1 +[2023-10-09 04:07:43,160][60181] EnvRunner 5-0 uses policy 1 +[2023-10-09 04:07:43,352][60886] EnvRunner 14-0 uses policy 0 +[2023-10-09 04:07:43,361][60919] EnvRunner 15-0 uses policy 1 +[2023-10-09 04:07:45,455][59242] Heartbeat connected on Batcher_0 +[2023-10-09 04:07:45,458][59242] Heartbeat connected on LearnerWorker_p0 +[2023-10-09 04:07:45,461][59242] Heartbeat connected on Batcher_1 +[2023-10-09 04:07:45,464][59242] Heartbeat connected on LearnerWorker_p1 +[2023-10-09 04:07:45,473][59242] Heartbeat connected on InferenceWorker_p0-w0 +[2023-10-09 04:07:45,476][59242] Heartbeat connected on InferenceWorker_p1-w0 +[2023-10-09 04:07:45,480][59242] Heartbeat connected on RolloutWorker_w0 +[2023-10-09 04:07:45,481][59242] Heartbeat connected on RolloutWorker_w1 +[2023-10-09 04:07:45,482][59242] Heartbeat connected on RolloutWorker_w2 +[2023-10-09 04:07:45,485][59242] Heartbeat connected on RolloutWorker_w3 +[2023-10-09 04:07:45,490][59242] Heartbeat connected on RolloutWorker_w4 +[2023-10-09 04:07:45,491][59242] Heartbeat connected on RolloutWorker_w5 +[2023-10-09 04:07:45,498][59242] Heartbeat connected on RolloutWorker_w6 +[2023-10-09 04:07:45,499][59242] Heartbeat connected on RolloutWorker_w8 +[2023-10-09 04:07:45,502][59242] Heartbeat connected on RolloutWorker_w7 +[2023-10-09 04:07:45,502][59242] Heartbeat connected on RolloutWorker_w9 +[2023-10-09 04:07:45,505][59242] Heartbeat connected on RolloutWorker_w10 +[2023-10-09 04:07:45,510][59242] Heartbeat connected on RolloutWorker_w11 +[2023-10-09 04:07:45,512][59242] Heartbeat connected on RolloutWorker_w12 +[2023-10-09 04:07:45,513][59242] Heartbeat connected on RolloutWorker_w13 +[2023-10-09 04:07:45,519][59242] Heartbeat connected on RolloutWorker_w14 +[2023-10-09 
04:07:45,519][59242] Heartbeat connected on RolloutWorker_w15 +[2023-10-09 04:07:46,052][59242] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 556.1, 1: 531.9. Samples: 3146. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-09 04:07:51,052][59242] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 953.9, 1: 944.5. Samples: 14982. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-09 04:07:51,053][59242] Avg episode reward: [(0, '2.769'), (1, '2.947')] +[2023-10-09 04:07:52,993][60143] Updated weights for policy 0, policy_version 10 (0.0010) +[2023-10-09 04:07:53,365][60143] Updated weights for policy 0, policy_version 20 (0.0010) +[2023-10-09 04:07:53,488][60144] Updated weights for policy 1, policy_version 10 (0.0007) +[2023-10-09 04:07:53,736][60143] Updated weights for policy 0, policy_version 30 (0.0009) +[2023-10-09 04:07:53,853][60144] Updated weights for policy 1, policy_version 20 (0.0009) +[2023-10-09 04:07:54,215][60144] Updated weights for policy 1, policy_version 30 (0.0008) +[2023-10-09 04:07:56,052][59242] Fps is (10 sec: 6553.5, 60 sec: 5083.5, 300 sec: 5083.5). Total num frames: 65536. Throughput: 0: 1240.5, 1: 1225.6. Samples: 31792. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:07:56,053][59242] Avg episode reward: [(0, '3.161'), (1, '2.417')] +[2023-10-09 04:07:56,421][60143] Updated weights for policy 0, policy_version 40 (0.0008) +[2023-10-09 04:07:56,747][60144] Updated weights for policy 1, policy_version 40 (0.0009) +[2023-10-09 04:07:56,788][60143] Updated weights for policy 0, policy_version 50 (0.0009) +[2023-10-09 04:07:57,118][60144] Updated weights for policy 1, policy_version 50 (0.0007) +[2023-10-09 04:07:57,149][60143] Updated weights for policy 0, policy_version 60 (0.0008) +[2023-10-09 04:07:57,470][60144] Updated weights for policy 1, policy_version 60 (0.0008) +[2023-10-09 04:08:00,614][60143] Updated weights for policy 0, policy_version 70 (0.0007) +[2023-10-09 04:08:00,870][60144] Updated weights for policy 1, policy_version 70 (0.0010) +[2023-10-09 04:08:00,978][60143] Updated weights for policy 0, policy_version 80 (0.0009) +[2023-10-09 04:08:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 7325.8, 300 sec: 7325.8). Total num frames: 131072. Throughput: 0: 1473.7, 1: 1462.7. Samples: 52538. 
Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) +[2023-10-09 04:08:01,053][59242] Avg episode reward: [(0, '3.041'), (1, '2.608')] +[2023-10-09 04:08:01,226][60144] Updated weights for policy 1, policy_version 80 (0.0009) +[2023-10-09 04:08:01,350][60143] Updated weights for policy 0, policy_version 90 (0.0010) +[2023-10-09 04:08:01,586][60144] Updated weights for policy 1, policy_version 90 (0.0009) +[2023-10-09 04:08:05,109][60143] Updated weights for policy 0, policy_version 100 (0.0009) +[2023-10-09 04:08:05,193][60144] Updated weights for policy 1, policy_version 100 (0.0008) +[2023-10-09 04:08:05,473][60143] Updated weights for policy 0, policy_version 110 (0.0008) +[2023-10-09 04:08:05,550][60144] Updated weights for policy 1, policy_version 110 (0.0009) +[2023-10-09 04:08:05,844][60143] Updated weights for policy 0, policy_version 120 (0.0008) +[2023-10-09 04:08:05,921][60144] Updated weights for policy 1, policy_version 120 (0.0007) +[2023-10-09 04:08:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 8588.6, 300 sec: 8588.6). Total num frames: 196608. Throughput: 0: 1349.9, 1: 1348.2. Samples: 61764. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:08:06,053][59242] Avg episode reward: [(0, '2.970'), (1, '2.811')] +[2023-10-09 04:08:09,942][60144] Updated weights for policy 1, policy_version 130 (0.0007) +[2023-10-09 04:08:09,980][60143] Updated weights for policy 0, policy_version 130 (0.0008) +[2023-10-09 04:08:10,315][60144] Updated weights for policy 1, policy_version 140 (0.0008) +[2023-10-09 04:08:10,346][60143] Updated weights for policy 0, policy_version 140 (0.0011) +[2023-10-09 04:08:10,673][60144] Updated weights for policy 1, policy_version 150 (0.0009) +[2023-10-09 04:08:10,711][60143] Updated weights for policy 0, policy_version 150 (0.0008) +[2023-10-09 04:08:11,040][60144] Updated weights for policy 1, policy_version 160 (0.0009) +[2023-10-09 04:08:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 10573.5, 300 sec: 10573.5). 
Total num frames: 294912. Throughput: 0: 1485.3, 1: 1484.5. Samples: 82832. Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 04:08:11,053][59242] Avg episode reward: [(0, '2.933'), (1, '2.800')] +[2023-10-09 04:08:11,053][60003] Saving new best policy, reward=2.800! +[2023-10-09 04:08:11,075][59934] Saving new best policy, reward=2.933! +[2023-10-09 04:08:11,077][60143] Updated weights for policy 0, policy_version 160 (0.0009) +[2023-10-09 04:08:15,032][60144] Updated weights for policy 1, policy_version 170 (0.0007) +[2023-10-09 04:08:15,113][60143] Updated weights for policy 0, policy_version 170 (0.0008) +[2023-10-09 04:08:15,399][60144] Updated weights for policy 1, policy_version 180 (0.0008) +[2023-10-09 04:08:15,488][60143] Updated weights for policy 0, policy_version 180 (0.0008) +[2023-10-09 04:08:15,759][60144] Updated weights for policy 1, policy_version 190 (0.0007) +[2023-10-09 04:08:15,856][60143] Updated weights for policy 0, policy_version 190 (0.0007) +[2023-10-09 04:08:16,052][59242] Fps is (10 sec: 19660.4, 60 sec: 11954.8, 300 sec: 11954.8). Total num frames: 393216. Throughput: 0: 1549.2, 1: 1556.9. Samples: 102166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:08:16,053][59242] Avg episode reward: [(0, '2.950'), (1, '2.460')] +[2023-10-09 04:08:16,061][59934] Saving new best policy, reward=2.950! 
+[2023-10-09 04:08:19,569][60144] Updated weights for policy 1, policy_version 200 (0.0008) +[2023-10-09 04:08:19,783][60143] Updated weights for policy 0, policy_version 200 (0.0008) +[2023-10-09 04:08:19,926][60144] Updated weights for policy 1, policy_version 210 (0.0008) +[2023-10-09 04:08:20,143][60143] Updated weights for policy 0, policy_version 210 (0.0009) +[2023-10-09 04:08:20,286][60144] Updated weights for policy 1, policy_version 220 (0.0009) +[2023-10-09 04:08:20,514][60143] Updated weights for policy 0, policy_version 220 (0.0009) +[2023-10-09 04:08:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 12106.9, 300 sec: 12106.9). Total num frames: 458752. Throughput: 0: 1487.0, 1: 1502.7. Samples: 113284. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 04:08:21,052][59242] Avg episode reward: [(0, '2.880'), (1, '2.600')] +[2023-10-09 04:08:24,168][60144] Updated weights for policy 1, policy_version 230 (0.0008) +[2023-10-09 04:08:24,530][60144] Updated weights for policy 1, policy_version 240 (0.0008) +[2023-10-09 04:08:24,576][60143] Updated weights for policy 0, policy_version 230 (0.0009) +[2023-10-09 04:08:24,902][60144] Updated weights for policy 1, policy_version 250 (0.0009) +[2023-10-09 04:08:24,942][60143] Updated weights for policy 0, policy_version 240 (0.0009) +[2023-10-09 04:08:25,318][60143] Updated weights for policy 0, policy_version 250 (0.0007) +[2023-10-09 04:08:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 12223.5, 300 sec: 12223.5). Total num frames: 524288. Throughput: 0: 1553.5, 1: 1559.3. Samples: 133512. 
Policy #0 lag: (min: 4.0, avg: 13.9, max: 36.0) +[2023-10-09 04:08:26,053][59242] Avg episode reward: [(0, '2.870'), (1, '2.450')] +[2023-10-09 04:08:28,915][60144] Updated weights for policy 1, policy_version 260 (0.0008) +[2023-10-09 04:08:29,283][60144] Updated weights for policy 1, policy_version 270 (0.0007) +[2023-10-09 04:08:29,304][60143] Updated weights for policy 0, policy_version 260 (0.0008) +[2023-10-09 04:08:29,652][60144] Updated weights for policy 1, policy_version 280 (0.0008) +[2023-10-09 04:08:29,678][60143] Updated weights for policy 0, policy_version 270 (0.0007) +[2023-10-09 04:08:30,050][60143] Updated weights for policy 0, policy_version 280 (0.0007) +[2023-10-09 04:08:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 12315.7, 300 sec: 12315.7). Total num frames: 589824. Throughput: 0: 1653.3, 1: 1672.0. Samples: 152782. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) +[2023-10-09 04:08:31,053][59242] Avg episode reward: [(0, '2.870'), (1, '2.570')] +[2023-10-09 04:08:33,556][60144] Updated weights for policy 1, policy_version 290 (0.0007) +[2023-10-09 04:08:33,940][60144] Updated weights for policy 1, policy_version 300 (0.0008) +[2023-10-09 04:08:34,092][60143] Updated weights for policy 0, policy_version 290 (0.0010) +[2023-10-09 04:08:34,301][60144] Updated weights for policy 1, policy_version 310 (0.0009) +[2023-10-09 04:08:34,478][60143] Updated weights for policy 0, policy_version 300 (0.0008) +[2023-10-09 04:08:34,671][60144] Updated weights for policy 1, policy_version 320 (0.0009) +[2023-10-09 04:08:34,854][60143] Updated weights for policy 0, policy_version 310 (0.0008) +[2023-10-09 04:08:35,225][60143] Updated weights for policy 0, policy_version 320 (0.0009) +[2023-10-09 04:08:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 12390.6, 300 sec: 12390.6). Total num frames: 655360. Throughput: 0: 1658.0, 1: 1669.5. Samples: 164718. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 04:08:36,053][59242] Avg episode reward: [(0, '2.930'), (1, '2.480')] +[2023-10-09 04:08:38,623][60144] Updated weights for policy 1, policy_version 330 (0.0008) +[2023-10-09 04:08:38,986][60144] Updated weights for policy 1, policy_version 340 (0.0008) +[2023-10-09 04:08:39,111][60143] Updated weights for policy 0, policy_version 330 (0.0008) +[2023-10-09 04:08:39,350][60144] Updated weights for policy 1, policy_version 350 (0.0008) +[2023-10-09 04:08:39,480][60143] Updated weights for policy 0, policy_version 340 (0.0007) +[2023-10-09 04:08:39,854][60143] Updated weights for policy 0, policy_version 350 (0.0009) +[2023-10-09 04:08:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 12452.4, 300 sec: 12452.4). Total num frames: 720896. Throughput: 0: 1685.8, 1: 1692.4. Samples: 183814. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-09 04:08:41,054][59242] Avg episode reward: [(0, '3.050'), (1, '2.600')] +[2023-10-09 04:08:41,055][59934] Saving new best policy, reward=3.050! +[2023-10-09 04:08:43,395][60144] Updated weights for policy 1, policy_version 360 (0.0007) +[2023-10-09 04:08:43,771][60144] Updated weights for policy 1, policy_version 370 (0.0009) +[2023-10-09 04:08:43,957][60143] Updated weights for policy 0, policy_version 360 (0.0007) +[2023-10-09 04:08:44,132][60144] Updated weights for policy 1, policy_version 380 (0.0007) +[2023-10-09 04:08:44,326][60143] Updated weights for policy 0, policy_version 370 (0.0007) +[2023-10-09 04:08:44,688][60143] Updated weights for policy 0, policy_version 380 (0.0008) +[2023-10-09 04:08:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12504.5). Total num frames: 786432. Throughput: 0: 1676.7, 1: 1700.7. Samples: 204526. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 04:08:46,053][59242] Avg episode reward: [(0, '2.870'), (1, '2.910')] +[2023-10-09 04:08:46,061][60003] Saving new best policy, reward=2.910! 
+[2023-10-09 04:08:48,277][60144] Updated weights for policy 1, policy_version 390 (0.0009) +[2023-10-09 04:08:48,369][60143] Updated weights for policy 0, policy_version 390 (0.0008) +[2023-10-09 04:08:48,638][60144] Updated weights for policy 1, policy_version 400 (0.0008) +[2023-10-09 04:08:48,745][60143] Updated weights for policy 0, policy_version 400 (0.0008) +[2023-10-09 04:08:49,002][60144] Updated weights for policy 1, policy_version 410 (0.0008) +[2023-10-09 04:08:49,112][60143] Updated weights for policy 0, policy_version 410 (0.0009) +[2023-10-09 04:08:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 12548.9). Total num frames: 851968. Throughput: 0: 1701.6, 1: 1713.0. Samples: 215418. Policy #0 lag: (min: 22.0, avg: 30.3, max: 54.0) +[2023-10-09 04:08:51,053][59242] Avg episode reward: [(0, '2.930'), (1, '2.820')] +[2023-10-09 04:08:53,012][60144] Updated weights for policy 1, policy_version 420 (0.0009) +[2023-10-09 04:08:53,167][60143] Updated weights for policy 0, policy_version 420 (0.0009) +[2023-10-09 04:08:53,378][60144] Updated weights for policy 1, policy_version 430 (0.0007) +[2023-10-09 04:08:53,537][60143] Updated weights for policy 0, policy_version 430 (0.0008) +[2023-10-09 04:08:53,745][60144] Updated weights for policy 1, policy_version 440 (0.0007) +[2023-10-09 04:08:53,905][60143] Updated weights for policy 0, policy_version 440 (0.0008) +[2023-10-09 04:08:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 12587.2). Total num frames: 917504. Throughput: 0: 1681.7, 1: 1700.0. Samples: 235010. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:08:56,053][59242] Avg episode reward: [(0, '2.730'), (1, '2.930')] +[2023-10-09 04:08:56,055][60003] Saving new best policy, reward=2.930! 
+[2023-10-09 04:08:57,588][60144] Updated weights for policy 1, policy_version 450 (0.0009) +[2023-10-09 04:08:57,935][60143] Updated weights for policy 0, policy_version 450 (0.0009) +[2023-10-09 04:08:57,962][60144] Updated weights for policy 1, policy_version 460 (0.0008) +[2023-10-09 04:08:58,299][60143] Updated weights for policy 0, policy_version 460 (0.0008) +[2023-10-09 04:08:58,327][60144] Updated weights for policy 1, policy_version 470 (0.0007) +[2023-10-09 04:08:58,666][60143] Updated weights for policy 0, policy_version 470 (0.0009) +[2023-10-09 04:08:58,692][60144] Updated weights for policy 1, policy_version 480 (0.0007) +[2023-10-09 04:08:59,031][60143] Updated weights for policy 0, policy_version 480 (0.0010) +[2023-10-09 04:09:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 12620.6). Total num frames: 983040. Throughput: 0: 1704.9, 1: 1718.0. Samples: 256198. Policy #0 lag: (min: 1.0, avg: 8.9, max: 33.0) +[2023-10-09 04:09:01,053][59242] Avg episode reward: [(0, '2.710'), (1, '2.960')] +[2023-10-09 04:09:01,064][60003] Saving new best policy, reward=2.960! +[2023-10-09 04:09:02,821][60144] Updated weights for policy 1, policy_version 490 (0.0008) +[2023-10-09 04:09:03,184][60144] Updated weights for policy 1, policy_version 500 (0.0007) +[2023-10-09 04:09:03,356][60143] Updated weights for policy 0, policy_version 490 (0.0008) +[2023-10-09 04:09:03,553][60144] Updated weights for policy 1, policy_version 510 (0.0008) +[2023-10-09 04:09:03,716][60143] Updated weights for policy 0, policy_version 500 (0.0009) +[2023-10-09 04:09:04,084][60143] Updated weights for policy 0, policy_version 510 (0.0011) +[2023-10-09 04:09:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 12649.9). Total num frames: 1048576. Throughput: 0: 1703.1, 1: 1696.8. Samples: 266280. 
Policy #0 lag: (min: 26.0, avg: 26.1, max: 32.0) +[2023-10-09 04:09:06,053][59242] Avg episode reward: [(0, '2.880'), (1, '3.140')] +[2023-10-09 04:09:06,055][60003] Saving new best policy, reward=3.140! +[2023-10-09 04:09:07,597][60144] Updated weights for policy 1, policy_version 520 (0.0007) +[2023-10-09 04:09:07,960][60144] Updated weights for policy 1, policy_version 530 (0.0010) +[2023-10-09 04:09:08,079][60143] Updated weights for policy 0, policy_version 520 (0.0009) +[2023-10-09 04:09:08,325][60144] Updated weights for policy 1, policy_version 540 (0.0008) +[2023-10-09 04:09:08,450][60143] Updated weights for policy 0, policy_version 530 (0.0007) +[2023-10-09 04:09:08,829][60143] Updated weights for policy 0, policy_version 540 (0.0008) +[2023-10-09 04:09:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12675.9). Total num frames: 1114112. Throughput: 0: 1693.6, 1: 1708.6. Samples: 286612. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:09:11,053][59242] Avg episode reward: [(0, '3.120'), (1, '3.030')] +[2023-10-09 04:09:11,054][59934] Saving new best policy, reward=3.120! +[2023-10-09 04:09:12,130][60144] Updated weights for policy 1, policy_version 550 (0.0009) +[2023-10-09 04:09:12,502][60144] Updated weights for policy 1, policy_version 560 (0.0007) +[2023-10-09 04:09:12,724][60143] Updated weights for policy 0, policy_version 550 (0.0009) +[2023-10-09 04:09:12,859][60144] Updated weights for policy 1, policy_version 570 (0.0008) +[2023-10-09 04:09:13,093][60143] Updated weights for policy 0, policy_version 560 (0.0008) +[2023-10-09 04:09:13,461][60143] Updated weights for policy 0, policy_version 570 (0.0009) +[2023-10-09 04:09:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12699.1). Total num frames: 1179648. Throughput: 0: 1721.7, 1: 1727.1. Samples: 307980. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:09:16,053][59242] Avg episode reward: [(0, '3.100'), (1, '3.150')] +[2023-10-09 04:09:16,065][60003] Saving new best policy, reward=3.150! +[2023-10-09 04:09:16,633][60144] Updated weights for policy 1, policy_version 580 (0.0009) +[2023-10-09 04:09:16,988][60144] Updated weights for policy 1, policy_version 590 (0.0008) +[2023-10-09 04:09:17,290][60143] Updated weights for policy 0, policy_version 580 (0.0007) +[2023-10-09 04:09:17,351][60144] Updated weights for policy 1, policy_version 600 (0.0007) +[2023-10-09 04:09:17,653][60143] Updated weights for policy 0, policy_version 590 (0.0008) +[2023-10-09 04:09:18,012][60143] Updated weights for policy 0, policy_version 600 (0.0008) +[2023-10-09 04:09:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12720.0). Total num frames: 1245184. Throughput: 0: 1686.9, 1: 1703.6. Samples: 317294. Policy #0 lag: (min: 17.0, avg: 35.9, max: 49.0) +[2023-10-09 04:09:21,053][59242] Avg episode reward: [(0, '3.170'), (1, '3.000')] +[2023-10-09 04:09:21,054][59934] Saving new best policy, reward=3.170! +[2023-10-09 04:09:21,445][60144] Updated weights for policy 1, policy_version 610 (0.0008) +[2023-10-09 04:09:21,848][60144] Updated weights for policy 1, policy_version 620 (0.0009) +[2023-10-09 04:09:21,950][60143] Updated weights for policy 0, policy_version 610 (0.0009) +[2023-10-09 04:09:22,207][60144] Updated weights for policy 1, policy_version 630 (0.0008) +[2023-10-09 04:09:22,334][60143] Updated weights for policy 0, policy_version 620 (0.0010) +[2023-10-09 04:09:22,587][60144] Updated weights for policy 1, policy_version 640 (0.0007) +[2023-10-09 04:09:22,701][60143] Updated weights for policy 0, policy_version 630 (0.0009) +[2023-10-09 04:09:23,079][60143] Updated weights for policy 0, policy_version 640 (0.0007) +[2023-10-09 04:09:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12738.8). Total num frames: 1310720. 
Throughput: 0: 1707.1, 1: 1723.0. Samples: 338168. Policy #0 lag: (min: 28.0, avg: 31.1, max: 60.0) +[2023-10-09 04:09:26,053][59242] Avg episode reward: [(0, '3.040'), (1, '3.080')] +[2023-10-09 04:09:26,682][60144] Updated weights for policy 1, policy_version 650 (0.0008) +[2023-10-09 04:09:27,047][60143] Updated weights for policy 0, policy_version 650 (0.0008) +[2023-10-09 04:09:27,052][60144] Updated weights for policy 1, policy_version 660 (0.0008) +[2023-10-09 04:09:27,407][60143] Updated weights for policy 0, policy_version 660 (0.0008) +[2023-10-09 04:09:27,416][60144] Updated weights for policy 1, policy_version 670 (0.0008) +[2023-10-09 04:09:27,790][60143] Updated weights for policy 0, policy_version 670 (0.0011) +[2023-10-09 04:09:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12755.9). Total num frames: 1376256. Throughput: 0: 1719.7, 1: 1721.3. Samples: 359372. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) +[2023-10-09 04:09:31,053][59242] Avg episode reward: [(0, '3.050'), (1, '3.270')] +[2023-10-09 04:09:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000000672_688128.pth... +[2023-10-09 04:09:31,223][60144] Updated weights for policy 1, policy_version 680 (0.0008) +[2023-10-09 04:09:31,587][60144] Updated weights for policy 1, policy_version 690 (0.0008) +[2023-10-09 04:09:31,847][60143] Updated weights for policy 0, policy_version 680 (0.0009) +[2023-10-09 04:09:31,956][60144] Updated weights for policy 1, policy_version 700 (0.0007) +[2023-10-09 04:09:32,096][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000000704_720896.pth... +[2023-10-09 04:09:32,125][60003] Saving new best policy, reward=3.270! 
+[2023-10-09 04:09:32,225][60143] Updated weights for policy 0, policy_version 690 (0.0009) +[2023-10-09 04:09:32,583][60143] Updated weights for policy 0, policy_version 700 (0.0008) +[2023-10-09 04:09:36,009][60144] Updated weights for policy 1, policy_version 710 (0.0008) +[2023-10-09 04:09:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12771.4). Total num frames: 1441792. Throughput: 0: 1694.4, 1: 1709.7. Samples: 368600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:09:36,053][59242] Avg episode reward: [(0, '3.150'), (1, '3.290')] +[2023-10-09 04:09:36,371][60144] Updated weights for policy 1, policy_version 720 (0.0007) +[2023-10-09 04:09:36,581][60143] Updated weights for policy 0, policy_version 710 (0.0008) +[2023-10-09 04:09:36,738][60144] Updated weights for policy 1, policy_version 730 (0.0008) +[2023-10-09 04:09:36,949][60143] Updated weights for policy 0, policy_version 720 (0.0009) +[2023-10-09 04:09:36,955][60003] Saving new best policy, reward=3.290! +[2023-10-09 04:09:37,323][60143] Updated weights for policy 0, policy_version 730 (0.0007) +[2023-10-09 04:09:40,777][60144] Updated weights for policy 1, policy_version 740 (0.0008) +[2023-10-09 04:09:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 12785.7). Total num frames: 1507328. Throughput: 0: 1716.9, 1: 1721.8. Samples: 389750. Policy #0 lag: (min: 4.0, avg: 6.3, max: 36.0) +[2023-10-09 04:09:41,052][59242] Avg episode reward: [(0, '3.170'), (1, '3.310')] +[2023-10-09 04:09:41,143][60144] Updated weights for policy 1, policy_version 750 (0.0010) +[2023-10-09 04:09:41,300][60143] Updated weights for policy 0, policy_version 740 (0.0008) +[2023-10-09 04:09:41,514][60144] Updated weights for policy 1, policy_version 760 (0.0009) +[2023-10-09 04:09:41,673][60143] Updated weights for policy 0, policy_version 750 (0.0007) +[2023-10-09 04:09:41,808][60003] Saving new best policy, reward=3.310! 
+[2023-10-09 04:09:42,044][60143] Updated weights for policy 0, policy_version 760 (0.0008) +[2023-10-09 04:09:45,495][60144] Updated weights for policy 1, policy_version 770 (0.0008) +[2023-10-09 04:09:45,856][60144] Updated weights for policy 1, policy_version 780 (0.0007) +[2023-10-09 04:09:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12798.8). Total num frames: 1572864. Throughput: 0: 1718.2, 1: 1720.7. Samples: 410946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:09:46,053][59242] Avg episode reward: [(0, '3.360'), (1, '3.450')] +[2023-10-09 04:09:46,143][60143] Updated weights for policy 0, policy_version 770 (0.0009) +[2023-10-09 04:09:46,226][60144] Updated weights for policy 1, policy_version 790 (0.0007) +[2023-10-09 04:09:46,517][60143] Updated weights for policy 0, policy_version 780 (0.0010) +[2023-10-09 04:09:46,591][60144] Updated weights for policy 1, policy_version 800 (0.0007) +[2023-10-09 04:09:46,591][60003] Saving new best policy, reward=3.450! +[2023-10-09 04:09:46,884][60143] Updated weights for policy 0, policy_version 790 (0.0008) +[2023-10-09 04:09:47,250][59934] Saving new best policy, reward=3.360! +[2023-10-09 04:09:47,253][60143] Updated weights for policy 0, policy_version 800 (0.0008) +[2023-10-09 04:09:50,465][60144] Updated weights for policy 1, policy_version 810 (0.0007) +[2023-10-09 04:09:50,829][60144] Updated weights for policy 1, policy_version 820 (0.0008) +[2023-10-09 04:09:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12810.8). Total num frames: 1638400. Throughput: 0: 1700.0, 1: 1721.1. Samples: 420230. 
Policy #0 lag: (min: 15.0, avg: 28.0, max: 47.0) +[2023-10-09 04:09:51,053][59242] Avg episode reward: [(0, '3.350'), (1, '3.410')] +[2023-10-09 04:09:51,194][60144] Updated weights for policy 1, policy_version 830 (0.0007) +[2023-10-09 04:09:51,326][60143] Updated weights for policy 0, policy_version 810 (0.0010) +[2023-10-09 04:09:51,690][60143] Updated weights for policy 0, policy_version 820 (0.0010) +[2023-10-09 04:09:52,065][60143] Updated weights for policy 0, policy_version 830 (0.0008) +[2023-10-09 04:09:55,264][60144] Updated weights for policy 1, policy_version 840 (0.0008) +[2023-10-09 04:09:55,624][60144] Updated weights for policy 1, policy_version 850 (0.0008) +[2023-10-09 04:09:55,988][60144] Updated weights for policy 1, policy_version 860 (0.0008) +[2023-10-09 04:09:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12822.0). Total num frames: 1703936. Throughput: 0: 1714.1, 1: 1727.9. Samples: 441500. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 04:09:56,053][59242] Avg episode reward: [(0, '3.220'), (1, '3.260')] +[2023-10-09 04:09:56,090][60143] Updated weights for policy 0, policy_version 840 (0.0008) +[2023-10-09 04:09:56,464][60143] Updated weights for policy 0, policy_version 850 (0.0008) +[2023-10-09 04:09:56,833][60143] Updated weights for policy 0, policy_version 860 (0.0010) +[2023-10-09 04:09:59,890][60144] Updated weights for policy 1, policy_version 870 (0.0009) +[2023-10-09 04:10:00,246][60144] Updated weights for policy 1, policy_version 880 (0.0009) +[2023-10-09 04:10:00,611][60144] Updated weights for policy 1, policy_version 890 (0.0007) +[2023-10-09 04:10:00,765][60143] Updated weights for policy 0, policy_version 870 (0.0011) +[2023-10-09 04:10:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13069.9). Total num frames: 1802240. Throughput: 0: 1712.9, 1: 1703.5. Samples: 461718. 
Policy #0 lag: (min: 31.0, avg: 39.7, max: 40.0) +[2023-10-09 04:10:01,053][59242] Avg episode reward: [(0, '3.170'), (1, '3.290')] +[2023-10-09 04:10:01,125][60143] Updated weights for policy 0, policy_version 880 (0.0009) +[2023-10-09 04:10:01,503][60143] Updated weights for policy 0, policy_version 890 (0.0010) +[2023-10-09 04:10:04,727][60144] Updated weights for policy 1, policy_version 900 (0.0009) +[2023-10-09 04:10:05,091][60144] Updated weights for policy 1, policy_version 910 (0.0008) +[2023-10-09 04:10:05,305][60143] Updated weights for policy 0, policy_version 900 (0.0008) +[2023-10-09 04:10:05,458][60144] Updated weights for policy 1, policy_version 920 (0.0008) +[2023-10-09 04:10:05,661][60143] Updated weights for policy 0, policy_version 910 (0.0008) +[2023-10-09 04:10:06,028][60143] Updated weights for policy 0, policy_version 920 (0.0007) +[2023-10-09 04:10:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13071.3). Total num frames: 1867776. Throughput: 0: 1715.6, 1: 1720.4. Samples: 471912. Policy #0 lag: (min: 16.0, avg: 33.1, max: 48.0) +[2023-10-09 04:10:06,053][59242] Avg episode reward: [(0, '2.980'), (1, '3.050')] +[2023-10-09 04:10:09,456][60144] Updated weights for policy 1, policy_version 930 (0.0008) +[2023-10-09 04:10:09,878][60144] Updated weights for policy 1, policy_version 940 (0.0008) +[2023-10-09 04:10:10,056][60143] Updated weights for policy 0, policy_version 930 (0.0009) +[2023-10-09 04:10:10,248][60144] Updated weights for policy 1, policy_version 950 (0.0007) +[2023-10-09 04:10:10,470][60143] Updated weights for policy 0, policy_version 940 (0.0009) +[2023-10-09 04:10:10,619][60144] Updated weights for policy 1, policy_version 960 (0.0007) +[2023-10-09 04:10:10,846][60143] Updated weights for policy 0, policy_version 950 (0.0009) +[2023-10-09 04:10:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13072.5). Total num frames: 1933312. Throughput: 0: 1712.5, 1: 1725.6. Samples: 492886. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-09 04:10:11,053][59242] Avg episode reward: [(0, '2.990'), (1, '3.140')] +[2023-10-09 04:10:11,207][60143] Updated weights for policy 0, policy_version 960 (0.0009) +[2023-10-09 04:10:14,475][60144] Updated weights for policy 1, policy_version 970 (0.0007) +[2023-10-09 04:10:14,839][60144] Updated weights for policy 1, policy_version 980 (0.0008) +[2023-10-09 04:10:15,209][60144] Updated weights for policy 1, policy_version 990 (0.0009) +[2023-10-09 04:10:15,248][60143] Updated weights for policy 0, policy_version 970 (0.0008) +[2023-10-09 04:10:15,620][60143] Updated weights for policy 0, policy_version 980 (0.0009) +[2023-10-09 04:10:15,989][60143] Updated weights for policy 0, policy_version 990 (0.0008) +[2023-10-09 04:10:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13073.6). Total num frames: 1998848. Throughput: 0: 1691.4, 1: 1698.8. Samples: 511928. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 04:10:16,053][59242] Avg episode reward: [(0, '3.170'), (1, '3.130')] +[2023-10-09 04:10:19,196][60144] Updated weights for policy 1, policy_version 1000 (0.0007) +[2023-10-09 04:10:19,573][60144] Updated weights for policy 1, policy_version 1010 (0.0008) +[2023-10-09 04:10:19,912][60143] Updated weights for policy 0, policy_version 1000 (0.0009) +[2023-10-09 04:10:19,937][60144] Updated weights for policy 1, policy_version 1020 (0.0009) +[2023-10-09 04:10:20,284][60143] Updated weights for policy 0, policy_version 1010 (0.0009) +[2023-10-09 04:10:20,653][60143] Updated weights for policy 0, policy_version 1020 (0.0008) +[2023-10-09 04:10:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13282.2). Total num frames: 2097152. Throughput: 0: 1713.7, 1: 1726.1. Samples: 523392. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 04:10:21,053][59242] Avg episode reward: [(0, '3.410'), (1, '3.320')] +[2023-10-09 04:10:21,055][59934] Saving new best policy, reward=3.410! +[2023-10-09 04:10:23,796][60144] Updated weights for policy 1, policy_version 1030 (0.0008) +[2023-10-09 04:10:24,153][60144] Updated weights for policy 1, policy_version 1040 (0.0008) +[2023-10-09 04:10:24,521][60144] Updated weights for policy 1, policy_version 1050 (0.0007) +[2023-10-09 04:10:24,757][60143] Updated weights for policy 0, policy_version 1030 (0.0008) +[2023-10-09 04:10:25,140][60143] Updated weights for policy 0, policy_version 1040 (0.0009) +[2023-10-09 04:10:25,513][60143] Updated weights for policy 0, policy_version 1050 (0.0009) +[2023-10-09 04:10:26,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13276.8). Total num frames: 2162688. Throughput: 0: 1715.9, 1: 1707.1. Samples: 543784. Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) +[2023-10-09 04:10:26,053][59242] Avg episode reward: [(0, '3.330'), (1, '3.180')] +[2023-10-09 04:10:28,618][60144] Updated weights for policy 1, policy_version 1060 (0.0008) +[2023-10-09 04:10:28,983][60144] Updated weights for policy 1, policy_version 1070 (0.0010) +[2023-10-09 04:10:29,343][60144] Updated weights for policy 1, policy_version 1080 (0.0007) +[2023-10-09 04:10:29,407][60143] Updated weights for policy 0, policy_version 1060 (0.0008) +[2023-10-09 04:10:29,777][60143] Updated weights for policy 0, policy_version 1070 (0.0008) +[2023-10-09 04:10:30,158][60143] Updated weights for policy 0, policy_version 1080 (0.0010) +[2023-10-09 04:10:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13271.8). Total num frames: 2228224. Throughput: 0: 1685.2, 1: 1703.1. Samples: 563420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:10:31,053][59242] Avg episode reward: [(0, '3.420'), (1, '3.340')] +[2023-10-09 04:10:31,063][59934] Saving new best policy, reward=3.420! +[2023-10-09 04:10:33,283][60144] Updated weights for policy 1, policy_version 1090 (0.0007) +[2023-10-09 04:10:33,651][60144] Updated weights for policy 1, policy_version 1100 (0.0009) +[2023-10-09 04:10:33,948][60143] Updated weights for policy 0, policy_version 1090 (0.0009) +[2023-10-09 04:10:34,024][60144] Updated weights for policy 1, policy_version 1110 (0.0011) +[2023-10-09 04:10:34,331][60143] Updated weights for policy 0, policy_version 1100 (0.0008) +[2023-10-09 04:10:34,393][60144] Updated weights for policy 1, policy_version 1120 (0.0007) +[2023-10-09 04:10:34,697][60143] Updated weights for policy 0, policy_version 1110 (0.0007) +[2023-10-09 04:10:35,071][60143] Updated weights for policy 0, policy_version 1120 (0.0007) +[2023-10-09 04:10:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13267.0). Total num frames: 2293760. Throughput: 0: 1716.8, 1: 1721.0. Samples: 574932. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 04:10:36,053][59242] Avg episode reward: [(0, '3.460'), (1, '3.360')] +[2023-10-09 04:10:36,055][59934] Saving new best policy, reward=3.460! +[2023-10-09 04:10:38,474][60144] Updated weights for policy 1, policy_version 1130 (0.0011) +[2023-10-09 04:10:38,848][60144] Updated weights for policy 1, policy_version 1140 (0.0009) +[2023-10-09 04:10:39,064][60143] Updated weights for policy 0, policy_version 1130 (0.0007) +[2023-10-09 04:10:39,214][60144] Updated weights for policy 1, policy_version 1150 (0.0008) +[2023-10-09 04:10:39,441][60143] Updated weights for policy 0, policy_version 1140 (0.0010) +[2023-10-09 04:10:39,807][60143] Updated weights for policy 0, policy_version 1150 (0.0009) +[2023-10-09 04:10:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13262.5). 
Total num frames: 2359296. Throughput: 0: 1699.3, 1: 1694.6. Samples: 594226. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:10:41,052][59242] Avg episode reward: [(0, '3.550'), (1, '3.330')] +[2023-10-09 04:10:41,053][59934] Saving new best policy, reward=3.550! +[2023-10-09 04:10:43,089][60144] Updated weights for policy 1, policy_version 1160 (0.0008) +[2023-10-09 04:10:43,456][60144] Updated weights for policy 1, policy_version 1170 (0.0009) +[2023-10-09 04:10:43,806][60143] Updated weights for policy 0, policy_version 1160 (0.0009) +[2023-10-09 04:10:43,824][60144] Updated weights for policy 1, policy_version 1180 (0.0008) +[2023-10-09 04:10:44,182][60143] Updated weights for policy 0, policy_version 1170 (0.0008) +[2023-10-09 04:10:44,550][60143] Updated weights for policy 0, policy_version 1180 (0.0007) +[2023-10-09 04:10:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13258.3). Total num frames: 2424832. Throughput: 0: 1688.6, 1: 1722.2. Samples: 615202. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 04:10:46,053][59242] Avg episode reward: [(0, '3.270'), (1, '3.210')] +[2023-10-09 04:10:47,628][60144] Updated weights for policy 1, policy_version 1190 (0.0008) +[2023-10-09 04:10:47,999][60144] Updated weights for policy 1, policy_version 1200 (0.0009) +[2023-10-09 04:10:48,365][60144] Updated weights for policy 1, policy_version 1210 (0.0008) +[2023-10-09 04:10:48,613][60143] Updated weights for policy 0, policy_version 1190 (0.0007) +[2023-10-09 04:10:48,972][60143] Updated weights for policy 0, policy_version 1200 (0.0009) +[2023-10-09 04:10:49,350][60143] Updated weights for policy 0, policy_version 1210 (0.0008) +[2023-10-09 04:10:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13254.3). Total num frames: 2490368. Throughput: 0: 1713.8, 1: 1706.4. Samples: 625824. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 04:10:51,053][59242] Avg episode reward: [(0, '3.610'), (1, '3.340')] +[2023-10-09 04:10:51,054][59934] Saving new best policy, reward=3.610! +[2023-10-09 04:10:52,252][60144] Updated weights for policy 1, policy_version 1220 (0.0009) +[2023-10-09 04:10:52,614][60144] Updated weights for policy 1, policy_version 1230 (0.0010) +[2023-10-09 04:10:52,987][60144] Updated weights for policy 1, policy_version 1240 (0.0008) +[2023-10-09 04:10:53,307][60143] Updated weights for policy 0, policy_version 1220 (0.0009) +[2023-10-09 04:10:53,676][60143] Updated weights for policy 0, policy_version 1230 (0.0008) +[2023-10-09 04:10:54,046][60143] Updated weights for policy 0, policy_version 1240 (0.0009) +[2023-10-09 04:10:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13250.5). Total num frames: 2555904. Throughput: 0: 1691.0, 1: 1708.1. Samples: 645844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:10:56,052][59242] Avg episode reward: [(0, '3.520'), (1, '3.400')] +[2023-10-09 04:10:57,057][60144] Updated weights for policy 1, policy_version 1250 (0.0009) +[2023-10-09 04:10:57,444][60144] Updated weights for policy 1, policy_version 1260 (0.0008) +[2023-10-09 04:10:57,814][60144] Updated weights for policy 1, policy_version 1270 (0.0007) +[2023-10-09 04:10:57,845][60143] Updated weights for policy 0, policy_version 1250 (0.0009) +[2023-10-09 04:10:58,175][60144] Updated weights for policy 1, policy_version 1280 (0.0008) +[2023-10-09 04:10:58,251][60143] Updated weights for policy 0, policy_version 1260 (0.0008) +[2023-10-09 04:10:58,620][60143] Updated weights for policy 0, policy_version 1270 (0.0008) +[2023-10-09 04:10:58,993][60143] Updated weights for policy 0, policy_version 1280 (0.0007) +[2023-10-09 04:11:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13246.8). Total num frames: 2621440. Throughput: 0: 1714.6, 1: 1736.8. Samples: 667240. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:11:01,053][59242] Avg episode reward: [(0, '3.370'), (1, '3.390')] +[2023-10-09 04:11:02,062][60144] Updated weights for policy 1, policy_version 1290 (0.0009) +[2023-10-09 04:11:02,419][60144] Updated weights for policy 1, policy_version 1300 (0.0008) +[2023-10-09 04:11:02,798][60144] Updated weights for policy 1, policy_version 1310 (0.0007) +[2023-10-09 04:11:02,872][60143] Updated weights for policy 0, policy_version 1290 (0.0008) +[2023-10-09 04:11:03,247][60143] Updated weights for policy 0, policy_version 1300 (0.0009) +[2023-10-09 04:11:03,615][60143] Updated weights for policy 0, policy_version 1310 (0.0007) +[2023-10-09 04:11:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13243.4). Total num frames: 2686976. Throughput: 0: 1704.8, 1: 1708.1. Samples: 676974. Policy #0 lag: (min: 26.0, avg: 32.9, max: 58.0) +[2023-10-09 04:11:06,053][59242] Avg episode reward: [(0, '3.460'), (1, '3.530')] +[2023-10-09 04:11:06,055][60003] Saving new best policy, reward=3.530! +[2023-10-09 04:11:06,840][60144] Updated weights for policy 1, policy_version 1320 (0.0007) +[2023-10-09 04:11:07,202][60144] Updated weights for policy 1, policy_version 1330 (0.0007) +[2023-10-09 04:11:07,578][60144] Updated weights for policy 1, policy_version 1340 (0.0007) +[2023-10-09 04:11:07,608][60143] Updated weights for policy 0, policy_version 1320 (0.0007) +[2023-10-09 04:11:07,986][60143] Updated weights for policy 0, policy_version 1330 (0.0008) +[2023-10-09 04:11:08,360][60143] Updated weights for policy 0, policy_version 1340 (0.0007) +[2023-10-09 04:11:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13240.1). Total num frames: 2752512. Throughput: 0: 1692.6, 1: 1729.9. Samples: 697794. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:11:11,053][59242] Avg episode reward: [(0, '3.430'), (1, '3.750')] +[2023-10-09 04:11:11,053][60003] Saving new best policy, reward=3.750! +[2023-10-09 04:11:11,526][60144] Updated weights for policy 1, policy_version 1350 (0.0009) +[2023-10-09 04:11:11,892][60144] Updated weights for policy 1, policy_version 1360 (0.0009) +[2023-10-09 04:11:12,263][60144] Updated weights for policy 1, policy_version 1370 (0.0009) +[2023-10-09 04:11:12,402][60143] Updated weights for policy 0, policy_version 1350 (0.0008) +[2023-10-09 04:11:12,772][60143] Updated weights for policy 0, policy_version 1360 (0.0009) +[2023-10-09 04:11:13,153][60143] Updated weights for policy 0, policy_version 1370 (0.0009) +[2023-10-09 04:11:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13237.0). Total num frames: 2818048. Throughput: 0: 1724.5, 1: 1733.9. Samples: 719050. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 04:11:16,052][59242] Avg episode reward: [(0, '3.350'), (1, '3.830')] +[2023-10-09 04:11:16,281][60144] Updated weights for policy 1, policy_version 1380 (0.0008) +[2023-10-09 04:11:16,649][60144] Updated weights for policy 1, policy_version 1390 (0.0007) +[2023-10-09 04:11:16,968][60143] Updated weights for policy 0, policy_version 1380 (0.0007) +[2023-10-09 04:11:17,015][60144] Updated weights for policy 1, policy_version 1400 (0.0008) +[2023-10-09 04:11:17,299][60003] Saving new best policy, reward=3.830! +[2023-10-09 04:11:17,337][60143] Updated weights for policy 0, policy_version 1390 (0.0007) +[2023-10-09 04:11:17,700][60143] Updated weights for policy 0, policy_version 1400 (0.0009) +[2023-10-09 04:11:20,901][60144] Updated weights for policy 1, policy_version 1410 (0.0008) +[2023-10-09 04:11:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13234.0). Total num frames: 2883584. Throughput: 0: 1699.1, 1: 1711.7. Samples: 728420. 
Policy #0 lag: (min: 10.0, avg: 17.1, max: 42.0) +[2023-10-09 04:11:21,053][59242] Avg episode reward: [(0, '3.590'), (1, '3.850')] +[2023-10-09 04:11:21,272][60144] Updated weights for policy 1, policy_version 1420 (0.0008) +[2023-10-09 04:11:21,639][60144] Updated weights for policy 1, policy_version 1430 (0.0009) +[2023-10-09 04:11:21,828][60143] Updated weights for policy 0, policy_version 1410 (0.0008) +[2023-10-09 04:11:21,997][60003] Saving new best policy, reward=3.850! +[2023-10-09 04:11:22,001][60144] Updated weights for policy 1, policy_version 1440 (0.0008) +[2023-10-09 04:11:22,195][60143] Updated weights for policy 0, policy_version 1420 (0.0007) +[2023-10-09 04:11:22,570][60143] Updated weights for policy 0, policy_version 1430 (0.0009) +[2023-10-09 04:11:22,937][60143] Updated weights for policy 0, policy_version 1440 (0.0008) +[2023-10-09 04:11:25,861][60144] Updated weights for policy 1, policy_version 1450 (0.0009) +[2023-10-09 04:11:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13231.2). Total num frames: 2949120. Throughput: 0: 1718.6, 1: 1740.4. Samples: 749884. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:11:26,053][59242] Avg episode reward: [(0, '3.830'), (1, '3.760')] +[2023-10-09 04:11:26,054][59934] Saving new best policy, reward=3.830! 
+[2023-10-09 04:11:26,234][60144] Updated weights for policy 1, policy_version 1460 (0.0010) +[2023-10-09 04:11:26,605][60144] Updated weights for policy 1, policy_version 1470 (0.0010) +[2023-10-09 04:11:26,915][60143] Updated weights for policy 0, policy_version 1450 (0.0008) +[2023-10-09 04:11:27,288][60143] Updated weights for policy 0, policy_version 1460 (0.0009) +[2023-10-09 04:11:27,661][60143] Updated weights for policy 0, policy_version 1470 (0.0009) +[2023-10-09 04:11:30,547][60144] Updated weights for policy 1, policy_version 1480 (0.0009) +[2023-10-09 04:11:30,919][60144] Updated weights for policy 1, policy_version 1490 (0.0007) +[2023-10-09 04:11:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13228.5). Total num frames: 3014656. Throughput: 0: 1731.9, 1: 1727.4. Samples: 770868. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:11:31,052][59242] Avg episode reward: [(0, '3.640'), (1, '3.930')] +[2023-10-09 04:11:31,059][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth... +[2023-10-09 04:11:31,283][60144] Updated weights for policy 1, policy_version 1500 (0.0008) +[2023-10-09 04:11:31,432][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000001504_1540096.pth... +[2023-10-09 04:11:31,464][60003] Saving new best policy, reward=3.930! 
+[2023-10-09 04:11:31,570][60143] Updated weights for policy 0, policy_version 1480 (0.0009) +[2023-10-09 04:11:31,942][60143] Updated weights for policy 0, policy_version 1490 (0.0010) +[2023-10-09 04:11:32,319][60143] Updated weights for policy 0, policy_version 1500 (0.0009) +[2023-10-09 04:11:35,250][60144] Updated weights for policy 1, policy_version 1510 (0.0009) +[2023-10-09 04:11:35,607][60144] Updated weights for policy 1, policy_version 1520 (0.0008) +[2023-10-09 04:11:35,978][60144] Updated weights for policy 1, policy_version 1530 (0.0007) +[2023-10-09 04:11:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13225.9). Total num frames: 3080192. Throughput: 0: 1708.0, 1: 1736.2. Samples: 780810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:11:36,053][59242] Avg episode reward: [(0, '3.670'), (1, '3.920')] +[2023-10-09 04:11:36,422][60143] Updated weights for policy 0, policy_version 1510 (0.0008) +[2023-10-09 04:11:36,787][60143] Updated weights for policy 0, policy_version 1520 (0.0009) +[2023-10-09 04:11:37,165][60143] Updated weights for policy 0, policy_version 1530 (0.0010) +[2023-10-09 04:11:39,977][60144] Updated weights for policy 1, policy_version 1540 (0.0008) +[2023-10-09 04:11:40,344][60144] Updated weights for policy 1, policy_version 1550 (0.0008) +[2023-10-09 04:11:40,720][60144] Updated weights for policy 1, policy_version 1560 (0.0009) +[2023-10-09 04:11:41,040][60143] Updated weights for policy 0, policy_version 1540 (0.0008) +[2023-10-09 04:11:41,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13361.1). Total num frames: 3178496. Throughput: 0: 1730.3, 1: 1738.0. Samples: 801916. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:11:41,053][59242] Avg episode reward: [(0, '3.560'), (1, '3.960')] +[2023-10-09 04:11:41,054][60003] Saving new best policy, reward=3.960! 
+[2023-10-09 04:11:41,410][60143] Updated weights for policy 0, policy_version 1550 (0.0009) +[2023-10-09 04:11:41,779][60143] Updated weights for policy 0, policy_version 1560 (0.0009) +[2023-10-09 04:11:44,606][60144] Updated weights for policy 1, policy_version 1570 (0.0010) +[2023-10-09 04:11:45,011][60144] Updated weights for policy 1, policy_version 1580 (0.0008) +[2023-10-09 04:11:45,370][60144] Updated weights for policy 1, policy_version 1590 (0.0007) +[2023-10-09 04:11:45,739][60144] Updated weights for policy 1, policy_version 1600 (0.0008) +[2023-10-09 04:11:45,833][60143] Updated weights for policy 0, policy_version 1570 (0.0007) +[2023-10-09 04:11:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13355.9). Total num frames: 3244032. Throughput: 0: 1728.4, 1: 1709.2. Samples: 821930. Policy #0 lag: (min: 17.0, avg: 27.6, max: 49.0) +[2023-10-09 04:11:46,052][59242] Avg episode reward: [(0, '3.430'), (1, '3.880')] +[2023-10-09 04:11:46,248][60143] Updated weights for policy 0, policy_version 1580 (0.0007) +[2023-10-09 04:11:46,625][60143] Updated weights for policy 0, policy_version 1590 (0.0007) +[2023-10-09 04:11:46,993][60143] Updated weights for policy 0, policy_version 1600 (0.0007) +[2023-10-09 04:11:49,496][60144] Updated weights for policy 1, policy_version 1610 (0.0008) +[2023-10-09 04:11:49,857][60144] Updated weights for policy 1, policy_version 1620 (0.0007) +[2023-10-09 04:11:50,226][60144] Updated weights for policy 1, policy_version 1630 (0.0007) +[2023-10-09 04:11:50,845][60143] Updated weights for policy 0, policy_version 1610 (0.0008) +[2023-10-09 04:11:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13350.8). Total num frames: 3309568. Throughput: 0: 1711.7, 1: 1739.1. Samples: 832262. Policy #0 lag: (min: 30.0, avg: 31.6, max: 58.0) +[2023-10-09 04:11:51,053][59242] Avg episode reward: [(0, '3.770'), (1, '4.160')] +[2023-10-09 04:11:51,054][60003] Saving new best policy, reward=4.160! 
+[2023-10-09 04:11:51,215][60143] Updated weights for policy 0, policy_version 1620 (0.0007) +[2023-10-09 04:11:51,582][60143] Updated weights for policy 0, policy_version 1630 (0.0007) +[2023-10-09 04:11:54,159][60144] Updated weights for policy 1, policy_version 1640 (0.0010) +[2023-10-09 04:11:54,533][60144] Updated weights for policy 1, policy_version 1650 (0.0010) +[2023-10-09 04:11:54,901][60144] Updated weights for policy 1, policy_version 1660 (0.0009) +[2023-10-09 04:11:55,606][60143] Updated weights for policy 0, policy_version 1640 (0.0010) +[2023-10-09 04:11:55,977][60143] Updated weights for policy 0, policy_version 1650 (0.0009) +[2023-10-09 04:11:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13346.0). Total num frames: 3375104. Throughput: 0: 1721.7, 1: 1723.0. Samples: 852804. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 04:11:56,052][59242] Avg episode reward: [(0, '3.690'), (1, '4.240')] +[2023-10-09 04:11:56,053][60003] Saving new best policy, reward=4.240! +[2023-10-09 04:11:56,350][60143] Updated weights for policy 0, policy_version 1660 (0.0009) +[2023-10-09 04:11:58,995][60144] Updated weights for policy 1, policy_version 1670 (0.0007) +[2023-10-09 04:11:59,362][60144] Updated weights for policy 1, policy_version 1680 (0.0008) +[2023-10-09 04:11:59,732][60144] Updated weights for policy 1, policy_version 1690 (0.0007) +[2023-10-09 04:12:00,226][60143] Updated weights for policy 0, policy_version 1670 (0.0008) +[2023-10-09 04:12:00,600][60143] Updated weights for policy 0, policy_version 1680 (0.0011) +[2023-10-09 04:12:00,969][60143] Updated weights for policy 0, policy_version 1690 (0.0011) +[2023-10-09 04:12:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13341.4). Total num frames: 3440640. Throughput: 0: 1709.7, 1: 1713.6. Samples: 873100. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 04:12:01,052][59242] Avg episode reward: [(0, '4.100'), (1, '3.980')] +[2023-10-09 04:12:01,192][59934] Saving new best policy, reward=4.100! +[2023-10-09 04:12:03,725][60144] Updated weights for policy 1, policy_version 1700 (0.0008) +[2023-10-09 04:12:04,100][60144] Updated weights for policy 1, policy_version 1710 (0.0008) +[2023-10-09 04:12:04,468][60144] Updated weights for policy 1, policy_version 1720 (0.0010) +[2023-10-09 04:12:05,055][60143] Updated weights for policy 0, policy_version 1700 (0.0010) +[2023-10-09 04:12:05,427][60143] Updated weights for policy 0, policy_version 1710 (0.0009) +[2023-10-09 04:12:05,800][60143] Updated weights for policy 0, policy_version 1720 (0.0010) +[2023-10-09 04:12:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13337.0). Total num frames: 3506176. Throughput: 0: 1717.0, 1: 1742.3. Samples: 884088. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:12:06,053][59242] Avg episode reward: [(0, '3.980'), (1, '3.910')] +[2023-10-09 04:12:08,360][60144] Updated weights for policy 1, policy_version 1730 (0.0012) +[2023-10-09 04:12:08,735][60144] Updated weights for policy 1, policy_version 1740 (0.0010) +[2023-10-09 04:12:09,097][60144] Updated weights for policy 1, policy_version 1750 (0.0011) +[2023-10-09 04:12:09,468][60144] Updated weights for policy 1, policy_version 1760 (0.0009) +[2023-10-09 04:12:09,833][60143] Updated weights for policy 0, policy_version 1730 (0.0009) +[2023-10-09 04:12:10,209][60143] Updated weights for policy 0, policy_version 1740 (0.0010) +[2023-10-09 04:12:10,582][60143] Updated weights for policy 0, policy_version 1750 (0.0007) +[2023-10-09 04:12:10,959][60143] Updated weights for policy 0, policy_version 1760 (0.0008) +[2023-10-09 04:12:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13455.0). Total num frames: 3604480. Throughput: 0: 1716.2, 1: 1713.7. Samples: 904230. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:12:11,053][59242] Avg episode reward: [(0, '4.140'), (1, '3.970')] +[2023-10-09 04:12:11,053][59934] Saving new best policy, reward=4.140! +[2023-10-09 04:12:13,418][60144] Updated weights for policy 1, policy_version 1770 (0.0009) +[2023-10-09 04:12:13,788][60144] Updated weights for policy 1, policy_version 1780 (0.0008) +[2023-10-09 04:12:14,165][60144] Updated weights for policy 1, policy_version 1790 (0.0007) +[2023-10-09 04:12:14,856][60143] Updated weights for policy 0, policy_version 1770 (0.0007) +[2023-10-09 04:12:15,233][60143] Updated weights for policy 0, policy_version 1780 (0.0010) +[2023-10-09 04:12:15,603][60143] Updated weights for policy 0, policy_version 1790 (0.0008) +[2023-10-09 04:12:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13448.6). Total num frames: 3670016. Throughput: 0: 1692.2, 1: 1725.1. Samples: 924644. Policy #0 lag: (min: 8.0, avg: 29.6, max: 40.0) +[2023-10-09 04:12:16,053][59242] Avg episode reward: [(0, '4.000'), (1, '3.860')] +[2023-10-09 04:12:17,995][60144] Updated weights for policy 1, policy_version 1800 (0.0010) +[2023-10-09 04:12:18,362][60144] Updated weights for policy 1, policy_version 1810 (0.0010) +[2023-10-09 04:12:18,737][60144] Updated weights for policy 1, policy_version 1820 (0.0010) +[2023-10-09 04:12:19,579][60143] Updated weights for policy 0, policy_version 1800 (0.0009) +[2023-10-09 04:12:19,946][60143] Updated weights for policy 0, policy_version 1810 (0.0008) +[2023-10-09 04:12:20,322][60143] Updated weights for policy 0, policy_version 1820 (0.0007) +[2023-10-09 04:12:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13442.5). Total num frames: 3735552. Throughput: 0: 1712.0, 1: 1724.3. Samples: 935442. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 04:12:21,053][59242] Avg episode reward: [(0, '3.980'), (1, '4.180')] +[2023-10-09 04:12:22,643][60144] Updated weights for policy 1, policy_version 1830 (0.0009) +[2023-10-09 04:12:23,008][60144] Updated weights for policy 1, policy_version 1840 (0.0008) +[2023-10-09 04:12:23,376][60144] Updated weights for policy 1, policy_version 1850 (0.0008) +[2023-10-09 04:12:24,370][60143] Updated weights for policy 0, policy_version 1830 (0.0008) +[2023-10-09 04:12:24,753][60143] Updated weights for policy 0, policy_version 1840 (0.0009) +[2023-10-09 04:12:25,128][60143] Updated weights for policy 0, policy_version 1850 (0.0010) +[2023-10-09 04:12:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13436.5). Total num frames: 3801088. Throughput: 0: 1711.3, 1: 1719.2. Samples: 956290. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:12:26,053][59242] Avg episode reward: [(0, '3.940'), (1, '4.440')] +[2023-10-09 04:12:26,054][60003] Saving new best policy, reward=4.440! +[2023-10-09 04:12:27,348][60144] Updated weights for policy 1, policy_version 1860 (0.0009) +[2023-10-09 04:12:27,718][60144] Updated weights for policy 1, policy_version 1870 (0.0008) +[2023-10-09 04:12:28,091][60144] Updated weights for policy 1, policy_version 1880 (0.0009) +[2023-10-09 04:12:29,210][60143] Updated weights for policy 0, policy_version 1860 (0.0010) +[2023-10-09 04:12:29,582][60143] Updated weights for policy 0, policy_version 1870 (0.0010) +[2023-10-09 04:12:29,964][60143] Updated weights for policy 0, policy_version 1880 (0.0010) +[2023-10-09 04:12:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13430.8). Total num frames: 3866624. Throughput: 0: 1690.9, 1: 1745.1. Samples: 976552. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 48.0) +[2023-10-09 04:12:31,053][59242] Avg episode reward: [(0, '4.060'), (1, '4.770')] +[2023-10-09 04:12:31,062][60003] Saving new best policy, reward=4.770! +[2023-10-09 04:12:31,959][60144] Updated weights for policy 1, policy_version 1890 (0.0010) +[2023-10-09 04:12:32,367][60144] Updated weights for policy 1, policy_version 1900 (0.0010) +[2023-10-09 04:12:32,744][60144] Updated weights for policy 1, policy_version 1910 (0.0009) +[2023-10-09 04:12:33,109][60144] Updated weights for policy 1, policy_version 1920 (0.0008) +[2023-10-09 04:12:33,980][60143] Updated weights for policy 0, policy_version 1890 (0.0011) +[2023-10-09 04:12:34,384][60143] Updated weights for policy 0, policy_version 1900 (0.0010) +[2023-10-09 04:12:34,759][60143] Updated weights for policy 0, policy_version 1910 (0.0008) +[2023-10-09 04:12:35,138][60143] Updated weights for policy 0, policy_version 1920 (0.0007) +[2023-10-09 04:12:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13425.3). Total num frames: 3932160. Throughput: 0: 1730.5, 1: 1714.8. Samples: 987304. Policy #0 lag: (min: 31.0, avg: 41.7, max: 63.0) +[2023-10-09 04:12:36,053][59242] Avg episode reward: [(0, '4.020'), (1, '4.780')] +[2023-10-09 04:12:36,054][60003] Saving new best policy, reward=4.780! +[2023-10-09 04:12:37,021][60144] Updated weights for policy 1, policy_version 1930 (0.0007) +[2023-10-09 04:12:37,386][60144] Updated weights for policy 1, policy_version 1940 (0.0009) +[2023-10-09 04:12:37,757][60144] Updated weights for policy 1, policy_version 1950 (0.0007) +[2023-10-09 04:12:39,018][60143] Updated weights for policy 0, policy_version 1930 (0.0010) +[2023-10-09 04:12:39,396][60143] Updated weights for policy 0, policy_version 1940 (0.0009) +[2023-10-09 04:12:39,758][60143] Updated weights for policy 0, policy_version 1950 (0.0009) +[2023-10-09 04:12:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). 
Total num frames: 3997696. Throughput: 0: 1706.5, 1: 1734.6. Samples: 1007654. Policy #0 lag: (min: 31.0, avg: 40.7, max: 63.0) +[2023-10-09 04:12:41,053][59242] Avg episode reward: [(0, '4.180'), (1, '4.810')] +[2023-10-09 04:12:41,053][60003] Saving new best policy, reward=4.810! +[2023-10-09 04:12:41,053][59934] Saving new best policy, reward=4.180! +[2023-10-09 04:12:41,616][60144] Updated weights for policy 1, policy_version 1960 (0.0009) +[2023-10-09 04:12:41,992][60144] Updated weights for policy 1, policy_version 1970 (0.0007) +[2023-10-09 04:12:42,362][60144] Updated weights for policy 1, policy_version 1980 (0.0008) +[2023-10-09 04:12:43,442][60143] Updated weights for policy 0, policy_version 1960 (0.0009) +[2023-10-09 04:12:43,815][60143] Updated weights for policy 0, policy_version 1970 (0.0008) +[2023-10-09 04:12:44,180][60143] Updated weights for policy 0, policy_version 1980 (0.0008) +[2023-10-09 04:12:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 4063232. Throughput: 0: 1714.4, 1: 1754.6. Samples: 1029208. Policy #0 lag: (min: 31.0, avg: 33.3, max: 57.0) +[2023-10-09 04:12:46,052][59242] Avg episode reward: [(0, '4.070'), (1, '4.940')] +[2023-10-09 04:12:46,190][60144] Updated weights for policy 1, policy_version 1990 (0.0007) +[2023-10-09 04:12:46,554][60144] Updated weights for policy 1, policy_version 2000 (0.0007) +[2023-10-09 04:12:46,919][60144] Updated weights for policy 1, policy_version 2010 (0.0009) +[2023-10-09 04:12:47,139][60003] Saving new best policy, reward=4.940! 
+[2023-10-09 04:12:48,017][60143] Updated weights for policy 0, policy_version 1990 (0.0009) +[2023-10-09 04:12:48,394][60143] Updated weights for policy 0, policy_version 2000 (0.0008) +[2023-10-09 04:12:48,767][60143] Updated weights for policy 0, policy_version 2010 (0.0010) +[2023-10-09 04:12:50,682][60144] Updated weights for policy 1, policy_version 2020 (0.0009) +[2023-10-09 04:12:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 4128768. Throughput: 0: 1718.3, 1: 1728.3. Samples: 1039188. Policy #0 lag: (min: 1.0, avg: 7.0, max: 32.0) +[2023-10-09 04:12:51,053][60144] Updated weights for policy 1, policy_version 2030 (0.0011) +[2023-10-09 04:12:51,053][59242] Avg episode reward: [(0, '3.950'), (1, '4.610')] +[2023-10-09 04:12:51,412][60144] Updated weights for policy 1, policy_version 2040 (0.0008) +[2023-10-09 04:12:52,589][60143] Updated weights for policy 0, policy_version 2020 (0.0011) +[2023-10-09 04:12:52,961][60143] Updated weights for policy 0, policy_version 2030 (0.0008) +[2023-10-09 04:12:53,339][60143] Updated weights for policy 0, policy_version 2040 (0.0007) +[2023-10-09 04:12:55,523][60144] Updated weights for policy 1, policy_version 2050 (0.0007) +[2023-10-09 04:12:55,900][60144] Updated weights for policy 1, policy_version 2060 (0.0010) +[2023-10-09 04:12:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 4194304. Throughput: 0: 1708.0, 1: 1751.8. Samples: 1059918. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-09 04:12:56,052][59242] Avg episode reward: [(0, '4.110'), (1, '4.710')] +[2023-10-09 04:12:56,266][60144] Updated weights for policy 1, policy_version 2070 (0.0010) +[2023-10-09 04:12:56,637][60144] Updated weights for policy 1, policy_version 2080 (0.0009) +[2023-10-09 04:12:57,407][60143] Updated weights for policy 0, policy_version 2050 (0.0008) +[2023-10-09 04:12:57,782][60143] Updated weights for policy 0, policy_version 2060 (0.0011) +[2023-10-09 04:12:58,152][60143] Updated weights for policy 0, policy_version 2070 (0.0010) +[2023-10-09 04:12:58,523][60143] Updated weights for policy 0, policy_version 2080 (0.0007) +[2023-10-09 04:13:00,443][60144] Updated weights for policy 1, policy_version 2090 (0.0008) +[2023-10-09 04:13:00,807][60144] Updated weights for policy 1, policy_version 2100 (0.0010) +[2023-10-09 04:13:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 4259840. Throughput: 0: 1727.7, 1: 1742.1. Samples: 1080784. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:13:01,053][59242] Avg episode reward: [(0, '4.170'), (1, '4.800')] +[2023-10-09 04:13:01,177][60144] Updated weights for policy 1, policy_version 2110 (0.0010) +[2023-10-09 04:13:02,465][60143] Updated weights for policy 0, policy_version 2090 (0.0008) +[2023-10-09 04:13:02,831][60143] Updated weights for policy 0, policy_version 2100 (0.0008) +[2023-10-09 04:13:03,209][60143] Updated weights for policy 0, policy_version 2110 (0.0009) +[2023-10-09 04:13:05,179][60144] Updated weights for policy 1, policy_version 2120 (0.0008) +[2023-10-09 04:13:05,541][60144] Updated weights for policy 1, policy_version 2130 (0.0008) +[2023-10-09 04:13:05,912][60144] Updated weights for policy 1, policy_version 2140 (0.0010) +[2023-10-09 04:13:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 4325376. Throughput: 0: 1705.9, 1: 1744.0. 
Samples: 1090686. Policy #0 lag: (min: 26.0, avg: 26.1, max: 33.0) +[2023-10-09 04:13:06,052][59242] Avg episode reward: [(0, '3.970'), (1, '4.610')] +[2023-10-09 04:13:07,242][60143] Updated weights for policy 0, policy_version 2120 (0.0007) +[2023-10-09 04:13:07,616][60143] Updated weights for policy 0, policy_version 2130 (0.0008) +[2023-10-09 04:13:07,982][60143] Updated weights for policy 0, policy_version 2140 (0.0008) +[2023-10-09 04:13:09,824][60144] Updated weights for policy 1, policy_version 2150 (0.0010) +[2023-10-09 04:13:10,197][60144] Updated weights for policy 1, policy_version 2160 (0.0010) +[2023-10-09 04:13:10,561][60144] Updated weights for policy 1, policy_version 2170 (0.0009) +[2023-10-09 04:13:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 4423680. Throughput: 0: 1714.2, 1: 1751.1. Samples: 1112226. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 04:13:11,052][59242] Avg episode reward: [(0, '4.290'), (1, '4.200')] +[2023-10-09 04:13:11,053][59934] Saving new best policy, reward=4.290! +[2023-10-09 04:13:11,821][60143] Updated weights for policy 0, policy_version 2150 (0.0008) +[2023-10-09 04:13:12,189][60143] Updated weights for policy 0, policy_version 2160 (0.0007) +[2023-10-09 04:13:12,570][60143] Updated weights for policy 0, policy_version 2170 (0.0008) +[2023-10-09 04:13:14,325][60144] Updated weights for policy 1, policy_version 2180 (0.0008) +[2023-10-09 04:13:14,687][60144] Updated weights for policy 1, policy_version 2190 (0.0008) +[2023-10-09 04:13:15,068][60144] Updated weights for policy 1, policy_version 2200 (0.0007) +[2023-10-09 04:13:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 4489216. Throughput: 0: 1736.8, 1: 1721.7. Samples: 1132184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:13:16,052][59242] Avg episode reward: [(0, '4.190'), (1, '4.450')] +[2023-10-09 04:13:16,436][60143] Updated weights for policy 0, policy_version 2180 (0.0010) +[2023-10-09 04:13:16,811][60143] Updated weights for policy 0, policy_version 2190 (0.0008) +[2023-10-09 04:13:17,178][60143] Updated weights for policy 0, policy_version 2200 (0.0009) +[2023-10-09 04:13:19,026][60144] Updated weights for policy 1, policy_version 2210 (0.0007) +[2023-10-09 04:13:19,426][60144] Updated weights for policy 1, policy_version 2220 (0.0008) +[2023-10-09 04:13:19,800][60144] Updated weights for policy 1, policy_version 2230 (0.0007) +[2023-10-09 04:13:20,169][60144] Updated weights for policy 1, policy_version 2240 (0.0009) +[2023-10-09 04:13:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 4554752. Throughput: 0: 1703.1, 1: 1756.4. Samples: 1142984. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) +[2023-10-09 04:13:21,052][59242] Avg episode reward: [(0, '4.390'), (1, '4.620')] +[2023-10-09 04:13:21,176][60143] Updated weights for policy 0, policy_version 2210 (0.0008) +[2023-10-09 04:13:21,587][60143] Updated weights for policy 0, policy_version 2220 (0.0007) +[2023-10-09 04:13:21,950][60143] Updated weights for policy 0, policy_version 2230 (0.0007) +[2023-10-09 04:13:22,322][59934] Saving new best policy, reward=4.390! +[2023-10-09 04:13:22,322][60143] Updated weights for policy 0, policy_version 2240 (0.0008) +[2023-10-09 04:13:24,075][60144] Updated weights for policy 1, policy_version 2250 (0.0009) +[2023-10-09 04:13:24,449][60144] Updated weights for policy 1, policy_version 2260 (0.0008) +[2023-10-09 04:13:24,823][60144] Updated weights for policy 1, policy_version 2270 (0.0008) +[2023-10-09 04:13:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 4620288. Throughput: 0: 1727.2, 1: 1729.8. Samples: 1163222. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-09 04:13:26,053][59242] Avg episode reward: [(0, '4.600'), (1, '4.310')] +[2023-10-09 04:13:26,246][60143] Updated weights for policy 0, policy_version 2250 (0.0008) +[2023-10-09 04:13:26,616][60143] Updated weights for policy 0, policy_version 2260 (0.0007) +[2023-10-09 04:13:26,984][60143] Updated weights for policy 0, policy_version 2270 (0.0007) +[2023-10-09 04:13:27,061][59934] Saving new best policy, reward=4.600! +[2023-10-09 04:13:28,703][60144] Updated weights for policy 1, policy_version 2280 (0.0008) +[2023-10-09 04:13:29,072][60144] Updated weights for policy 1, policy_version 2290 (0.0007) +[2023-10-09 04:13:29,442][60144] Updated weights for policy 1, policy_version 2300 (0.0008) +[2023-10-09 04:13:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 4685824. Throughput: 0: 1726.0, 1: 1713.4. Samples: 1183984. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 04:13:31,052][59242] Avg episode reward: [(0, '4.550'), (1, '4.290')] +[2023-10-09 04:13:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000002304_2359296.pth... +[2023-10-09 04:13:31,064][60143] Updated weights for policy 0, policy_version 2280 (0.0007) +[2023-10-09 04:13:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000000704_720896.pth +[2023-10-09 04:13:31,442][60143] Updated weights for policy 0, policy_version 2290 (0.0008) +[2023-10-09 04:13:31,807][60143] Updated weights for policy 0, policy_version 2300 (0.0008) +[2023-10-09 04:13:31,947][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000002304_2359296.pth... 
+[2023-10-09 04:13:31,976][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000000672_688128.pth +[2023-10-09 04:13:33,432][60144] Updated weights for policy 1, policy_version 2310 (0.0008) +[2023-10-09 04:13:33,796][60144] Updated weights for policy 1, policy_version 2320 (0.0009) +[2023-10-09 04:13:34,167][60144] Updated weights for policy 1, policy_version 2330 (0.0011) +[2023-10-09 04:13:35,566][60143] Updated weights for policy 0, policy_version 2310 (0.0008) +[2023-10-09 04:13:35,935][60143] Updated weights for policy 0, policy_version 2320 (0.0008) +[2023-10-09 04:13:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 4751360. Throughput: 0: 1711.9, 1: 1732.8. Samples: 1194200. Policy #0 lag: (min: 9.0, avg: 17.0, max: 41.0) +[2023-10-09 04:13:36,052][59242] Avg episode reward: [(0, '4.600'), (1, '4.540')] +[2023-10-09 04:13:36,306][60143] Updated weights for policy 0, policy_version 2330 (0.0008) +[2023-10-09 04:13:38,105][60144] Updated weights for policy 1, policy_version 2340 (0.0007) +[2023-10-09 04:13:38,470][60144] Updated weights for policy 1, policy_version 2350 (0.0007) +[2023-10-09 04:13:38,843][60144] Updated weights for policy 1, policy_version 2360 (0.0009) +[2023-10-09 04:13:40,409][60143] Updated weights for policy 0, policy_version 2340 (0.0007) +[2023-10-09 04:13:40,784][60143] Updated weights for policy 0, policy_version 2350 (0.0009) +[2023-10-09 04:13:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 4816896. Throughput: 0: 1726.9, 1: 1711.4. Samples: 1214640. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 04:13:41,053][59242] Avg episode reward: [(0, '4.830'), (1, '4.750')] +[2023-10-09 04:13:41,149][60143] Updated weights for policy 0, policy_version 2360 (0.0010) +[2023-10-09 04:13:41,449][59934] Saving new best policy, reward=4.830! 
+[2023-10-09 04:13:42,807][60144] Updated weights for policy 1, policy_version 2370 (0.0008) +[2023-10-09 04:13:43,185][60144] Updated weights for policy 1, policy_version 2380 (0.0008) +[2023-10-09 04:13:43,564][60144] Updated weights for policy 1, policy_version 2390 (0.0007) +[2023-10-09 04:13:43,934][60144] Updated weights for policy 1, policy_version 2400 (0.0007) +[2023-10-09 04:13:45,214][60143] Updated weights for policy 0, policy_version 2370 (0.0008) +[2023-10-09 04:13:45,586][60143] Updated weights for policy 0, policy_version 2380 (0.0008) +[2023-10-09 04:13:45,960][60143] Updated weights for policy 0, policy_version 2390 (0.0009) +[2023-10-09 04:13:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 4882432. Throughput: 0: 1720.2, 1: 1721.9. Samples: 1235678. Policy #0 lag: (min: 14.0, avg: 19.0, max: 46.0) +[2023-10-09 04:13:46,053][59242] Avg episode reward: [(0, '5.170'), (1, '4.910')] +[2023-10-09 04:13:46,337][59934] Saving new best policy, reward=5.170! +[2023-10-09 04:13:46,338][60143] Updated weights for policy 0, policy_version 2400 (0.0009) +[2023-10-09 04:13:47,863][60144] Updated weights for policy 1, policy_version 2410 (0.0009) +[2023-10-09 04:13:48,226][60144] Updated weights for policy 1, policy_version 2420 (0.0009) +[2023-10-09 04:13:48,595][60144] Updated weights for policy 1, policy_version 2430 (0.0009) +[2023-10-09 04:13:50,228][60143] Updated weights for policy 0, policy_version 2410 (0.0011) +[2023-10-09 04:13:50,587][60143] Updated weights for policy 0, policy_version 2420 (0.0008) +[2023-10-09 04:13:50,962][60143] Updated weights for policy 0, policy_version 2430 (0.0007) +[2023-10-09 04:13:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 4980736. Throughput: 0: 1729.3, 1: 1714.4. Samples: 1245654. 
Policy #0 lag: (min: 4.0, avg: 7.5, max: 36.0) +[2023-10-09 04:13:51,053][59242] Avg episode reward: [(0, '4.920'), (1, '5.050')] +[2023-10-09 04:13:51,054][60003] Saving new best policy, reward=5.050! +[2023-10-09 04:13:52,456][60144] Updated weights for policy 1, policy_version 2440 (0.0007) +[2023-10-09 04:13:52,827][60144] Updated weights for policy 1, policy_version 2450 (0.0007) +[2023-10-09 04:13:53,205][60144] Updated weights for policy 1, policy_version 2460 (0.0007) +[2023-10-09 04:13:54,790][60143] Updated weights for policy 0, policy_version 2440 (0.0007) +[2023-10-09 04:13:55,160][60143] Updated weights for policy 0, policy_version 2450 (0.0009) +[2023-10-09 04:13:55,531][60143] Updated weights for policy 0, policy_version 2460 (0.0009) +[2023-10-09 04:13:56,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 5046272. Throughput: 0: 1728.8, 1: 1709.9. Samples: 1266966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:13:56,053][59242] Avg episode reward: [(0, '5.020'), (1, '5.300')] +[2023-10-09 04:13:56,054][60003] Saving new best policy, reward=5.300! +[2023-10-09 04:13:57,044][60144] Updated weights for policy 1, policy_version 2470 (0.0009) +[2023-10-09 04:13:57,421][60144] Updated weights for policy 1, policy_version 2480 (0.0008) +[2023-10-09 04:13:57,784][60144] Updated weights for policy 1, policy_version 2490 (0.0009) +[2023-10-09 04:13:59,595][60143] Updated weights for policy 0, policy_version 2470 (0.0007) +[2023-10-09 04:13:59,969][60143] Updated weights for policy 0, policy_version 2480 (0.0008) +[2023-10-09 04:14:00,342][60143] Updated weights for policy 0, policy_version 2490 (0.0009) +[2023-10-09 04:14:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 5111808. Throughput: 0: 1702.9, 1: 1742.5. Samples: 1287226. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:14:01,053][59242] Avg episode reward: [(0, '5.010'), (1, '5.280')] +[2023-10-09 04:14:01,670][60144] Updated weights for policy 1, policy_version 2500 (0.0008) +[2023-10-09 04:14:02,044][60144] Updated weights for policy 1, policy_version 2510 (0.0009) +[2023-10-09 04:14:02,405][60144] Updated weights for policy 1, policy_version 2520 (0.0007) +[2023-10-09 04:14:04,077][60143] Updated weights for policy 0, policy_version 2500 (0.0008) +[2023-10-09 04:14:04,442][60143] Updated weights for policy 0, policy_version 2510 (0.0007) +[2023-10-09 04:14:04,812][60143] Updated weights for policy 0, policy_version 2520 (0.0007) +[2023-10-09 04:14:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 5177344. Throughput: 0: 1731.4, 1: 1708.6. Samples: 1297782. Policy #0 lag: (min: 18.0, avg: 18.5, max: 31.0) +[2023-10-09 04:14:06,052][59242] Avg episode reward: [(0, '4.940'), (1, '5.390')] +[2023-10-09 04:14:06,053][60003] Saving new best policy, reward=5.390! +[2023-10-09 04:14:06,349][60144] Updated weights for policy 1, policy_version 2530 (0.0008) +[2023-10-09 04:14:06,738][60144] Updated weights for policy 1, policy_version 2540 (0.0009) +[2023-10-09 04:14:07,102][60144] Updated weights for policy 1, policy_version 2550 (0.0009) +[2023-10-09 04:14:07,477][60144] Updated weights for policy 1, policy_version 2560 (0.0010) +[2023-10-09 04:14:09,067][60143] Updated weights for policy 0, policy_version 2530 (0.0009) +[2023-10-09 04:14:09,463][60143] Updated weights for policy 0, policy_version 2540 (0.0010) +[2023-10-09 04:14:09,838][60143] Updated weights for policy 0, policy_version 2550 (0.0010) +[2023-10-09 04:14:10,212][60143] Updated weights for policy 0, policy_version 2560 (0.0010) +[2023-10-09 04:14:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5242880. Throughput: 0: 1712.9, 1: 1738.6. Samples: 1318542. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-09 04:14:11,053][59242] Avg episode reward: [(0, '4.890'), (1, '5.300')] +[2023-10-09 04:14:11,364][60144] Updated weights for policy 1, policy_version 2572 (0.0008) +[2023-10-09 04:14:11,740][60144] Updated weights for policy 1, policy_version 2582 (0.0009) +[2023-10-09 04:14:12,110][60144] Updated weights for policy 1, policy_version 2592 (0.0009) +[2023-10-09 04:14:13,987][60143] Updated weights for policy 0, policy_version 2570 (0.0007) +[2023-10-09 04:14:14,347][60143] Updated weights for policy 0, policy_version 2580 (0.0008) +[2023-10-09 04:14:14,721][60143] Updated weights for policy 0, policy_version 2590 (0.0008) +[2023-10-09 04:14:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5308416. Throughput: 0: 1703.2, 1: 1744.8. Samples: 1339146. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-09 04:14:16,053][59242] Avg episode reward: [(0, '5.020'), (1, '5.190')] +[2023-10-09 04:14:16,156][60144] Updated weights for policy 1, policy_version 2602 (0.0009) +[2023-10-09 04:14:16,522][60144] Updated weights for policy 1, policy_version 2612 (0.0007) +[2023-10-09 04:14:16,890][60144] Updated weights for policy 1, policy_version 2622 (0.0010) +[2023-10-09 04:14:18,699][60143] Updated weights for policy 0, policy_version 2600 (0.0008) +[2023-10-09 04:14:19,074][60143] Updated weights for policy 0, policy_version 2610 (0.0008) +[2023-10-09 04:14:19,447][60143] Updated weights for policy 0, policy_version 2620 (0.0008) +[2023-10-09 04:14:20,891][60144] Updated weights for policy 1, policy_version 2632 (0.0009) +[2023-10-09 04:14:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5373952. Throughput: 0: 1734.5, 1: 1725.2. Samples: 1349890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:14:21,053][59242] Avg episode reward: [(0, '4.920'), (1, '5.500')] +[2023-10-09 04:14:21,270][60144] Updated weights for policy 1, policy_version 2642 (0.0008) +[2023-10-09 04:14:21,642][60144] Updated weights for policy 1, policy_version 2652 (0.0008) +[2023-10-09 04:14:21,785][60003] Saving new best policy, reward=5.500! +[2023-10-09 04:14:23,536][60143] Updated weights for policy 0, policy_version 2630 (0.0009) +[2023-10-09 04:14:23,898][60143] Updated weights for policy 0, policy_version 2640 (0.0008) +[2023-10-09 04:14:24,270][60143] Updated weights for policy 0, policy_version 2650 (0.0007) +[2023-10-09 04:14:25,610][60144] Updated weights for policy 1, policy_version 2662 (0.0009) +[2023-10-09 04:14:25,978][60144] Updated weights for policy 1, policy_version 2672 (0.0009) +[2023-10-09 04:14:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5439488. Throughput: 0: 1704.0, 1: 1748.2. Samples: 1369990. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 04:14:26,053][59242] Avg episode reward: [(0, '4.830'), (1, '5.740')] +[2023-10-09 04:14:26,346][60144] Updated weights for policy 1, policy_version 2682 (0.0007) +[2023-10-09 04:14:26,561][60003] Saving new best policy, reward=5.740! +[2023-10-09 04:14:28,080][60143] Updated weights for policy 0, policy_version 2660 (0.0010) +[2023-10-09 04:14:28,459][60143] Updated weights for policy 0, policy_version 2670 (0.0009) +[2023-10-09 04:14:28,825][60143] Updated weights for policy 0, policy_version 2680 (0.0008) +[2023-10-09 04:14:30,315][60144] Updated weights for policy 1, policy_version 2692 (0.0007) +[2023-10-09 04:14:30,686][60144] Updated weights for policy 1, policy_version 2702 (0.0009) +[2023-10-09 04:14:31,050][60144] Updated weights for policy 1, policy_version 2712 (0.0009) +[2023-10-09 04:14:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). 
Total num frames: 5505024. Throughput: 0: 1712.1, 1: 1732.5. Samples: 1390686. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 04:14:31,053][59242] Avg episode reward: [(0, '4.810'), (1, '5.580')] +[2023-10-09 04:14:33,043][60143] Updated weights for policy 0, policy_version 2690 (0.0009) +[2023-10-09 04:14:33,411][60143] Updated weights for policy 0, policy_version 2700 (0.0009) +[2023-10-09 04:14:33,797][60143] Updated weights for policy 0, policy_version 2710 (0.0010) +[2023-10-09 04:14:34,167][60143] Updated weights for policy 0, policy_version 2720 (0.0009) +[2023-10-09 04:14:34,873][60144] Updated weights for policy 1, policy_version 2722 (0.0010) +[2023-10-09 04:14:35,239][60144] Updated weights for policy 1, policy_version 2732 (0.0008) +[2023-10-09 04:14:35,617][60144] Updated weights for policy 1, policy_version 2742 (0.0008) +[2023-10-09 04:14:35,987][60144] Updated weights for policy 1, policy_version 2752 (0.0010) +[2023-10-09 04:14:36,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 5603328. Throughput: 0: 1718.2, 1: 1738.9. Samples: 1401224. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-09 04:14:36,053][59242] Avg episode reward: [(0, '5.160'), (1, '5.730')] +[2023-10-09 04:14:38,194][60143] Updated weights for policy 0, policy_version 2730 (0.0007) +[2023-10-09 04:14:38,570][60143] Updated weights for policy 0, policy_version 2740 (0.0007) +[2023-10-09 04:14:38,935][60143] Updated weights for policy 0, policy_version 2750 (0.0008) +[2023-10-09 04:14:39,910][60144] Updated weights for policy 1, policy_version 2762 (0.0008) +[2023-10-09 04:14:40,284][60144] Updated weights for policy 1, policy_version 2772 (0.0008) +[2023-10-09 04:14:40,653][60144] Updated weights for policy 1, policy_version 2782 (0.0008) +[2023-10-09 04:14:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 5668864. Throughput: 0: 1690.5, 1: 1743.1. Samples: 1421482. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:14:41,053][59242] Avg episode reward: [(0, '5.140'), (1, '5.400')] +[2023-10-09 04:14:42,934][60143] Updated weights for policy 0, policy_version 2760 (0.0010) +[2023-10-09 04:14:43,293][60143] Updated weights for policy 0, policy_version 2770 (0.0010) +[2023-10-09 04:14:43,664][60143] Updated weights for policy 0, policy_version 2780 (0.0007) +[2023-10-09 04:14:44,436][60144] Updated weights for policy 1, policy_version 2792 (0.0011) +[2023-10-09 04:14:44,806][60144] Updated weights for policy 1, policy_version 2802 (0.0010) +[2023-10-09 04:14:45,175][60144] Updated weights for policy 1, policy_version 2812 (0.0010) +[2023-10-09 04:14:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 5734400. Throughput: 0: 1720.1, 1: 1712.0. Samples: 1441674. Policy #0 lag: (min: 25.0, avg: 37.0, max: 57.0) +[2023-10-09 04:14:46,053][59242] Avg episode reward: [(0, '5.190'), (1, '5.530')] +[2023-10-09 04:14:46,061][59934] Saving new best policy, reward=5.190! +[2023-10-09 04:14:47,537][60143] Updated weights for policy 0, policy_version 2790 (0.0010) +[2023-10-09 04:14:47,913][60143] Updated weights for policy 0, policy_version 2800 (0.0009) +[2023-10-09 04:14:48,287][60143] Updated weights for policy 0, policy_version 2810 (0.0011) +[2023-10-09 04:14:49,315][60144] Updated weights for policy 1, policy_version 2822 (0.0009) +[2023-10-09 04:14:49,674][60144] Updated weights for policy 1, policy_version 2832 (0.0009) +[2023-10-09 04:14:50,044][60144] Updated weights for policy 1, policy_version 2842 (0.0010) +[2023-10-09 04:14:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13884.7). Total num frames: 5799936. Throughput: 0: 1695.3, 1: 1742.2. Samples: 1452470. 
Policy #0 lag: (min: 19.0, avg: 25.9, max: 51.0) +[2023-10-09 04:14:51,052][59242] Avg episode reward: [(0, '5.590'), (1, '5.270')] +[2023-10-09 04:14:51,053][59934] Saving new best policy, reward=5.590! +[2023-10-09 04:14:52,278][60143] Updated weights for policy 0, policy_version 2820 (0.0009) +[2023-10-09 04:14:52,651][60143] Updated weights for policy 0, policy_version 2830 (0.0008) +[2023-10-09 04:14:53,024][60143] Updated weights for policy 0, policy_version 2840 (0.0009) +[2023-10-09 04:14:53,881][60144] Updated weights for policy 1, policy_version 2852 (0.0008) +[2023-10-09 04:14:54,264][60144] Updated weights for policy 1, policy_version 2862 (0.0011) +[2023-10-09 04:14:54,623][60144] Updated weights for policy 1, policy_version 2872 (0.0007) +[2023-10-09 04:14:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5865472. Throughput: 0: 1711.6, 1: 1718.9. Samples: 1472916. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:14:56,053][59242] Avg episode reward: [(0, '5.280'), (1, '5.430')] +[2023-10-09 04:14:57,036][60143] Updated weights for policy 0, policy_version 2850 (0.0010) +[2023-10-09 04:14:57,449][60143] Updated weights for policy 0, policy_version 2860 (0.0010) +[2023-10-09 04:14:57,819][60143] Updated weights for policy 0, policy_version 2870 (0.0007) +[2023-10-09 04:14:58,192][60143] Updated weights for policy 0, policy_version 2880 (0.0009) +[2023-10-09 04:14:58,519][60144] Updated weights for policy 1, policy_version 2882 (0.0008) +[2023-10-09 04:14:58,883][60144] Updated weights for policy 1, policy_version 2892 (0.0007) +[2023-10-09 04:14:59,248][60144] Updated weights for policy 1, policy_version 2902 (0.0007) +[2023-10-09 04:14:59,609][60144] Updated weights for policy 1, policy_version 2912 (0.0009) +[2023-10-09 04:15:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 5931008. Throughput: 0: 1720.9, 1: 1713.4. Samples: 1493690. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:15:01,052][59242] Avg episode reward: [(0, '5.240'), (1, '5.370')] +[2023-10-09 04:15:02,156][60143] Updated weights for policy 0, policy_version 2890 (0.0009) +[2023-10-09 04:15:02,524][60143] Updated weights for policy 0, policy_version 2900 (0.0009) +[2023-10-09 04:15:02,888][60143] Updated weights for policy 0, policy_version 2910 (0.0008) +[2023-10-09 04:15:03,497][60144] Updated weights for policy 1, policy_version 2922 (0.0009) +[2023-10-09 04:15:03,863][60144] Updated weights for policy 1, policy_version 2932 (0.0010) +[2023-10-09 04:15:04,233][60144] Updated weights for policy 1, policy_version 2942 (0.0011) +[2023-10-09 04:15:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 5996544. Throughput: 0: 1686.4, 1: 1733.8. Samples: 1503800. Policy #0 lag: (min: 15.0, avg: 26.6, max: 47.0) +[2023-10-09 04:15:06,053][59242] Avg episode reward: [(0, '5.210'), (1, '5.230')] +[2023-10-09 04:15:06,827][60143] Updated weights for policy 0, policy_version 2920 (0.0007) +[2023-10-09 04:15:07,206][60143] Updated weights for policy 0, policy_version 2930 (0.0007) +[2023-10-09 04:15:07,576][60143] Updated weights for policy 0, policy_version 2940 (0.0011) +[2023-10-09 04:15:08,333][60144] Updated weights for policy 1, policy_version 2952 (0.0009) +[2023-10-09 04:15:08,688][60144] Updated weights for policy 1, policy_version 2962 (0.0008) +[2023-10-09 04:15:09,058][60144] Updated weights for policy 1, policy_version 2972 (0.0007) +[2023-10-09 04:15:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 6062080. Throughput: 0: 1718.1, 1: 1713.5. Samples: 1524414. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 04:15:11,053][59242] Avg episode reward: [(0, '5.050'), (1, '5.430')] +[2023-10-09 04:15:11,489][60143] Updated weights for policy 0, policy_version 2950 (0.0009) +[2023-10-09 04:15:11,851][60143] Updated weights for policy 0, policy_version 2960 (0.0007) +[2023-10-09 04:15:12,224][60143] Updated weights for policy 0, policy_version 2970 (0.0010) +[2023-10-09 04:15:12,907][60144] Updated weights for policy 1, policy_version 2982 (0.0009) +[2023-10-09 04:15:13,282][60144] Updated weights for policy 1, policy_version 2992 (0.0010) +[2023-10-09 04:15:13,652][60144] Updated weights for policy 1, policy_version 3002 (0.0010) +[2023-10-09 04:15:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 6127616. Throughput: 0: 1713.6, 1: 1730.5. Samples: 1545668. Policy #0 lag: (min: 31.0, avg: 46.1, max: 63.0) +[2023-10-09 04:15:16,053][59242] Avg episode reward: [(0, '5.010'), (1, '5.420')] +[2023-10-09 04:15:16,174][60143] Updated weights for policy 0, policy_version 2980 (0.0008) +[2023-10-09 04:15:16,544][60143] Updated weights for policy 0, policy_version 2990 (0.0008) +[2023-10-09 04:15:16,916][60143] Updated weights for policy 0, policy_version 3000 (0.0008) +[2023-10-09 04:15:17,544][60144] Updated weights for policy 1, policy_version 3012 (0.0009) +[2023-10-09 04:15:17,914][60144] Updated weights for policy 1, policy_version 3022 (0.0008) +[2023-10-09 04:15:18,273][60144] Updated weights for policy 1, policy_version 3032 (0.0008) +[2023-10-09 04:15:20,776][60143] Updated weights for policy 0, policy_version 3010 (0.0009) +[2023-10-09 04:15:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 6193152. Throughput: 0: 1696.9, 1: 1722.9. Samples: 1555116. 
Policy #0 lag: (min: 31.0, avg: 46.1, max: 63.0) +[2023-10-09 04:15:21,053][59242] Avg episode reward: [(0, '5.000'), (1, '5.550')] +[2023-10-09 04:15:21,149][60143] Updated weights for policy 0, policy_version 3020 (0.0008) +[2023-10-09 04:15:21,517][60143] Updated weights for policy 0, policy_version 3030 (0.0007) +[2023-10-09 04:15:21,891][60143] Updated weights for policy 0, policy_version 3040 (0.0009) +[2023-10-09 04:15:22,054][60144] Updated weights for policy 1, policy_version 3042 (0.0008) +[2023-10-09 04:15:22,429][60144] Updated weights for policy 1, policy_version 3052 (0.0008) +[2023-10-09 04:15:22,792][60144] Updated weights for policy 1, policy_version 3062 (0.0007) +[2023-10-09 04:15:23,158][60144] Updated weights for policy 1, policy_version 3072 (0.0009) +[2023-10-09 04:15:25,881][60143] Updated weights for policy 0, policy_version 3050 (0.0007) +[2023-10-09 04:15:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 6258688. Throughput: 0: 1716.9, 1: 1722.2. Samples: 1576242. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 04:15:26,053][59242] Avg episode reward: [(0, '4.840'), (1, '6.000')] +[2023-10-09 04:15:26,053][60003] Saving new best policy, reward=6.000! +[2023-10-09 04:15:26,250][60143] Updated weights for policy 0, policy_version 3060 (0.0007) +[2023-10-09 04:15:26,630][60143] Updated weights for policy 0, policy_version 3070 (0.0007) +[2023-10-09 04:15:26,916][60144] Updated weights for policy 1, policy_version 3082 (0.0007) +[2023-10-09 04:15:27,286][60144] Updated weights for policy 1, policy_version 3092 (0.0007) +[2023-10-09 04:15:27,657][60144] Updated weights for policy 1, policy_version 3102 (0.0008) +[2023-10-09 04:15:30,784][60143] Updated weights for policy 0, policy_version 3080 (0.0009) +[2023-10-09 04:15:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 6324224. Throughput: 0: 1711.3, 1: 1751.8. Samples: 1597514. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 04:15:31,053][59242] Avg episode reward: [(0, '4.610'), (1, '5.960')] +[2023-10-09 04:15:31,059][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000003104_3178496.pth... +[2023-10-09 04:15:31,095][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000001504_1540096.pth +[2023-10-09 04:15:31,152][60143] Updated weights for policy 0, policy_version 3090 (0.0008) +[2023-10-09 04:15:31,510][60143] Updated weights for policy 0, policy_version 3100 (0.0008) +[2023-10-09 04:15:31,658][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000003104_3178496.pth... +[2023-10-09 04:15:31,698][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000001472_1507328.pth +[2023-10-09 04:15:31,758][60144] Updated weights for policy 1, policy_version 3112 (0.0009) +[2023-10-09 04:15:32,121][60144] Updated weights for policy 1, policy_version 3122 (0.0008) +[2023-10-09 04:15:32,483][60144] Updated weights for policy 1, policy_version 3132 (0.0007) +[2023-10-09 04:15:35,294][60143] Updated weights for policy 0, policy_version 3110 (0.0008) +[2023-10-09 04:15:35,658][60143] Updated weights for policy 0, policy_version 3120 (0.0009) +[2023-10-09 04:15:36,028][60143] Updated weights for policy 0, policy_version 3130 (0.0007) +[2023-10-09 04:15:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 6389760. Throughput: 0: 1711.4, 1: 1722.5. Samples: 1606996. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 04:15:36,052][59242] Avg episode reward: [(0, '5.050'), (1, '6.420')] +[2023-10-09 04:15:36,053][60003] Saving new best policy, reward=6.420! 
+[2023-10-09 04:15:36,404][60144] Updated weights for policy 1, policy_version 3142 (0.0009) +[2023-10-09 04:15:36,779][60144] Updated weights for policy 1, policy_version 3152 (0.0009) +[2023-10-09 04:15:37,140][60144] Updated weights for policy 1, policy_version 3162 (0.0007) +[2023-10-09 04:15:40,171][60143] Updated weights for policy 0, policy_version 3140 (0.0008) +[2023-10-09 04:15:40,550][60143] Updated weights for policy 0, policy_version 3150 (0.0007) +[2023-10-09 04:15:40,912][60143] Updated weights for policy 0, policy_version 3160 (0.0008) +[2023-10-09 04:15:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 6455296. Throughput: 0: 1720.9, 1: 1737.3. Samples: 1628536. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 04:15:41,053][59242] Avg episode reward: [(0, '4.710'), (1, '6.150')] +[2023-10-09 04:15:41,223][60144] Updated weights for policy 1, policy_version 3172 (0.0007) +[2023-10-09 04:15:41,606][60144] Updated weights for policy 1, policy_version 3182 (0.0010) +[2023-10-09 04:15:41,976][60144] Updated weights for policy 1, policy_version 3192 (0.0010) +[2023-10-09 04:15:44,815][60143] Updated weights for policy 0, policy_version 3170 (0.0007) +[2023-10-09 04:15:45,199][60143] Updated weights for policy 0, policy_version 3180 (0.0008) +[2023-10-09 04:15:45,564][60143] Updated weights for policy 0, policy_version 3190 (0.0007) +[2023-10-09 04:15:45,937][60143] Updated weights for policy 0, policy_version 3200 (0.0009) +[2023-10-09 04:15:45,949][60144] Updated weights for policy 1, policy_version 3202 (0.0010) +[2023-10-09 04:15:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 6553600. Throughput: 0: 1708.3, 1: 1741.4. Samples: 1648926. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:15:46,052][59242] Avg episode reward: [(0, '4.750'), (1, '6.340')] +[2023-10-09 04:15:46,321][60144] Updated weights for policy 1, policy_version 3212 (0.0007) +[2023-10-09 04:15:46,698][60144] Updated weights for policy 1, policy_version 3222 (0.0009) +[2023-10-09 04:15:47,070][60144] Updated weights for policy 1, policy_version 3232 (0.0007) +[2023-10-09 04:15:49,745][60143] Updated weights for policy 0, policy_version 3210 (0.0008) +[2023-10-09 04:15:50,112][60143] Updated weights for policy 0, policy_version 3220 (0.0007) +[2023-10-09 04:15:50,487][60143] Updated weights for policy 0, policy_version 3230 (0.0009) +[2023-10-09 04:15:50,947][60144] Updated weights for policy 1, policy_version 3242 (0.0007) +[2023-10-09 04:15:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 6619136. Throughput: 0: 1731.8, 1: 1721.0. Samples: 1659174. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:15:51,053][59242] Avg episode reward: [(0, '4.880'), (1, '6.360')] +[2023-10-09 04:15:51,310][60144] Updated weights for policy 1, policy_version 3252 (0.0008) +[2023-10-09 04:15:51,678][60144] Updated weights for policy 1, policy_version 3262 (0.0007) +[2023-10-09 04:15:54,523][60143] Updated weights for policy 0, policy_version 3240 (0.0007) +[2023-10-09 04:15:54,888][60143] Updated weights for policy 0, policy_version 3250 (0.0010) +[2023-10-09 04:15:55,267][60143] Updated weights for policy 0, policy_version 3260 (0.0010) +[2023-10-09 04:15:55,478][60144] Updated weights for policy 1, policy_version 3272 (0.0008) +[2023-10-09 04:15:55,840][60144] Updated weights for policy 1, policy_version 3282 (0.0010) +[2023-10-09 04:15:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 6684672. Throughput: 0: 1716.5, 1: 1745.4. Samples: 1680202. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 04:15:56,052][59242] Avg episode reward: [(0, '4.980'), (1, '6.200')] +[2023-10-09 04:15:56,208][60144] Updated weights for policy 1, policy_version 3292 (0.0010) +[2023-10-09 04:15:59,265][60143] Updated weights for policy 0, policy_version 3270 (0.0010) +[2023-10-09 04:15:59,636][60143] Updated weights for policy 0, policy_version 3280 (0.0008) +[2023-10-09 04:16:00,008][60143] Updated weights for policy 0, policy_version 3290 (0.0007) +[2023-10-09 04:16:00,227][60144] Updated weights for policy 1, policy_version 3302 (0.0009) +[2023-10-09 04:16:00,603][60144] Updated weights for policy 1, policy_version 3312 (0.0011) +[2023-10-09 04:16:00,969][60144] Updated weights for policy 1, policy_version 3322 (0.0007) +[2023-10-09 04:16:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 6750208. Throughput: 0: 1698.9, 1: 1726.0. Samples: 1699790. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-09 04:16:01,053][59242] Avg episode reward: [(0, '5.430'), (1, '6.420')] +[2023-10-09 04:16:03,915][60143] Updated weights for policy 0, policy_version 3300 (0.0010) +[2023-10-09 04:16:04,277][60143] Updated weights for policy 0, policy_version 3310 (0.0010) +[2023-10-09 04:16:04,637][60143] Updated weights for policy 0, policy_version 3320 (0.0009) +[2023-10-09 04:16:04,911][60144] Updated weights for policy 1, policy_version 3332 (0.0007) +[2023-10-09 04:16:05,268][60144] Updated weights for policy 1, policy_version 3342 (0.0011) +[2023-10-09 04:16:05,638][60144] Updated weights for policy 1, policy_version 3352 (0.0008) +[2023-10-09 04:16:06,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 6848512. Throughput: 0: 1727.6, 1: 1734.2. Samples: 1710898. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-09 04:16:06,053][59242] Avg episode reward: [(0, '5.160'), (1, '5.910')] +[2023-10-09 04:16:08,640][60143] Updated weights for policy 0, policy_version 3330 (0.0008) +[2023-10-09 04:16:09,015][60143] Updated weights for policy 0, policy_version 3340 (0.0007) +[2023-10-09 04:16:09,390][60143] Updated weights for policy 0, policy_version 3350 (0.0009) +[2023-10-09 04:16:09,541][60144] Updated weights for policy 1, policy_version 3362 (0.0007) +[2023-10-09 04:16:09,766][60143] Updated weights for policy 0, policy_version 3360 (0.0008) +[2023-10-09 04:16:09,916][60144] Updated weights for policy 1, policy_version 3372 (0.0008) +[2023-10-09 04:16:10,291][60144] Updated weights for policy 1, policy_version 3382 (0.0010) +[2023-10-09 04:16:10,662][60144] Updated weights for policy 1, policy_version 3392 (0.0007) +[2023-10-09 04:16:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 6914048. Throughput: 0: 1712.0, 1: 1735.6. Samples: 1731380. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:11,052][59242] Avg episode reward: [(0, '5.260'), (1, '5.960')] +[2023-10-09 04:16:13,771][60143] Updated weights for policy 0, policy_version 3370 (0.0008) +[2023-10-09 04:16:14,145][60143] Updated weights for policy 0, policy_version 3380 (0.0009) +[2023-10-09 04:16:14,442][60144] Updated weights for policy 1, policy_version 3402 (0.0008) +[2023-10-09 04:16:14,514][60143] Updated weights for policy 0, policy_version 3390 (0.0009) +[2023-10-09 04:16:14,809][60144] Updated weights for policy 1, policy_version 3412 (0.0008) +[2023-10-09 04:16:15,177][60144] Updated weights for policy 1, policy_version 3422 (0.0007) +[2023-10-09 04:16:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 6979584. Throughput: 0: 1703.0, 1: 1712.0. Samples: 1751188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:16,052][59242] Avg episode reward: [(0, '5.300'), (1, '6.870')] +[2023-10-09 04:16:16,061][60003] Saving new best policy, reward=6.870! +[2023-10-09 04:16:18,478][60143] Updated weights for policy 0, policy_version 3400 (0.0007) +[2023-10-09 04:16:18,852][60143] Updated weights for policy 0, policy_version 3410 (0.0009) +[2023-10-09 04:16:19,155][60144] Updated weights for policy 1, policy_version 3432 (0.0008) +[2023-10-09 04:16:19,213][60143] Updated weights for policy 0, policy_version 3420 (0.0008) +[2023-10-09 04:16:19,525][60144] Updated weights for policy 1, policy_version 3442 (0.0009) +[2023-10-09 04:16:19,894][60144] Updated weights for policy 1, policy_version 3452 (0.0007) +[2023-10-09 04:16:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 7045120. Throughput: 0: 1715.8, 1: 1742.5. Samples: 1762620. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 04:16:21,053][59242] Avg episode reward: [(0, '5.580'), (1, '6.450')] +[2023-10-09 04:16:23,243][60143] Updated weights for policy 0, policy_version 3430 (0.0008) +[2023-10-09 04:16:23,614][60143] Updated weights for policy 0, policy_version 3440 (0.0009) +[2023-10-09 04:16:23,730][60144] Updated weights for policy 1, policy_version 3462 (0.0008) +[2023-10-09 04:16:23,987][60143] Updated weights for policy 0, policy_version 3450 (0.0009) +[2023-10-09 04:16:24,093][60144] Updated weights for policy 1, policy_version 3472 (0.0008) +[2023-10-09 04:16:24,457][60144] Updated weights for policy 1, policy_version 3482 (0.0010) +[2023-10-09 04:16:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 7110656. Throughput: 0: 1686.3, 1: 1721.9. Samples: 1781906. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 04:16:26,053][59242] Avg episode reward: [(0, '6.010'), (1, '6.680')] +[2023-10-09 04:16:26,055][59934] Saving new best policy, reward=6.010! +[2023-10-09 04:16:28,029][60143] Updated weights for policy 0, policy_version 3460 (0.0009) +[2023-10-09 04:16:28,411][60143] Updated weights for policy 0, policy_version 3470 (0.0010) +[2023-10-09 04:16:28,698][60144] Updated weights for policy 1, policy_version 3492 (0.0008) +[2023-10-09 04:16:28,778][60143] Updated weights for policy 0, policy_version 3480 (0.0008) +[2023-10-09 04:16:29,096][60144] Updated weights for policy 1, policy_version 3502 (0.0009) +[2023-10-09 04:16:29,460][60144] Updated weights for policy 1, policy_version 3512 (0.0009) +[2023-10-09 04:16:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 7176192. Throughput: 0: 1698.3, 1: 1717.1. Samples: 1802622. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 04:16:31,053][59242] Avg episode reward: [(0, '5.750'), (1, '6.500')] +[2023-10-09 04:16:32,855][60143] Updated weights for policy 0, policy_version 3490 (0.0008) +[2023-10-09 04:16:33,263][60143] Updated weights for policy 0, policy_version 3500 (0.0007) +[2023-10-09 04:16:33,443][60144] Updated weights for policy 1, policy_version 3522 (0.0007) +[2023-10-09 04:16:33,629][60143] Updated weights for policy 0, policy_version 3510 (0.0007) +[2023-10-09 04:16:33,805][60144] Updated weights for policy 1, policy_version 3532 (0.0009) +[2023-10-09 04:16:33,993][60143] Updated weights for policy 0, policy_version 3520 (0.0008) +[2023-10-09 04:16:34,163][60144] Updated weights for policy 1, policy_version 3542 (0.0008) +[2023-10-09 04:16:34,536][60144] Updated weights for policy 1, policy_version 3552 (0.0007) +[2023-10-09 04:16:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 7241728. Throughput: 0: 1688.4, 1: 1743.3. Samples: 1813600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:36,053][59242] Avg episode reward: [(0, '5.650'), (1, '6.780')] +[2023-10-09 04:16:38,071][60143] Updated weights for policy 0, policy_version 3530 (0.0008) +[2023-10-09 04:16:38,438][60143] Updated weights for policy 0, policy_version 3540 (0.0008) +[2023-10-09 04:16:38,450][60144] Updated weights for policy 1, policy_version 3562 (0.0007) +[2023-10-09 04:16:38,808][60143] Updated weights for policy 0, policy_version 3550 (0.0009) +[2023-10-09 04:16:38,810][60144] Updated weights for policy 1, policy_version 3572 (0.0008) +[2023-10-09 04:16:39,180][60144] Updated weights for policy 1, policy_version 3582 (0.0009) +[2023-10-09 04:16:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 7307264. Throughput: 0: 1682.4, 1: 1713.0. Samples: 1832992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:41,052][59242] Avg episode reward: [(0, '5.870'), (1, '6.680')] +[2023-10-09 04:16:42,797][60143] Updated weights for policy 0, policy_version 3560 (0.0007) +[2023-10-09 04:16:43,152][60144] Updated weights for policy 1, policy_version 3592 (0.0009) +[2023-10-09 04:16:43,164][60143] Updated weights for policy 0, policy_version 3570 (0.0007) +[2023-10-09 04:16:43,520][60144] Updated weights for policy 1, policy_version 3602 (0.0008) +[2023-10-09 04:16:43,529][60143] Updated weights for policy 0, policy_version 3580 (0.0007) +[2023-10-09 04:16:43,879][60144] Updated weights for policy 1, policy_version 3612 (0.0008) +[2023-10-09 04:16:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 7372800. Throughput: 0: 1702.9, 1: 1728.3. Samples: 1854192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:46,053][59242] Avg episode reward: [(0, '5.640'), (1, '6.100')] +[2023-10-09 04:16:47,502][60143] Updated weights for policy 0, policy_version 3590 (0.0008) +[2023-10-09 04:16:47,871][60143] Updated weights for policy 0, policy_version 3600 (0.0008) +[2023-10-09 04:16:47,885][60144] Updated weights for policy 1, policy_version 3622 (0.0008) +[2023-10-09 04:16:48,241][60143] Updated weights for policy 0, policy_version 3610 (0.0008) +[2023-10-09 04:16:48,256][60144] Updated weights for policy 1, policy_version 3632 (0.0009) +[2023-10-09 04:16:48,615][60144] Updated weights for policy 1, policy_version 3642 (0.0009) +[2023-10-09 04:16:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 7438336. Throughput: 0: 1676.0, 1: 1720.7. Samples: 1863748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:51,053][59242] Avg episode reward: [(0, '5.500'), (1, '6.330')] +[2023-10-09 04:16:52,309][60143] Updated weights for policy 0, policy_version 3620 (0.0009) +[2023-10-09 04:16:52,612][60144] Updated weights for policy 1, policy_version 3652 (0.0009) +[2023-10-09 04:16:52,684][60143] Updated weights for policy 0, policy_version 3630 (0.0007) +[2023-10-09 04:16:52,986][60144] Updated weights for policy 1, policy_version 3662 (0.0009) +[2023-10-09 04:16:53,047][60143] Updated weights for policy 0, policy_version 3640 (0.0008) +[2023-10-09 04:16:53,347][60144] Updated weights for policy 1, policy_version 3672 (0.0008) +[2023-10-09 04:16:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 7503872. Throughput: 0: 1698.3, 1: 1715.6. Samples: 1885008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:16:56,054][59242] Avg episode reward: [(0, '5.380'), (1, '6.170')] +[2023-10-09 04:16:57,061][60143] Updated weights for policy 0, policy_version 3650 (0.0009) +[2023-10-09 04:16:57,160][60144] Updated weights for policy 1, policy_version 3682 (0.0007) +[2023-10-09 04:16:57,421][60143] Updated weights for policy 0, policy_version 3660 (0.0008) +[2023-10-09 04:16:57,528][60144] Updated weights for policy 1, policy_version 3692 (0.0008) +[2023-10-09 04:16:57,797][60143] Updated weights for policy 0, policy_version 3670 (0.0007) +[2023-10-09 04:16:57,890][60144] Updated weights for policy 1, policy_version 3702 (0.0007) +[2023-10-09 04:16:58,160][60143] Updated weights for policy 0, policy_version 3680 (0.0008) +[2023-10-09 04:16:58,258][60144] Updated weights for policy 1, policy_version 3712 (0.0008) +[2023-10-09 04:17:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 7569408. Throughput: 0: 1707.9, 1: 1738.6. Samples: 1906282. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:17:01,053][59242] Avg episode reward: [(0, '5.910'), (1, '6.530')] +[2023-10-09 04:17:02,117][60144] Updated weights for policy 1, policy_version 3722 (0.0007) +[2023-10-09 04:17:02,271][60143] Updated weights for policy 0, policy_version 3690 (0.0007) +[2023-10-09 04:17:02,485][60144] Updated weights for policy 1, policy_version 3732 (0.0007) +[2023-10-09 04:17:02,633][60143] Updated weights for policy 0, policy_version 3700 (0.0008) +[2023-10-09 04:17:02,850][60144] Updated weights for policy 1, policy_version 3742 (0.0007) +[2023-10-09 04:17:03,011][60143] Updated weights for policy 0, policy_version 3710 (0.0009) +[2023-10-09 04:17:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 7634944. Throughput: 0: 1687.6, 1: 1706.9. Samples: 1915374. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 04:17:06,053][59242] Avg episode reward: [(0, '6.070'), (1, '6.500')] +[2023-10-09 04:17:06,054][59934] Saving new best policy, reward=6.070! +[2023-10-09 04:17:06,714][60144] Updated weights for policy 1, policy_version 3752 (0.0009) +[2023-10-09 04:17:06,921][60143] Updated weights for policy 0, policy_version 3720 (0.0008) +[2023-10-09 04:17:07,085][60144] Updated weights for policy 1, policy_version 3762 (0.0008) +[2023-10-09 04:17:07,293][60143] Updated weights for policy 0, policy_version 3730 (0.0009) +[2023-10-09 04:17:07,458][60144] Updated weights for policy 1, policy_version 3772 (0.0007) +[2023-10-09 04:17:07,673][60143] Updated weights for policy 0, policy_version 3740 (0.0008) +[2023-10-09 04:17:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 7700480. Throughput: 0: 1710.2, 1: 1729.2. Samples: 1936678. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 04:17:11,053][59242] Avg episode reward: [(0, '5.870'), (1, '6.530')] +[2023-10-09 04:17:11,538][60144] Updated weights for policy 1, policy_version 3782 (0.0008) +[2023-10-09 04:17:11,643][60143] Updated weights for policy 0, policy_version 3750 (0.0008) +[2023-10-09 04:17:11,900][60144] Updated weights for policy 1, policy_version 3792 (0.0008) +[2023-10-09 04:17:12,007][60143] Updated weights for policy 0, policy_version 3760 (0.0007) +[2023-10-09 04:17:12,278][60144] Updated weights for policy 1, policy_version 3802 (0.0008) +[2023-10-09 04:17:12,377][60143] Updated weights for policy 0, policy_version 3770 (0.0007) +[2023-10-09 04:17:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 7766016. Throughput: 0: 1716.1, 1: 1732.4. Samples: 1957804. 
Policy #0 lag: (min: 16.0, avg: 43.3, max: 48.0) +[2023-10-09 04:17:16,053][59242] Avg episode reward: [(0, '6.220'), (1, '6.520')] +[2023-10-09 04:17:16,267][60143] Updated weights for policy 0, policy_version 3780 (0.0010) +[2023-10-09 04:17:16,434][60144] Updated weights for policy 1, policy_version 3812 (0.0008) +[2023-10-09 04:17:16,636][60143] Updated weights for policy 0, policy_version 3790 (0.0009) +[2023-10-09 04:17:16,831][60144] Updated weights for policy 1, policy_version 3822 (0.0009) +[2023-10-09 04:17:17,004][60143] Updated weights for policy 0, policy_version 3800 (0.0008) +[2023-10-09 04:17:17,203][60144] Updated weights for policy 1, policy_version 3832 (0.0008) +[2023-10-09 04:17:17,299][59934] Saving new best policy, reward=6.220! +[2023-10-09 04:17:21,032][60144] Updated weights for policy 1, policy_version 3842 (0.0007) +[2023-10-09 04:17:21,042][60143] Updated weights for policy 0, policy_version 3810 (0.0008) +[2023-10-09 04:17:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 7831552. Throughput: 0: 1705.9, 1: 1700.4. Samples: 1966886. Policy #0 lag: (min: 16.0, avg: 43.3, max: 48.0) +[2023-10-09 04:17:21,053][59242] Avg episode reward: [(0, '6.520'), (1, '6.910')] +[2023-10-09 04:17:21,394][60144] Updated weights for policy 1, policy_version 3852 (0.0008) +[2023-10-09 04:17:21,418][60143] Updated weights for policy 0, policy_version 3820 (0.0008) +[2023-10-09 04:17:21,763][60144] Updated weights for policy 1, policy_version 3862 (0.0008) +[2023-10-09 04:17:21,786][60143] Updated weights for policy 0, policy_version 3830 (0.0008) +[2023-10-09 04:17:22,125][60003] Saving new best policy, reward=6.910! +[2023-10-09 04:17:22,130][60144] Updated weights for policy 1, policy_version 3872 (0.0009) +[2023-10-09 04:17:22,157][59934] Saving new best policy, reward=6.520! 
+[2023-10-09 04:17:22,159][60143] Updated weights for policy 0, policy_version 3840 (0.0007) +[2023-10-09 04:17:25,971][60144] Updated weights for policy 1, policy_version 3882 (0.0009) +[2023-10-09 04:17:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 7897088. Throughput: 0: 1718.1, 1: 1729.1. Samples: 1988114. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:17:26,053][59242] Avg episode reward: [(0, '6.560'), (1, '6.700')] +[2023-10-09 04:17:26,184][60143] Updated weights for policy 0, policy_version 3850 (0.0007) +[2023-10-09 04:17:26,336][60144] Updated weights for policy 1, policy_version 3892 (0.0007) +[2023-10-09 04:17:26,554][60143] Updated weights for policy 0, policy_version 3860 (0.0007) +[2023-10-09 04:17:26,710][60144] Updated weights for policy 1, policy_version 3902 (0.0009) +[2023-10-09 04:17:26,923][60143] Updated weights for policy 0, policy_version 3870 (0.0008) +[2023-10-09 04:17:26,997][59934] Saving new best policy, reward=6.560! +[2023-10-09 04:17:30,826][60144] Updated weights for policy 1, policy_version 3912 (0.0008) +[2023-10-09 04:17:30,839][60143] Updated weights for policy 0, policy_version 3880 (0.0009) +[2023-10-09 04:17:31,053][59242] Fps is (10 sec: 13106.5, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 7962624. Throughput: 0: 1719.0, 1: 1720.7. Samples: 2008980. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:17:31,054][59242] Avg episode reward: [(0, '6.540'), (1, '6.690')] +[2023-10-09 04:17:31,197][60144] Updated weights for policy 1, policy_version 3922 (0.0008) +[2023-10-09 04:17:31,206][60143] Updated weights for policy 0, policy_version 3890 (0.0009) +[2023-10-09 04:17:31,561][60144] Updated weights for policy 1, policy_version 3932 (0.0008) +[2023-10-09 04:17:31,585][60143] Updated weights for policy 0, policy_version 3900 (0.0009) +[2023-10-09 04:17:31,702][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000003936_4030464.pth... +[2023-10-09 04:17:31,730][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000003904_3997696.pth... +[2023-10-09 04:17:31,733][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000002304_2359296.pth +[2023-10-09 04:17:31,768][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000002304_2359296.pth +[2023-10-09 04:17:35,390][60144] Updated weights for policy 1, policy_version 3942 (0.0010) +[2023-10-09 04:17:35,719][60143] Updated weights for policy 0, policy_version 3910 (0.0009) +[2023-10-09 04:17:35,757][60144] Updated weights for policy 1, policy_version 3952 (0.0009) +[2023-10-09 04:17:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 8028160. Throughput: 0: 1717.6, 1: 1720.1. Samples: 2018446. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-09 04:17:36,053][59242] Avg episode reward: [(0, '6.230'), (1, '6.710')] +[2023-10-09 04:17:36,094][60143] Updated weights for policy 0, policy_version 3920 (0.0007) +[2023-10-09 04:17:36,121][60144] Updated weights for policy 1, policy_version 3962 (0.0008) +[2023-10-09 04:17:36,460][60143] Updated weights for policy 0, policy_version 3930 (0.0008) +[2023-10-09 04:17:40,017][60144] Updated weights for policy 1, policy_version 3972 (0.0008) +[2023-10-09 04:17:40,241][60143] Updated weights for policy 0, policy_version 3940 (0.0009) +[2023-10-09 04:17:40,383][60144] Updated weights for policy 1, policy_version 3982 (0.0007) +[2023-10-09 04:17:40,618][60143] Updated weights for policy 0, policy_version 3950 (0.0009) +[2023-10-09 04:17:40,747][60144] Updated weights for policy 1, policy_version 3992 (0.0008) +[2023-10-09 04:17:40,982][60143] Updated weights for policy 0, policy_version 3960 (0.0010) +[2023-10-09 04:17:41,052][59242] Fps is (10 sec: 16384.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 8126464. Throughput: 0: 1713.2, 1: 1726.3. Samples: 2039784. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-09 04:17:41,053][59242] Avg episode reward: [(0, '6.190'), (1, '6.790')] +[2023-10-09 04:17:44,702][60144] Updated weights for policy 1, policy_version 4002 (0.0008) +[2023-10-09 04:17:45,074][60144] Updated weights for policy 1, policy_version 4012 (0.0008) +[2023-10-09 04:17:45,149][60143] Updated weights for policy 0, policy_version 3970 (0.0008) +[2023-10-09 04:17:45,437][60144] Updated weights for policy 1, policy_version 4022 (0.0007) +[2023-10-09 04:17:45,511][60143] Updated weights for policy 0, policy_version 3980 (0.0008) +[2023-10-09 04:17:45,803][60144] Updated weights for policy 1, policy_version 4032 (0.0009) +[2023-10-09 04:17:45,886][60143] Updated weights for policy 0, policy_version 3990 (0.0008) +[2023-10-09 04:17:46,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 8192000. Throughput: 0: 1695.4, 1: 1708.7. Samples: 2059466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:17:46,052][59242] Avg episode reward: [(0, '6.480'), (1, '7.220')] +[2023-10-09 04:17:46,058][60003] Saving new best policy, reward=7.220! +[2023-10-09 04:17:46,254][60143] Updated weights for policy 0, policy_version 4000 (0.0008) +[2023-10-09 04:17:49,833][60144] Updated weights for policy 1, policy_version 4042 (0.0008) +[2023-10-09 04:17:50,203][60144] Updated weights for policy 1, policy_version 4052 (0.0007) +[2023-10-09 04:17:50,395][60143] Updated weights for policy 0, policy_version 4010 (0.0009) +[2023-10-09 04:17:50,570][60144] Updated weights for policy 1, policy_version 4062 (0.0008) +[2023-10-09 04:17:50,761][60143] Updated weights for policy 0, policy_version 4020 (0.0010) +[2023-10-09 04:17:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 8257536. Throughput: 0: 1703.8, 1: 1728.4. Samples: 2069820. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:17:51,053][59242] Avg episode reward: [(0, '6.560'), (1, '7.120')] +[2023-10-09 04:17:51,134][60143] Updated weights for policy 0, policy_version 4030 (0.0009) +[2023-10-09 04:17:54,475][60144] Updated weights for policy 1, policy_version 4072 (0.0010) +[2023-10-09 04:17:54,845][60144] Updated weights for policy 1, policy_version 4082 (0.0010) +[2023-10-09 04:17:55,099][60143] Updated weights for policy 0, policy_version 4040 (0.0009) +[2023-10-09 04:17:55,219][60144] Updated weights for policy 1, policy_version 4092 (0.0008) +[2023-10-09 04:17:55,468][60143] Updated weights for policy 0, policy_version 4050 (0.0010) +[2023-10-09 04:17:55,842][60143] Updated weights for policy 0, policy_version 4060 (0.0010) +[2023-10-09 04:17:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 8355840. Throughput: 0: 1706.7, 1: 1720.9. Samples: 2090918. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-09 04:17:56,053][59242] Avg episode reward: [(0, '6.250'), (1, '7.240')] +[2023-10-09 04:17:56,054][60003] Saving new best policy, reward=7.240! +[2023-10-09 04:17:59,245][60144] Updated weights for policy 1, policy_version 4102 (0.0011) +[2023-10-09 04:17:59,629][60144] Updated weights for policy 1, policy_version 4112 (0.0008) +[2023-10-09 04:17:59,855][60143] Updated weights for policy 0, policy_version 4070 (0.0008) +[2023-10-09 04:17:59,994][60144] Updated weights for policy 1, policy_version 4122 (0.0008) +[2023-10-09 04:18:00,221][60143] Updated weights for policy 0, policy_version 4080 (0.0009) +[2023-10-09 04:18:00,601][60143] Updated weights for policy 0, policy_version 4090 (0.0008) +[2023-10-09 04:18:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 8421376. Throughput: 0: 1680.6, 1: 1706.6. Samples: 2110230. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 34.0) +[2023-10-09 04:18:01,053][59242] Avg episode reward: [(0, '6.430'), (1, '7.130')] +[2023-10-09 04:18:04,100][60144] Updated weights for policy 1, policy_version 4132 (0.0008) +[2023-10-09 04:18:04,326][60143] Updated weights for policy 0, policy_version 4100 (0.0009) +[2023-10-09 04:18:04,494][60144] Updated weights for policy 1, policy_version 4142 (0.0008) +[2023-10-09 04:18:04,693][60143] Updated weights for policy 0, policy_version 4110 (0.0008) +[2023-10-09 04:18:04,866][60144] Updated weights for policy 1, policy_version 4152 (0.0008) +[2023-10-09 04:18:05,059][60143] Updated weights for policy 0, policy_version 4120 (0.0007) +[2023-10-09 04:18:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 8486912. Throughput: 0: 1702.6, 1: 1738.1. Samples: 2121718. Policy #0 lag: (min: 7.0, avg: 12.6, max: 39.0) +[2023-10-09 04:18:06,053][59242] Avg episode reward: [(0, '6.690'), (1, '7.360')] +[2023-10-09 04:18:06,055][59934] Saving new best policy, reward=6.690! +[2023-10-09 04:18:06,055][60003] Saving new best policy, reward=7.360! +[2023-10-09 04:18:08,647][60144] Updated weights for policy 1, policy_version 4162 (0.0008) +[2023-10-09 04:18:09,017][60144] Updated weights for policy 1, policy_version 4172 (0.0008) +[2023-10-09 04:18:09,039][60143] Updated weights for policy 0, policy_version 4130 (0.0009) +[2023-10-09 04:18:09,382][60144] Updated weights for policy 1, policy_version 4182 (0.0008) +[2023-10-09 04:18:09,410][60143] Updated weights for policy 0, policy_version 4140 (0.0007) +[2023-10-09 04:18:09,751][60144] Updated weights for policy 1, policy_version 4192 (0.0007) +[2023-10-09 04:18:09,765][60143] Updated weights for policy 0, policy_version 4150 (0.0007) +[2023-10-09 04:18:10,140][60143] Updated weights for policy 0, policy_version 4160 (0.0007) +[2023-10-09 04:18:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). 
Total num frames: 8552448. Throughput: 0: 1695.6, 1: 1707.5. Samples: 2141254. Policy #0 lag: (min: 31.0, avg: 41.1, max: 63.0) +[2023-10-09 04:18:11,053][59242] Avg episode reward: [(0, '6.730'), (1, '7.190')] +[2023-10-09 04:18:11,055][59934] Saving new best policy, reward=6.730! +[2023-10-09 04:18:13,760][60144] Updated weights for policy 1, policy_version 4202 (0.0009) +[2023-10-09 04:18:14,134][60144] Updated weights for policy 1, policy_version 4212 (0.0007) +[2023-10-09 04:18:14,193][60143] Updated weights for policy 0, policy_version 4170 (0.0007) +[2023-10-09 04:18:14,501][60144] Updated weights for policy 1, policy_version 4222 (0.0010) +[2023-10-09 04:18:14,559][60143] Updated weights for policy 0, policy_version 4180 (0.0007) +[2023-10-09 04:18:14,937][60143] Updated weights for policy 0, policy_version 4190 (0.0008) +[2023-10-09 04:18:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 8617984. Throughput: 0: 1675.1, 1: 1710.3. Samples: 2161324. Policy #0 lag: (min: 31.0, avg: 41.1, max: 63.0) +[2023-10-09 04:18:16,053][59242] Avg episode reward: [(0, '6.680'), (1, '7.590')] +[2023-10-09 04:18:16,064][60003] Saving new best policy, reward=7.590! +[2023-10-09 04:18:18,430][60144] Updated weights for policy 1, policy_version 4232 (0.0009) +[2023-10-09 04:18:18,804][60144] Updated weights for policy 1, policy_version 4242 (0.0008) +[2023-10-09 04:18:19,029][60143] Updated weights for policy 0, policy_version 4200 (0.0009) +[2023-10-09 04:18:19,162][60144] Updated weights for policy 1, policy_version 4252 (0.0008) +[2023-10-09 04:18:19,394][60143] Updated weights for policy 0, policy_version 4210 (0.0009) +[2023-10-09 04:18:19,769][60143] Updated weights for policy 0, policy_version 4220 (0.0008) +[2023-10-09 04:18:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 8683520. Throughput: 0: 1703.5, 1: 1722.5. Samples: 2172616. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:18:21,053][59242] Avg episode reward: [(0, '6.820'), (1, '7.370')] +[2023-10-09 04:18:21,054][59934] Saving new best policy, reward=6.820! +[2023-10-09 04:18:23,014][60144] Updated weights for policy 1, policy_version 4262 (0.0009) +[2023-10-09 04:18:23,385][60144] Updated weights for policy 1, policy_version 4272 (0.0007) +[2023-10-09 04:18:23,749][60144] Updated weights for policy 1, policy_version 4282 (0.0009) +[2023-10-09 04:18:23,879][60143] Updated weights for policy 0, policy_version 4230 (0.0010) +[2023-10-09 04:18:24,249][60143] Updated weights for policy 0, policy_version 4240 (0.0007) +[2023-10-09 04:18:24,624][60143] Updated weights for policy 0, policy_version 4250 (0.0007) +[2023-10-09 04:18:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 8749056. Throughput: 0: 1684.4, 1: 1706.7. Samples: 2192384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:18:26,053][59242] Avg episode reward: [(0, '6.560'), (1, '7.780')] +[2023-10-09 04:18:26,055][60003] Saving new best policy, reward=7.780! +[2023-10-09 04:18:27,657][60144] Updated weights for policy 1, policy_version 4292 (0.0010) +[2023-10-09 04:18:28,022][60144] Updated weights for policy 1, policy_version 4302 (0.0008) +[2023-10-09 04:18:28,396][60144] Updated weights for policy 1, policy_version 4312 (0.0007) +[2023-10-09 04:18:28,600][60143] Updated weights for policy 0, policy_version 4260 (0.0007) +[2023-10-09 04:18:28,979][60143] Updated weights for policy 0, policy_version 4270 (0.0009) +[2023-10-09 04:18:29,348][60143] Updated weights for policy 0, policy_version 4280 (0.0007) +[2023-10-09 04:18:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 13773.7). Total num frames: 8814592. Throughput: 0: 1693.6, 1: 1729.2. Samples: 2213490. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-09 04:18:31,053][59242] Avg episode reward: [(0, '6.940'), (1, '7.680')] +[2023-10-09 04:18:31,061][59934] Saving new best policy, reward=6.940! +[2023-10-09 04:18:32,414][60144] Updated weights for policy 1, policy_version 4322 (0.0007) +[2023-10-09 04:18:32,777][60144] Updated weights for policy 1, policy_version 4332 (0.0007) +[2023-10-09 04:18:33,141][60144] Updated weights for policy 1, policy_version 4342 (0.0008) +[2023-10-09 04:18:33,239][60143] Updated weights for policy 0, policy_version 4290 (0.0007) +[2023-10-09 04:18:33,512][60144] Updated weights for policy 1, policy_version 4352 (0.0007) +[2023-10-09 04:18:33,606][60143] Updated weights for policy 0, policy_version 4300 (0.0007) +[2023-10-09 04:18:33,974][60143] Updated weights for policy 0, policy_version 4310 (0.0010) +[2023-10-09 04:18:34,347][60143] Updated weights for policy 0, policy_version 4320 (0.0008) +[2023-10-09 04:18:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 8880128. Throughput: 0: 1719.2, 1: 1707.4. Samples: 2224014. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) +[2023-10-09 04:18:36,053][59242] Avg episode reward: [(0, '6.930'), (1, '7.560')] +[2023-10-09 04:18:37,448][60144] Updated weights for policy 1, policy_version 4362 (0.0010) +[2023-10-09 04:18:37,808][60144] Updated weights for policy 1, policy_version 4372 (0.0011) +[2023-10-09 04:18:38,182][60144] Updated weights for policy 1, policy_version 4382 (0.0009) +[2023-10-09 04:18:38,282][60143] Updated weights for policy 0, policy_version 4330 (0.0008) +[2023-10-09 04:18:38,650][60143] Updated weights for policy 0, policy_version 4340 (0.0009) +[2023-10-09 04:18:39,025][60143] Updated weights for policy 0, policy_version 4350 (0.0008) +[2023-10-09 04:18:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 8945664. Throughput: 0: 1694.7, 1: 1715.2. Samples: 2244366. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-09 04:18:41,053][59242] Avg episode reward: [(0, '7.210'), (1, '7.730')] +[2023-10-09 04:18:41,054][59934] Saving new best policy, reward=7.210! +[2023-10-09 04:18:42,266][60144] Updated weights for policy 1, policy_version 4392 (0.0010) +[2023-10-09 04:18:42,643][60144] Updated weights for policy 1, policy_version 4402 (0.0010) +[2023-10-09 04:18:42,893][60143] Updated weights for policy 0, policy_version 4360 (0.0009) +[2023-10-09 04:18:43,015][60144] Updated weights for policy 1, policy_version 4412 (0.0010) +[2023-10-09 04:18:43,274][60143] Updated weights for policy 0, policy_version 4370 (0.0008) +[2023-10-09 04:18:43,637][60143] Updated weights for policy 0, policy_version 4380 (0.0007) +[2023-10-09 04:18:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 9011200. Throughput: 0: 1721.0, 1: 1734.4. Samples: 2265722. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-09 04:18:46,053][59242] Avg episode reward: [(0, '7.360'), (1, '7.970')] +[2023-10-09 04:18:46,066][59934] Saving new best policy, reward=7.360! +[2023-10-09 04:18:46,066][60003] Saving new best policy, reward=7.970! +[2023-10-09 04:18:46,969][60144] Updated weights for policy 1, policy_version 4422 (0.0009) +[2023-10-09 04:18:47,338][60144] Updated weights for policy 1, policy_version 4432 (0.0009) +[2023-10-09 04:18:47,597][60143] Updated weights for policy 0, policy_version 4390 (0.0008) +[2023-10-09 04:18:47,700][60144] Updated weights for policy 1, policy_version 4442 (0.0009) +[2023-10-09 04:18:47,952][60143] Updated weights for policy 0, policy_version 4400 (0.0008) +[2023-10-09 04:18:48,330][60143] Updated weights for policy 0, policy_version 4410 (0.0009) +[2023-10-09 04:18:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 9076736. Throughput: 0: 1701.8, 1: 1707.8. Samples: 2275148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:18:51,053][59242] Avg episode reward: [(0, '7.190'), (1, '7.860')] +[2023-10-09 04:18:51,657][60144] Updated weights for policy 1, policy_version 4452 (0.0007) +[2023-10-09 04:18:52,054][60144] Updated weights for policy 1, policy_version 4462 (0.0009) +[2023-10-09 04:18:52,347][60143] Updated weights for policy 0, policy_version 4420 (0.0008) +[2023-10-09 04:18:52,423][60144] Updated weights for policy 1, policy_version 4472 (0.0008) +[2023-10-09 04:18:52,718][60143] Updated weights for policy 0, policy_version 4430 (0.0007) +[2023-10-09 04:18:53,087][60143] Updated weights for policy 0, policy_version 4440 (0.0007) +[2023-10-09 04:18:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9142272. Throughput: 0: 1706.4, 1: 1732.5. Samples: 2296008. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:18:56,053][59242] Avg episode reward: [(0, '7.310'), (1, '7.630')] +[2023-10-09 04:18:56,301][60144] Updated weights for policy 1, policy_version 4482 (0.0010) +[2023-10-09 04:18:56,677][60144] Updated weights for policy 1, policy_version 4492 (0.0007) +[2023-10-09 04:18:57,043][60144] Updated weights for policy 1, policy_version 4502 (0.0008) +[2023-10-09 04:18:57,059][60143] Updated weights for policy 0, policy_version 4450 (0.0009) +[2023-10-09 04:18:57,402][60144] Updated weights for policy 1, policy_version 4512 (0.0008) +[2023-10-09 04:18:57,446][60143] Updated weights for policy 0, policy_version 4460 (0.0007) +[2023-10-09 04:18:57,818][60143] Updated weights for policy 0, policy_version 4470 (0.0008) +[2023-10-09 04:18:58,191][60143] Updated weights for policy 0, policy_version 4480 (0.0007) +[2023-10-09 04:19:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9207808. Throughput: 0: 1728.9, 1: 1735.7. Samples: 2317232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:19:01,053][59242] Avg episode reward: [(0, '6.850'), (1, '7.780')] +[2023-10-09 04:19:01,379][60144] Updated weights for policy 1, policy_version 4522 (0.0009) +[2023-10-09 04:19:01,749][60144] Updated weights for policy 1, policy_version 4532 (0.0009) +[2023-10-09 04:19:02,107][60143] Updated weights for policy 0, policy_version 4490 (0.0008) +[2023-10-09 04:19:02,117][60144] Updated weights for policy 1, policy_version 4542 (0.0008) +[2023-10-09 04:19:02,474][60143] Updated weights for policy 0, policy_version 4500 (0.0009) +[2023-10-09 04:19:02,848][60143] Updated weights for policy 0, policy_version 4510 (0.0008) +[2023-10-09 04:19:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9273344. Throughput: 0: 1701.3, 1: 1718.6. Samples: 2326514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:19:06,052][59242] Avg episode reward: [(0, '6.670'), (1, '7.600')] +[2023-10-09 04:19:06,104][60144] Updated weights for policy 1, policy_version 4552 (0.0011) +[2023-10-09 04:19:06,473][60144] Updated weights for policy 1, policy_version 4562 (0.0009) +[2023-10-09 04:19:06,838][60144] Updated weights for policy 1, policy_version 4572 (0.0009) +[2023-10-09 04:19:06,867][60143] Updated weights for policy 0, policy_version 4520 (0.0008) +[2023-10-09 04:19:07,225][60143] Updated weights for policy 0, policy_version 4530 (0.0007) +[2023-10-09 04:19:07,599][60143] Updated weights for policy 0, policy_version 4540 (0.0007) +[2023-10-09 04:19:10,664][60144] Updated weights for policy 1, policy_version 4582 (0.0009) +[2023-10-09 04:19:11,042][60144] Updated weights for policy 1, policy_version 4592 (0.0009) +[2023-10-09 04:19:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9338880. Throughput: 0: 1719.4, 1: 1733.9. Samples: 2347782. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 04:19:11,053][59242] Avg episode reward: [(0, '6.670'), (1, '7.740')] +[2023-10-09 04:19:11,413][60144] Updated weights for policy 1, policy_version 4602 (0.0008) +[2023-10-09 04:19:11,631][60143] Updated weights for policy 0, policy_version 4550 (0.0009) +[2023-10-09 04:19:12,003][60143] Updated weights for policy 0, policy_version 4560 (0.0008) +[2023-10-09 04:19:12,372][60143] Updated weights for policy 0, policy_version 4570 (0.0008) +[2023-10-09 04:19:15,523][60144] Updated weights for policy 1, policy_version 4612 (0.0008) +[2023-10-09 04:19:15,897][60144] Updated weights for policy 1, policy_version 4622 (0.0007) +[2023-10-09 04:19:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 9404416. Throughput: 0: 1729.3, 1: 1717.8. Samples: 2368610. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 04:19:16,053][59242] Avg episode reward: [(0, '6.760'), (1, '7.740')] +[2023-10-09 04:19:16,262][60144] Updated weights for policy 1, policy_version 4632 (0.0007) +[2023-10-09 04:19:16,286][60143] Updated weights for policy 0, policy_version 4580 (0.0009) +[2023-10-09 04:19:16,658][60143] Updated weights for policy 0, policy_version 4590 (0.0008) +[2023-10-09 04:19:17,037][60143] Updated weights for policy 0, policy_version 4600 (0.0009) +[2023-10-09 04:19:20,276][60144] Updated weights for policy 1, policy_version 4642 (0.0007) +[2023-10-09 04:19:20,641][60144] Updated weights for policy 1, policy_version 4652 (0.0011) +[2023-10-09 04:19:20,836][60143] Updated weights for policy 0, policy_version 4610 (0.0007) +[2023-10-09 04:19:21,009][60144] Updated weights for policy 1, policy_version 4662 (0.0008) +[2023-10-09 04:19:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9469952. Throughput: 0: 1701.7, 1: 1726.9. Samples: 2378302. 
Policy #0 lag: (min: 10.0, avg: 23.1, max: 42.0) +[2023-10-09 04:19:21,053][59242] Avg episode reward: [(0, '7.140'), (1, '8.180')] +[2023-10-09 04:19:21,201][60143] Updated weights for policy 0, policy_version 4620 (0.0008) +[2023-10-09 04:19:21,377][60003] Saving new best policy, reward=8.180! +[2023-10-09 04:19:21,378][60144] Updated weights for policy 1, policy_version 4672 (0.0009) +[2023-10-09 04:19:21,583][60143] Updated weights for policy 0, policy_version 4630 (0.0008) +[2023-10-09 04:19:21,948][60143] Updated weights for policy 0, policy_version 4640 (0.0009) +[2023-10-09 04:19:25,317][60144] Updated weights for policy 1, policy_version 4682 (0.0008) +[2023-10-09 04:19:25,687][60144] Updated weights for policy 1, policy_version 4692 (0.0008) +[2023-10-09 04:19:25,884][60143] Updated weights for policy 0, policy_version 4650 (0.0008) +[2023-10-09 04:19:26,052][60144] Updated weights for policy 1, policy_version 4702 (0.0007) +[2023-10-09 04:19:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 9535488. Throughput: 0: 1729.8, 1: 1724.5. Samples: 2399812. Policy #0 lag: (min: 10.0, avg: 23.1, max: 42.0) +[2023-10-09 04:19:26,053][59242] Avg episode reward: [(0, '6.620'), (1, '8.070')] +[2023-10-09 04:19:26,247][60143] Updated weights for policy 0, policy_version 4660 (0.0007) +[2023-10-09 04:19:26,623][60143] Updated weights for policy 0, policy_version 4670 (0.0007) +[2023-10-09 04:19:29,846][60144] Updated weights for policy 1, policy_version 4712 (0.0010) +[2023-10-09 04:19:30,211][60144] Updated weights for policy 1, policy_version 4722 (0.0011) +[2023-10-09 04:19:30,580][60144] Updated weights for policy 1, policy_version 4732 (0.0009) +[2023-10-09 04:19:30,702][60143] Updated weights for policy 0, policy_version 4680 (0.0007) +[2023-10-09 04:19:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 9633792. Throughput: 0: 1724.6, 1: 1703.0. Samples: 2419966. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:19:31,053][59242] Avg episode reward: [(0, '6.670'), (1, '8.140')] +[2023-10-09 04:19:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000004736_4849664.pth... +[2023-10-09 04:19:31,079][60143] Updated weights for policy 0, policy_version 4690 (0.0008) +[2023-10-09 04:19:31,103][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000003104_3178496.pth +[2023-10-09 04:19:31,448][60143] Updated weights for policy 0, policy_version 4700 (0.0010) +[2023-10-09 04:19:31,595][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000004704_4816896.pth... +[2023-10-09 04:19:31,624][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000003104_3178496.pth +[2023-10-09 04:19:34,398][60144] Updated weights for policy 1, policy_version 4742 (0.0008) +[2023-10-09 04:19:34,762][60144] Updated weights for policy 1, policy_version 4752 (0.0007) +[2023-10-09 04:19:35,126][60144] Updated weights for policy 1, policy_version 4762 (0.0007) +[2023-10-09 04:19:35,533][60143] Updated weights for policy 0, policy_version 4710 (0.0008) +[2023-10-09 04:19:35,900][60143] Updated weights for policy 0, policy_version 4720 (0.0009) +[2023-10-09 04:19:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 9699328. Throughput: 0: 1720.4, 1: 1728.0. Samples: 2430330. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:19:36,053][59242] Avg episode reward: [(0, '7.070'), (1, '8.410')] +[2023-10-09 04:19:36,054][60003] Saving new best policy, reward=8.410! 
+[2023-10-09 04:19:36,266][60143] Updated weights for policy 0, policy_version 4730 (0.0009) +[2023-10-09 04:19:39,116][60144] Updated weights for policy 1, policy_version 4772 (0.0007) +[2023-10-09 04:19:39,517][60144] Updated weights for policy 1, policy_version 4782 (0.0007) +[2023-10-09 04:19:39,889][60144] Updated weights for policy 1, policy_version 4792 (0.0011) +[2023-10-09 04:19:40,195][60143] Updated weights for policy 0, policy_version 4740 (0.0008) +[2023-10-09 04:19:40,563][60143] Updated weights for policy 0, policy_version 4750 (0.0010) +[2023-10-09 04:19:40,935][60143] Updated weights for policy 0, policy_version 4760 (0.0010) +[2023-10-09 04:19:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 9764864. Throughput: 0: 1727.8, 1: 1715.0. Samples: 2450936. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 04:19:41,053][59242] Avg episode reward: [(0, '7.420'), (1, '8.540')] +[2023-10-09 04:19:41,054][60003] Saving new best policy, reward=8.540! +[2023-10-09 04:19:41,236][59934] Saving new best policy, reward=7.420! +[2023-10-09 04:19:43,892][60144] Updated weights for policy 1, policy_version 4802 (0.0009) +[2023-10-09 04:19:44,266][60144] Updated weights for policy 1, policy_version 4812 (0.0009) +[2023-10-09 04:19:44,631][60144] Updated weights for policy 1, policy_version 4822 (0.0008) +[2023-10-09 04:19:44,962][60143] Updated weights for policy 0, policy_version 4770 (0.0009) +[2023-10-09 04:19:44,996][60144] Updated weights for policy 1, policy_version 4832 (0.0009) +[2023-10-09 04:19:45,387][60143] Updated weights for policy 0, policy_version 4780 (0.0008) +[2023-10-09 04:19:45,750][60143] Updated weights for policy 0, policy_version 4790 (0.0009) +[2023-10-09 04:19:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 9830400. Throughput: 0: 1711.8, 1: 1697.8. Samples: 2470664. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 04:19:46,053][59242] Avg episode reward: [(0, '7.360'), (1, '8.800')] +[2023-10-09 04:19:46,063][60003] Saving new best policy, reward=8.800! +[2023-10-09 04:19:46,128][60143] Updated weights for policy 0, policy_version 4800 (0.0012) +[2023-10-09 04:19:49,106][60144] Updated weights for policy 1, policy_version 4842 (0.0008) +[2023-10-09 04:19:49,482][60144] Updated weights for policy 1, policy_version 4852 (0.0010) +[2023-10-09 04:19:49,843][60144] Updated weights for policy 1, policy_version 4862 (0.0007) +[2023-10-09 04:19:50,113][60143] Updated weights for policy 0, policy_version 4810 (0.0007) +[2023-10-09 04:19:50,477][60143] Updated weights for policy 0, policy_version 4820 (0.0009) +[2023-10-09 04:19:50,855][60143] Updated weights for policy 0, policy_version 4830 (0.0008) +[2023-10-09 04:19:51,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 9928704. Throughput: 0: 1721.6, 1: 1723.0. Samples: 2481522. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 04:19:51,053][59242] Avg episode reward: [(0, '7.210'), (1, '8.820')] +[2023-10-09 04:19:51,055][60003] Saving new best policy, reward=8.820! +[2023-10-09 04:19:53,887][60144] Updated weights for policy 1, policy_version 4872 (0.0008) +[2023-10-09 04:19:54,259][60144] Updated weights for policy 1, policy_version 4882 (0.0010) +[2023-10-09 04:19:54,631][60144] Updated weights for policy 1, policy_version 4892 (0.0010) +[2023-10-09 04:19:54,792][60143] Updated weights for policy 0, policy_version 4840 (0.0008) +[2023-10-09 04:19:55,170][60143] Updated weights for policy 0, policy_version 4850 (0.0008) +[2023-10-09 04:19:55,538][60143] Updated weights for policy 0, policy_version 4860 (0.0009) +[2023-10-09 04:19:56,052][59242] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 9994240. Throughput: 0: 1725.6, 1: 1695.9. Samples: 2501750. 
Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) +[2023-10-09 04:19:56,052][59242] Avg episode reward: [(0, '7.380'), (1, '8.640')] +[2023-10-09 04:19:58,563][60144] Updated weights for policy 1, policy_version 4902 (0.0009) +[2023-10-09 04:19:58,937][60144] Updated weights for policy 1, policy_version 4912 (0.0008) +[2023-10-09 04:19:59,254][60143] Updated weights for policy 0, policy_version 4870 (0.0008) +[2023-10-09 04:19:59,304][60144] Updated weights for policy 1, policy_version 4922 (0.0007) +[2023-10-09 04:19:59,613][60143] Updated weights for policy 0, policy_version 4880 (0.0010) +[2023-10-09 04:19:59,996][60143] Updated weights for policy 0, policy_version 4890 (0.0010) +[2023-10-09 04:20:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 10059776. Throughput: 0: 1700.1, 1: 1704.6. Samples: 2521822. Policy #0 lag: (min: 12.0, avg: 19.9, max: 44.0) +[2023-10-09 04:20:01,053][59242] Avg episode reward: [(0, '7.550'), (1, '8.120')] +[2023-10-09 04:20:01,065][59934] Saving new best policy, reward=7.550! +[2023-10-09 04:20:03,318][60144] Updated weights for policy 1, policy_version 4932 (0.0009) +[2023-10-09 04:20:03,693][60144] Updated weights for policy 1, policy_version 4942 (0.0009) +[2023-10-09 04:20:03,754][60143] Updated weights for policy 0, policy_version 4900 (0.0008) +[2023-10-09 04:20:04,054][60144] Updated weights for policy 1, policy_version 4952 (0.0008) +[2023-10-09 04:20:04,116][60143] Updated weights for policy 0, policy_version 4910 (0.0008) +[2023-10-09 04:20:04,499][60143] Updated weights for policy 0, policy_version 4920 (0.0007) +[2023-10-09 04:20:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 10125312. Throughput: 0: 1731.5, 1: 1713.3. Samples: 2533316. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:20:06,053][59242] Avg episode reward: [(0, '8.020'), (1, '8.130')] +[2023-10-09 04:20:06,053][59934] Saving new best policy, reward=8.020! +[2023-10-09 04:20:07,898][60144] Updated weights for policy 1, policy_version 4962 (0.0008) +[2023-10-09 04:20:08,278][60144] Updated weights for policy 1, policy_version 4972 (0.0008) +[2023-10-09 04:20:08,397][60143] Updated weights for policy 0, policy_version 4930 (0.0008) +[2023-10-09 04:20:08,642][60144] Updated weights for policy 1, policy_version 4982 (0.0008) +[2023-10-09 04:20:08,765][60143] Updated weights for policy 0, policy_version 4940 (0.0009) +[2023-10-09 04:20:09,012][60144] Updated weights for policy 1, policy_version 4992 (0.0007) +[2023-10-09 04:20:09,133][60143] Updated weights for policy 0, policy_version 4950 (0.0008) +[2023-10-09 04:20:09,502][60143] Updated weights for policy 0, policy_version 4960 (0.0010) +[2023-10-09 04:20:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 10190848. Throughput: 0: 1697.6, 1: 1697.6. Samples: 2552594. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:20:11,053][59242] Avg episode reward: [(0, '7.850'), (1, '8.140')] +[2023-10-09 04:20:13,039][60144] Updated weights for policy 1, policy_version 5002 (0.0011) +[2023-10-09 04:20:13,405][60144] Updated weights for policy 1, policy_version 5012 (0.0008) +[2023-10-09 04:20:13,543][60143] Updated weights for policy 0, policy_version 4970 (0.0009) +[2023-10-09 04:20:13,770][60144] Updated weights for policy 1, policy_version 5022 (0.0008) +[2023-10-09 04:20:13,911][60143] Updated weights for policy 0, policy_version 4980 (0.0008) +[2023-10-09 04:20:14,284][60143] Updated weights for policy 0, policy_version 4990 (0.0007) +[2023-10-09 04:20:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 10256384. Throughput: 0: 1701.1, 1: 1714.5. Samples: 2573670. 
Policy #0 lag: (min: 8.0, avg: 29.7, max: 40.0) +[2023-10-09 04:20:16,053][59242] Avg episode reward: [(0, '8.010'), (1, '8.220')] +[2023-10-09 04:20:17,803][60144] Updated weights for policy 1, policy_version 5032 (0.0007) +[2023-10-09 04:20:18,168][60144] Updated weights for policy 1, policy_version 5042 (0.0009) +[2023-10-09 04:20:18,172][60143] Updated weights for policy 0, policy_version 5000 (0.0008) +[2023-10-09 04:20:18,543][60144] Updated weights for policy 1, policy_version 5052 (0.0009) +[2023-10-09 04:20:18,548][60143] Updated weights for policy 0, policy_version 5010 (0.0008) +[2023-10-09 04:20:18,921][60143] Updated weights for policy 0, policy_version 5020 (0.0009) +[2023-10-09 04:20:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 10321920. Throughput: 0: 1718.3, 1: 1692.7. Samples: 2583826. Policy #0 lag: (min: 8.0, avg: 29.7, max: 40.0) +[2023-10-09 04:20:21,053][59242] Avg episode reward: [(0, '8.520'), (1, '8.240')] +[2023-10-09 04:20:21,055][59934] Saving new best policy, reward=8.520! +[2023-10-09 04:20:22,398][60144] Updated weights for policy 1, policy_version 5062 (0.0008) +[2023-10-09 04:20:22,759][60144] Updated weights for policy 1, policy_version 5072 (0.0009) +[2023-10-09 04:20:23,098][60143] Updated weights for policy 0, policy_version 5030 (0.0009) +[2023-10-09 04:20:23,127][60144] Updated weights for policy 1, policy_version 5082 (0.0008) +[2023-10-09 04:20:23,462][60143] Updated weights for policy 0, policy_version 5040 (0.0007) +[2023-10-09 04:20:23,829][60143] Updated weights for policy 0, policy_version 5050 (0.0007) +[2023-10-09 04:20:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 10387456. Throughput: 0: 1706.1, 1: 1707.2. Samples: 2604536. 
Policy #0 lag: (min: 27.0, avg: 37.7, max: 59.0) +[2023-10-09 04:20:26,053][59242] Avg episode reward: [(0, '8.450'), (1, '8.350')] +[2023-10-09 04:20:26,968][60144] Updated weights for policy 1, policy_version 5092 (0.0009) +[2023-10-09 04:20:27,363][60144] Updated weights for policy 1, policy_version 5102 (0.0009) +[2023-10-09 04:20:27,733][60144] Updated weights for policy 1, policy_version 5112 (0.0008) +[2023-10-09 04:20:27,777][60143] Updated weights for policy 0, policy_version 5060 (0.0008) +[2023-10-09 04:20:28,150][60143] Updated weights for policy 0, policy_version 5070 (0.0008) +[2023-10-09 04:20:28,526][60143] Updated weights for policy 0, policy_version 5080 (0.0010) +[2023-10-09 04:20:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 10452992. Throughput: 0: 1718.4, 1: 1724.9. Samples: 2625610. Policy #0 lag: (min: 27.0, avg: 37.7, max: 59.0) +[2023-10-09 04:20:31,052][59242] Avg episode reward: [(0, '8.260'), (1, '8.320')] +[2023-10-09 04:20:31,780][60144] Updated weights for policy 1, policy_version 5122 (0.0009) +[2023-10-09 04:20:32,155][60144] Updated weights for policy 1, policy_version 5132 (0.0009) +[2023-10-09 04:20:32,484][60143] Updated weights for policy 0, policy_version 5090 (0.0009) +[2023-10-09 04:20:32,526][60144] Updated weights for policy 1, policy_version 5142 (0.0007) +[2023-10-09 04:20:32,866][60143] Updated weights for policy 0, policy_version 5100 (0.0008) +[2023-10-09 04:20:32,895][60144] Updated weights for policy 1, policy_version 5152 (0.0007) +[2023-10-09 04:20:33,244][60143] Updated weights for policy 0, policy_version 5110 (0.0010) +[2023-10-09 04:20:33,614][60143] Updated weights for policy 0, policy_version 5120 (0.0010) +[2023-10-09 04:20:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 10518528. Throughput: 0: 1710.0, 1: 1701.0. Samples: 2635016. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:20:36,052][59242] Avg episode reward: [(0, '8.290'), (1, '8.760')] +[2023-10-09 04:20:36,786][60144] Updated weights for policy 1, policy_version 5162 (0.0009) +[2023-10-09 04:20:37,164][60144] Updated weights for policy 1, policy_version 5172 (0.0007) +[2023-10-09 04:20:37,532][60144] Updated weights for policy 1, policy_version 5182 (0.0007) +[2023-10-09 04:20:37,855][60143] Updated weights for policy 0, policy_version 5130 (0.0007) +[2023-10-09 04:20:38,230][60143] Updated weights for policy 0, policy_version 5140 (0.0008) +[2023-10-09 04:20:38,599][60143] Updated weights for policy 0, policy_version 5150 (0.0009) +[2023-10-09 04:20:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 10584064. Throughput: 0: 1700.5, 1: 1726.8. Samples: 2655980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:20:41,052][59242] Avg episode reward: [(0, '8.280'), (1, '8.700')] +[2023-10-09 04:20:41,423][60144] Updated weights for policy 1, policy_version 5192 (0.0007) +[2023-10-09 04:20:41,794][60144] Updated weights for policy 1, policy_version 5202 (0.0009) +[2023-10-09 04:20:42,163][60144] Updated weights for policy 1, policy_version 5212 (0.0010) +[2023-10-09 04:20:42,493][60143] Updated weights for policy 0, policy_version 5160 (0.0008) +[2023-10-09 04:20:42,864][60143] Updated weights for policy 0, policy_version 5170 (0.0007) +[2023-10-09 04:20:43,228][60143] Updated weights for policy 0, policy_version 5180 (0.0008) +[2023-10-09 04:20:45,959][60144] Updated weights for policy 1, policy_version 5222 (0.0008) +[2023-10-09 04:20:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 10649600. Throughput: 0: 1727.8, 1: 1734.7. Samples: 2677632. 
Policy #0 lag: (min: 26.0, avg: 30.7, max: 58.0) +[2023-10-09 04:20:46,053][59242] Avg episode reward: [(0, '8.850'), (1, '9.110')] +[2023-10-09 04:20:46,060][59934] Saving new best policy, reward=8.850! +[2023-10-09 04:20:46,329][60144] Updated weights for policy 1, policy_version 5232 (0.0009) +[2023-10-09 04:20:46,701][60144] Updated weights for policy 1, policy_version 5242 (0.0007) +[2023-10-09 04:20:46,916][60003] Saving new best policy, reward=9.110! +[2023-10-09 04:20:47,138][60143] Updated weights for policy 0, policy_version 5190 (0.0009) +[2023-10-09 04:20:47,504][60143] Updated weights for policy 0, policy_version 5200 (0.0011) +[2023-10-09 04:20:47,876][60143] Updated weights for policy 0, policy_version 5210 (0.0010) +[2023-10-09 04:20:50,640][60144] Updated weights for policy 1, policy_version 5252 (0.0008) +[2023-10-09 04:20:51,014][60144] Updated weights for policy 1, policy_version 5262 (0.0010) +[2023-10-09 04:20:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 10715136. Throughput: 0: 1693.2, 1: 1719.8. Samples: 2686902. Policy #0 lag: (min: 26.0, avg: 30.7, max: 58.0) +[2023-10-09 04:20:51,052][59242] Avg episode reward: [(0, '9.030'), (1, '8.820')] +[2023-10-09 04:20:51,053][59934] Saving new best policy, reward=9.030! +[2023-10-09 04:20:51,385][60144] Updated weights for policy 1, policy_version 5272 (0.0009) +[2023-10-09 04:20:51,739][60143] Updated weights for policy 0, policy_version 5220 (0.0010) +[2023-10-09 04:20:52,118][60143] Updated weights for policy 0, policy_version 5230 (0.0008) +[2023-10-09 04:20:52,483][60143] Updated weights for policy 0, policy_version 5240 (0.0009) +[2023-10-09 04:20:55,386][60144] Updated weights for policy 1, policy_version 5282 (0.0009) +[2023-10-09 04:20:55,759][60144] Updated weights for policy 1, policy_version 5292 (0.0009) +[2023-10-09 04:20:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 10780672. 
Throughput: 0: 1719.1, 1: 1741.1. Samples: 2708306. Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-09 04:20:56,053][59242] Avg episode reward: [(0, '8.880'), (1, '8.550')] +[2023-10-09 04:20:56,117][60144] Updated weights for policy 1, policy_version 5302 (0.0008) +[2023-10-09 04:20:56,440][60143] Updated weights for policy 0, policy_version 5250 (0.0008) +[2023-10-09 04:20:56,483][60144] Updated weights for policy 1, policy_version 5312 (0.0009) +[2023-10-09 04:20:56,811][60143] Updated weights for policy 0, policy_version 5260 (0.0009) +[2023-10-09 04:20:57,182][60143] Updated weights for policy 0, policy_version 5270 (0.0008) +[2023-10-09 04:20:57,561][60143] Updated weights for policy 0, policy_version 5280 (0.0007) +[2023-10-09 04:21:00,442][60144] Updated weights for policy 1, policy_version 5322 (0.0008) +[2023-10-09 04:21:00,816][60144] Updated weights for policy 1, policy_version 5332 (0.0007) +[2023-10-09 04:21:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 10846208. Throughput: 0: 1720.3, 1: 1730.4. Samples: 2728950. Policy #0 lag: (min: 1.0, avg: 12.6, max: 33.0) +[2023-10-09 04:21:01,053][59242] Avg episode reward: [(0, '8.690'), (1, '8.720')] +[2023-10-09 04:21:01,178][60144] Updated weights for policy 1, policy_version 5342 (0.0009) +[2023-10-09 04:21:01,457][60143] Updated weights for policy 0, policy_version 5290 (0.0008) +[2023-10-09 04:21:01,831][60143] Updated weights for policy 0, policy_version 5300 (0.0008) +[2023-10-09 04:21:02,202][60143] Updated weights for policy 0, policy_version 5310 (0.0009) +[2023-10-09 04:21:05,258][60144] Updated weights for policy 1, policy_version 5352 (0.0008) +[2023-10-09 04:21:05,626][60144] Updated weights for policy 1, policy_version 5362 (0.0009) +[2023-10-09 04:21:06,001][60144] Updated weights for policy 1, policy_version 5372 (0.0009) +[2023-10-09 04:21:06,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13551.5). 
Total num frames: 10911744. Throughput: 0: 1705.9, 1: 1737.8. Samples: 2738790. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:21:06,052][59242] Avg episode reward: [(0, '8.620'), (1, '8.770')] +[2023-10-09 04:21:06,144][60143] Updated weights for policy 0, policy_version 5320 (0.0007) +[2023-10-09 04:21:06,523][60143] Updated weights for policy 0, policy_version 5330 (0.0008) +[2023-10-09 04:21:06,895][60143] Updated weights for policy 0, policy_version 5340 (0.0008) +[2023-10-09 04:21:09,816][60144] Updated weights for policy 1, policy_version 5382 (0.0009) +[2023-10-09 04:21:10,181][60144] Updated weights for policy 1, policy_version 5392 (0.0009) +[2023-10-09 04:21:10,556][60144] Updated weights for policy 1, policy_version 5402 (0.0010) +[2023-10-09 04:21:10,858][60143] Updated weights for policy 0, policy_version 5350 (0.0009) +[2023-10-09 04:21:11,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 11010048. Throughput: 0: 1719.2, 1: 1742.0. Samples: 2760290. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:21:11,053][59242] Avg episode reward: [(0, '9.410'), (1, '8.660')] +[2023-10-09 04:21:11,233][60143] Updated weights for policy 0, policy_version 5360 (0.0007) +[2023-10-09 04:21:11,599][60143] Updated weights for policy 0, policy_version 5370 (0.0008) +[2023-10-09 04:21:11,818][59934] Saving new best policy, reward=9.410! +[2023-10-09 04:21:14,537][60144] Updated weights for policy 1, policy_version 5412 (0.0009) +[2023-10-09 04:21:14,935][60144] Updated weights for policy 1, policy_version 5422 (0.0009) +[2023-10-09 04:21:15,303][60144] Updated weights for policy 1, policy_version 5432 (0.0010) +[2023-10-09 04:21:15,452][60143] Updated weights for policy 0, policy_version 5380 (0.0009) +[2023-10-09 04:21:15,814][60143] Updated weights for policy 0, policy_version 5390 (0.0009) +[2023-10-09 04:21:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). 
Total num frames: 11075584. Throughput: 0: 1722.6, 1: 1716.1. Samples: 2780352. Policy #0 lag: (min: 4.0, avg: 15.1, max: 36.0) +[2023-10-09 04:21:16,052][59242] Avg episode reward: [(0, '9.260'), (1, '8.750')] +[2023-10-09 04:21:16,187][60143] Updated weights for policy 0, policy_version 5400 (0.0010) +[2023-10-09 04:21:19,199][60144] Updated weights for policy 1, policy_version 5442 (0.0008) +[2023-10-09 04:21:19,570][60144] Updated weights for policy 1, policy_version 5452 (0.0009) +[2023-10-09 04:21:19,936][60144] Updated weights for policy 1, policy_version 5462 (0.0010) +[2023-10-09 04:21:20,304][60144] Updated weights for policy 1, policy_version 5472 (0.0009) +[2023-10-09 04:21:20,379][60143] Updated weights for policy 0, policy_version 5410 (0.0011) +[2023-10-09 04:21:20,785][60143] Updated weights for policy 0, policy_version 5420 (0.0010) +[2023-10-09 04:21:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 11141120. Throughput: 0: 1723.0, 1: 1744.2. Samples: 2791040. Policy #0 lag: (min: 4.0, avg: 15.1, max: 36.0) +[2023-10-09 04:21:21,053][59242] Avg episode reward: [(0, '9.520'), (1, '8.700')] +[2023-10-09 04:21:21,162][60143] Updated weights for policy 0, policy_version 5430 (0.0007) +[2023-10-09 04:21:21,532][59934] Saving new best policy, reward=9.520! +[2023-10-09 04:21:21,534][60143] Updated weights for policy 0, policy_version 5440 (0.0008) +[2023-10-09 04:21:24,180][60144] Updated weights for policy 1, policy_version 5482 (0.0008) +[2023-10-09 04:21:24,552][60144] Updated weights for policy 1, policy_version 5492 (0.0007) +[2023-10-09 04:21:24,922][60144] Updated weights for policy 1, policy_version 5502 (0.0008) +[2023-10-09 04:21:25,462][60143] Updated weights for policy 0, policy_version 5450 (0.0008) +[2023-10-09 04:21:25,845][60143] Updated weights for policy 0, policy_version 5460 (0.0008) +[2023-10-09 04:21:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). 
Total num frames: 11206656. Throughput: 0: 1726.5, 1: 1728.3. Samples: 2811446. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 04:21:26,052][59242] Avg episode reward: [(0, '9.090'), (1, '9.310')] +[2023-10-09 04:21:26,053][60003] Saving new best policy, reward=9.310! +[2023-10-09 04:21:26,210][60143] Updated weights for policy 0, policy_version 5470 (0.0007) +[2023-10-09 04:21:28,877][60144] Updated weights for policy 1, policy_version 5512 (0.0008) +[2023-10-09 04:21:29,251][60144] Updated weights for policy 1, policy_version 5522 (0.0009) +[2023-10-09 04:21:29,613][60144] Updated weights for policy 1, policy_version 5532 (0.0010) +[2023-10-09 04:21:30,314][60143] Updated weights for policy 0, policy_version 5480 (0.0010) +[2023-10-09 04:21:30,690][60143] Updated weights for policy 0, policy_version 5490 (0.0008) +[2023-10-09 04:21:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 11272192. Throughput: 0: 1714.9, 1: 1711.8. Samples: 2831836. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 04:21:31,053][59242] Avg episode reward: [(0, '8.740'), (1, '9.350')] +[2023-10-09 04:21:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000005536_5668864.pth... +[2023-10-09 04:21:31,069][60143] Updated weights for policy 0, policy_version 5500 (0.0008) +[2023-10-09 04:21:31,093][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000003936_4030464.pth +[2023-10-09 04:21:31,098][60003] Saving new best policy, reward=9.350! +[2023-10-09 04:21:31,213][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000005504_5636096.pth... 
+[2023-10-09 04:21:31,242][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000003904_3997696.pth +[2023-10-09 04:21:33,302][60144] Updated weights for policy 1, policy_version 5542 (0.0009) +[2023-10-09 04:21:33,671][60144] Updated weights for policy 1, policy_version 5552 (0.0008) +[2023-10-09 04:21:34,045][60144] Updated weights for policy 1, policy_version 5562 (0.0010) +[2023-10-09 04:21:34,948][60143] Updated weights for policy 0, policy_version 5510 (0.0010) +[2023-10-09 04:21:35,316][60143] Updated weights for policy 0, policy_version 5520 (0.0007) +[2023-10-09 04:21:35,686][60143] Updated weights for policy 0, policy_version 5530 (0.0008) +[2023-10-09 04:21:36,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 11370496. Throughput: 0: 1728.1, 1: 1731.6. Samples: 2842590. Policy #0 lag: (min: 17.0, avg: 28.9, max: 49.0) +[2023-10-09 04:21:36,053][59242] Avg episode reward: [(0, '8.840'), (1, '9.150')] +[2023-10-09 04:21:37,963][60144] Updated weights for policy 1, policy_version 5572 (0.0008) +[2023-10-09 04:21:38,318][60144] Updated weights for policy 1, policy_version 5582 (0.0008) +[2023-10-09 04:21:38,691][60144] Updated weights for policy 1, policy_version 5592 (0.0007) +[2023-10-09 04:21:39,665][60143] Updated weights for policy 0, policy_version 5540 (0.0008) +[2023-10-09 04:21:40,044][60143] Updated weights for policy 0, policy_version 5550 (0.0008) +[2023-10-09 04:21:40,414][60143] Updated weights for policy 0, policy_version 5560 (0.0009) +[2023-10-09 04:21:41,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 11436032. Throughput: 0: 1728.9, 1: 1714.5. Samples: 2863258. Policy #0 lag: (min: 17.0, avg: 29.9, max: 49.0) +[2023-10-09 04:21:41,053][59242] Avg episode reward: [(0, '9.080'), (1, '9.370')] +[2023-10-09 04:21:41,055][60003] Saving new best policy, reward=9.370! 
+[2023-10-09 04:21:42,615][60144] Updated weights for policy 1, policy_version 5602 (0.0009) +[2023-10-09 04:21:42,982][60144] Updated weights for policy 1, policy_version 5612 (0.0008) +[2023-10-09 04:21:43,340][60144] Updated weights for policy 1, policy_version 5622 (0.0007) +[2023-10-09 04:21:43,708][60144] Updated weights for policy 1, policy_version 5632 (0.0009) +[2023-10-09 04:21:44,146][60143] Updated weights for policy 0, policy_version 5570 (0.0007) +[2023-10-09 04:21:44,511][60143] Updated weights for policy 0, policy_version 5580 (0.0008) +[2023-10-09 04:21:44,893][60143] Updated weights for policy 0, policy_version 5590 (0.0010) +[2023-10-09 04:21:45,260][60143] Updated weights for policy 0, policy_version 5600 (0.0009) +[2023-10-09 04:21:46,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 11501568. Throughput: 0: 1697.2, 1: 1735.0. Samples: 2883400. Policy #0 lag: (min: 17.0, avg: 29.9, max: 49.0) +[2023-10-09 04:21:46,053][59242] Avg episode reward: [(0, '8.370'), (1, '9.470')] +[2023-10-09 04:21:46,064][60003] Saving new best policy, reward=9.470! +[2023-10-09 04:21:47,603][60144] Updated weights for policy 1, policy_version 5642 (0.0010) +[2023-10-09 04:21:47,979][60144] Updated weights for policy 1, policy_version 5652 (0.0008) +[2023-10-09 04:21:48,345][60144] Updated weights for policy 1, policy_version 5662 (0.0008) +[2023-10-09 04:21:49,272][60143] Updated weights for policy 0, policy_version 5610 (0.0008) +[2023-10-09 04:21:49,649][60143] Updated weights for policy 0, policy_version 5620 (0.0008) +[2023-10-09 04:21:50,020][60143] Updated weights for policy 0, policy_version 5630 (0.0007) +[2023-10-09 04:21:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 11567104. Throughput: 0: 1726.2, 1: 1724.5. Samples: 2894070. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:21:51,053][59242] Avg episode reward: [(0, '8.630'), (1, '9.720')] +[2023-10-09 04:21:51,054][60003] Saving new best policy, reward=9.720! +[2023-10-09 04:21:52,432][60144] Updated weights for policy 1, policy_version 5672 (0.0008) +[2023-10-09 04:21:52,796][60144] Updated weights for policy 1, policy_version 5682 (0.0008) +[2023-10-09 04:21:53,165][60144] Updated weights for policy 1, policy_version 5692 (0.0009) +[2023-10-09 04:21:54,071][60143] Updated weights for policy 0, policy_version 5640 (0.0009) +[2023-10-09 04:21:54,441][60143] Updated weights for policy 0, policy_version 5650 (0.0008) +[2023-10-09 04:21:54,809][60143] Updated weights for policy 0, policy_version 5660 (0.0008) +[2023-10-09 04:21:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 11632640. Throughput: 0: 1706.4, 1: 1721.8. Samples: 2914558. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:21:56,053][59242] Avg episode reward: [(0, '8.190'), (1, '9.640')] +[2023-10-09 04:21:56,959][60144] Updated weights for policy 1, policy_version 5702 (0.0009) +[2023-10-09 04:21:57,323][60144] Updated weights for policy 1, policy_version 5712 (0.0011) +[2023-10-09 04:21:57,694][60144] Updated weights for policy 1, policy_version 5722 (0.0009) +[2023-10-09 04:21:58,751][60143] Updated weights for policy 0, policy_version 5670 (0.0009) +[2023-10-09 04:21:59,124][60143] Updated weights for policy 0, policy_version 5680 (0.0010) +[2023-10-09 04:21:59,493][60143] Updated weights for policy 0, policy_version 5690 (0.0010) +[2023-10-09 04:22:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 11698176. Throughput: 0: 1692.4, 1: 1752.6. Samples: 2935374. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-09 04:22:01,053][59242] Avg episode reward: [(0, '8.570'), (1, '9.520')] +[2023-10-09 04:22:01,601][60144] Updated weights for policy 1, policy_version 5732 (0.0008) +[2023-10-09 04:22:01,994][60144] Updated weights for policy 1, policy_version 5742 (0.0010) +[2023-10-09 04:22:02,360][60144] Updated weights for policy 1, policy_version 5752 (0.0007) +[2023-10-09 04:22:03,546][60143] Updated weights for policy 0, policy_version 5700 (0.0008) +[2023-10-09 04:22:03,916][60143] Updated weights for policy 0, policy_version 5710 (0.0009) +[2023-10-09 04:22:04,286][60143] Updated weights for policy 0, policy_version 5720 (0.0007) +[2023-10-09 04:22:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 11763712. Throughput: 0: 1718.6, 1: 1720.3. Samples: 2945790. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-09 04:22:06,053][59242] Avg episode reward: [(0, '8.530'), (1, '9.950')] +[2023-10-09 04:22:06,055][60003] Saving new best policy, reward=9.950! +[2023-10-09 04:22:06,302][60144] Updated weights for policy 1, policy_version 5762 (0.0008) +[2023-10-09 04:22:06,671][60144] Updated weights for policy 1, policy_version 5772 (0.0007) +[2023-10-09 04:22:07,040][60144] Updated weights for policy 1, policy_version 5782 (0.0007) +[2023-10-09 04:22:07,412][60144] Updated weights for policy 1, policy_version 5792 (0.0007) +[2023-10-09 04:22:08,530][60143] Updated weights for policy 0, policy_version 5730 (0.0009) +[2023-10-09 04:22:09,015][60143] Updated weights for policy 0, policy_version 5742 (0.0009) +[2023-10-09 04:22:09,384][60143] Updated weights for policy 0, policy_version 5752 (0.0009) +[2023-10-09 04:22:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 11829248. Throughput: 0: 1695.7, 1: 1741.2. Samples: 2966110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:22:11,053][59242] Avg episode reward: [(0, '9.010'), (1, '9.820')] +[2023-10-09 04:22:11,569][60144] Updated weights for policy 1, policy_version 5802 (0.0007) +[2023-10-09 04:22:11,933][60144] Updated weights for policy 1, policy_version 5812 (0.0007) +[2023-10-09 04:22:12,309][60144] Updated weights for policy 1, policy_version 5822 (0.0007) +[2023-10-09 04:22:13,433][60143] Updated weights for policy 0, policy_version 5762 (0.0010) +[2023-10-09 04:22:13,802][60143] Updated weights for policy 0, policy_version 5772 (0.0011) +[2023-10-09 04:22:14,169][60143] Updated weights for policy 0, policy_version 5782 (0.0010) +[2023-10-09 04:22:14,536][60143] Updated weights for policy 0, policy_version 5792 (0.0010) +[2023-10-09 04:22:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 11894784. Throughput: 0: 1693.3, 1: 1751.3. Samples: 2986844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:22:16,053][59242] Avg episode reward: [(0, '9.060'), (1, '10.110')] +[2023-10-09 04:22:16,190][60144] Updated weights for policy 1, policy_version 5832 (0.0009) +[2023-10-09 04:22:16,560][60144] Updated weights for policy 1, policy_version 5842 (0.0008) +[2023-10-09 04:22:16,933][60144] Updated weights for policy 1, policy_version 5852 (0.0011) +[2023-10-09 04:22:17,082][60003] Saving new best policy, reward=10.110! +[2023-10-09 04:22:18,594][60143] Updated weights for policy 0, policy_version 5802 (0.0010) +[2023-10-09 04:22:18,964][60143] Updated weights for policy 0, policy_version 5812 (0.0008) +[2023-10-09 04:22:19,332][60143] Updated weights for policy 0, policy_version 5822 (0.0007) +[2023-10-09 04:22:20,886][60144] Updated weights for policy 1, policy_version 5862 (0.0008) +[2023-10-09 04:22:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 11960320. Throughput: 0: 1703.2, 1: 1731.4. Samples: 2997148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:22:21,053][59242] Avg episode reward: [(0, '9.080'), (1, '9.850')] +[2023-10-09 04:22:21,256][60144] Updated weights for policy 1, policy_version 5872 (0.0007) +[2023-10-09 04:22:21,624][60144] Updated weights for policy 1, policy_version 5882 (0.0007) +[2023-10-09 04:22:23,272][60143] Updated weights for policy 0, policy_version 5832 (0.0009) +[2023-10-09 04:22:23,647][60143] Updated weights for policy 0, policy_version 5842 (0.0007) +[2023-10-09 04:22:24,009][60143] Updated weights for policy 0, policy_version 5852 (0.0008) +[2023-10-09 04:22:25,476][60144] Updated weights for policy 1, policy_version 5892 (0.0007) +[2023-10-09 04:22:25,840][60144] Updated weights for policy 1, policy_version 5902 (0.0009) +[2023-10-09 04:22:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 12025856. Throughput: 0: 1680.7, 1: 1746.6. Samples: 3017486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:22:26,053][59242] Avg episode reward: [(0, '9.070'), (1, '9.900')] +[2023-10-09 04:22:26,215][60144] Updated weights for policy 1, policy_version 5912 (0.0010) +[2023-10-09 04:22:27,710][60143] Updated weights for policy 0, policy_version 5862 (0.0010) +[2023-10-09 04:22:28,079][60143] Updated weights for policy 0, policy_version 5872 (0.0008) +[2023-10-09 04:22:28,444][60143] Updated weights for policy 0, policy_version 5882 (0.0010) +[2023-10-09 04:22:30,027][60144] Updated weights for policy 1, policy_version 5922 (0.0011) +[2023-10-09 04:22:30,390][60144] Updated weights for policy 1, policy_version 5932 (0.0010) +[2023-10-09 04:22:30,755][60144] Updated weights for policy 1, policy_version 5942 (0.0007) +[2023-10-09 04:22:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 12091392. Throughput: 0: 1710.4, 1: 1729.3. Samples: 3038186. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-09 04:22:31,052][59242] Avg episode reward: [(0, '9.480'), (1, '9.880')] +[2023-10-09 04:22:31,132][60144] Updated weights for policy 1, policy_version 5952 (0.0007) +[2023-10-09 04:22:32,469][60143] Updated weights for policy 0, policy_version 5892 (0.0011) +[2023-10-09 04:22:32,844][60143] Updated weights for policy 0, policy_version 5902 (0.0007) +[2023-10-09 04:22:33,218][60143] Updated weights for policy 0, policy_version 5912 (0.0009) +[2023-10-09 04:22:34,971][60144] Updated weights for policy 1, policy_version 5962 (0.0007) +[2023-10-09 04:22:35,337][60144] Updated weights for policy 1, policy_version 5972 (0.0009) +[2023-10-09 04:22:35,698][60144] Updated weights for policy 1, policy_version 5982 (0.0010) +[2023-10-09 04:22:36,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 12189696. Throughput: 0: 1682.9, 1: 1746.8. Samples: 3048408. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-09 04:22:36,053][59242] Avg episode reward: [(0, '9.930'), (1, '9.370')] +[2023-10-09 04:22:36,054][59934] Saving new best policy, reward=9.930! +[2023-10-09 04:22:37,059][60143] Updated weights for policy 0, policy_version 5922 (0.0009) +[2023-10-09 04:22:37,434][60143] Updated weights for policy 0, policy_version 5932 (0.0009) +[2023-10-09 04:22:37,799][60143] Updated weights for policy 0, policy_version 5942 (0.0009) +[2023-10-09 04:22:38,172][60143] Updated weights for policy 0, policy_version 5952 (0.0009) +[2023-10-09 04:22:39,663][60144] Updated weights for policy 1, policy_version 5992 (0.0007) +[2023-10-09 04:22:40,037][60144] Updated weights for policy 1, policy_version 6002 (0.0009) +[2023-10-09 04:22:40,405][60144] Updated weights for policy 1, policy_version 6012 (0.0009) +[2023-10-09 04:22:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 12255232. Throughput: 0: 1702.1, 1: 1740.8. Samples: 3069488. 
Policy #0 lag: (min: 17.0, avg: 25.2, max: 49.0) +[2023-10-09 04:22:41,053][59242] Avg episode reward: [(0, '9.840'), (1, '9.830')] +[2023-10-09 04:22:42,148][60143] Updated weights for policy 0, policy_version 5962 (0.0007) +[2023-10-09 04:22:42,517][60143] Updated weights for policy 0, policy_version 5972 (0.0007) +[2023-10-09 04:22:42,891][60143] Updated weights for policy 0, policy_version 5982 (0.0008) +[2023-10-09 04:22:44,209][60144] Updated weights for policy 1, policy_version 6022 (0.0009) +[2023-10-09 04:22:44,574][60144] Updated weights for policy 1, policy_version 6032 (0.0009) +[2023-10-09 04:22:44,946][60144] Updated weights for policy 1, policy_version 6042 (0.0007) +[2023-10-09 04:22:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 12320768. Throughput: 0: 1716.7, 1: 1718.2. Samples: 3089942. Policy #0 lag: (min: 17.0, avg: 25.2, max: 49.0) +[2023-10-09 04:22:46,053][59242] Avg episode reward: [(0, '9.700'), (1, '9.110')] +[2023-10-09 04:22:46,892][60143] Updated weights for policy 0, policy_version 5992 (0.0009) +[2023-10-09 04:22:47,266][60143] Updated weights for policy 0, policy_version 6002 (0.0008) +[2023-10-09 04:22:47,630][60143] Updated weights for policy 0, policy_version 6012 (0.0007) +[2023-10-09 04:22:49,086][60144] Updated weights for policy 1, policy_version 6052 (0.0009) +[2023-10-09 04:22:49,487][60144] Updated weights for policy 1, policy_version 6062 (0.0008) +[2023-10-09 04:22:49,865][60144] Updated weights for policy 1, policy_version 6072 (0.0009) +[2023-10-09 04:22:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 12386304. Throughput: 0: 1683.6, 1: 1750.1. Samples: 3100306. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 04:22:51,053][59242] Avg episode reward: [(0, '9.860'), (1, '8.800')] +[2023-10-09 04:22:51,561][60143] Updated weights for policy 0, policy_version 6022 (0.0009) +[2023-10-09 04:22:51,934][60143] Updated weights for policy 0, policy_version 6032 (0.0011) +[2023-10-09 04:22:52,308][60143] Updated weights for policy 0, policy_version 6042 (0.0011) +[2023-10-09 04:22:53,614][60144] Updated weights for policy 1, policy_version 6082 (0.0009) +[2023-10-09 04:22:53,991][60144] Updated weights for policy 1, policy_version 6092 (0.0009) +[2023-10-09 04:22:54,356][60144] Updated weights for policy 1, policy_version 6102 (0.0007) +[2023-10-09 04:22:54,728][60144] Updated weights for policy 1, policy_version 6112 (0.0008) +[2023-10-09 04:22:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 12451840. Throughput: 0: 1713.1, 1: 1718.3. Samples: 3120524. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 04:22:56,053][59242] Avg episode reward: [(0, '10.010'), (1, '9.060')] +[2023-10-09 04:22:56,303][60143] Updated weights for policy 0, policy_version 6052 (0.0010) +[2023-10-09 04:22:56,695][60143] Updated weights for policy 0, policy_version 6062 (0.0007) +[2023-10-09 04:22:57,067][60143] Updated weights for policy 0, policy_version 6072 (0.0010) +[2023-10-09 04:22:57,355][59934] Saving new best policy, reward=10.010! +[2023-10-09 04:22:58,688][60144] Updated weights for policy 1, policy_version 6122 (0.0011) +[2023-10-09 04:22:59,062][60144] Updated weights for policy 1, policy_version 6132 (0.0010) +[2023-10-09 04:22:59,422][60144] Updated weights for policy 1, policy_version 6142 (0.0008) +[2023-10-09 04:23:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 12517376. Throughput: 0: 1716.6, 1: 1717.9. Samples: 3141394. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:23:01,053][59242] Avg episode reward: [(0, '10.220'), (1, '9.830')] +[2023-10-09 04:23:01,199][60143] Updated weights for policy 0, policy_version 6082 (0.0009) +[2023-10-09 04:23:01,577][60143] Updated weights for policy 0, policy_version 6092 (0.0007) +[2023-10-09 04:23:01,947][60143] Updated weights for policy 0, policy_version 6102 (0.0009) +[2023-10-09 04:23:02,310][59934] Saving new best policy, reward=10.220! +[2023-10-09 04:23:02,310][60143] Updated weights for policy 0, policy_version 6112 (0.0010) +[2023-10-09 04:23:03,191][60144] Updated weights for policy 1, policy_version 6152 (0.0008) +[2023-10-09 04:23:03,567][60144] Updated weights for policy 1, policy_version 6162 (0.0010) +[2023-10-09 04:23:03,928][60144] Updated weights for policy 1, policy_version 6172 (0.0009) +[2023-10-09 04:23:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 12582912. Throughput: 0: 1696.1, 1: 1734.4. Samples: 3151522. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:23:06,052][59242] Avg episode reward: [(0, '10.570'), (1, '9.930')] +[2023-10-09 04:23:06,053][59934] Saving new best policy, reward=10.570! +[2023-10-09 04:23:06,556][60143] Updated weights for policy 0, policy_version 6122 (0.0007) +[2023-10-09 04:23:06,929][60143] Updated weights for policy 0, policy_version 6132 (0.0007) +[2023-10-09 04:23:07,302][60143] Updated weights for policy 0, policy_version 6142 (0.0007) +[2023-10-09 04:23:07,791][60144] Updated weights for policy 1, policy_version 6182 (0.0009) +[2023-10-09 04:23:08,162][60144] Updated weights for policy 1, policy_version 6192 (0.0007) +[2023-10-09 04:23:08,522][60144] Updated weights for policy 1, policy_version 6202 (0.0007) +[2023-10-09 04:23:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 12648448. Throughput: 0: 1714.0, 1: 1723.7. Samples: 3172182. 
Policy #0 lag: (min: 10.0, avg: 15.3, max: 42.0) +[2023-10-09 04:23:11,053][59242] Avg episode reward: [(0, '10.140'), (1, '9.830')] +[2023-10-09 04:23:11,473][60143] Updated weights for policy 0, policy_version 6152 (0.0009) +[2023-10-09 04:23:11,849][60143] Updated weights for policy 0, policy_version 6162 (0.0009) +[2023-10-09 04:23:12,211][60143] Updated weights for policy 0, policy_version 6172 (0.0009) +[2023-10-09 04:23:12,670][60144] Updated weights for policy 1, policy_version 6212 (0.0007) +[2023-10-09 04:23:13,045][60144] Updated weights for policy 1, policy_version 6222 (0.0010) +[2023-10-09 04:23:13,427][60144] Updated weights for policy 1, policy_version 6232 (0.0010) +[2023-10-09 04:23:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 12713984. Throughput: 0: 1714.3, 1: 1734.1. Samples: 3193364. Policy #0 lag: (min: 10.0, avg: 15.3, max: 42.0) +[2023-10-09 04:23:16,053][59242] Avg episode reward: [(0, '9.750'), (1, '9.950')] +[2023-10-09 04:23:16,089][60143] Updated weights for policy 0, policy_version 6182 (0.0007) +[2023-10-09 04:23:16,461][60143] Updated weights for policy 0, policy_version 6192 (0.0008) +[2023-10-09 04:23:16,837][60143] Updated weights for policy 0, policy_version 6202 (0.0009) +[2023-10-09 04:23:17,446][60144] Updated weights for policy 1, policy_version 6242 (0.0009) +[2023-10-09 04:23:17,808][60144] Updated weights for policy 1, policy_version 6252 (0.0008) +[2023-10-09 04:23:18,170][60144] Updated weights for policy 1, policy_version 6262 (0.0010) +[2023-10-09 04:23:18,537][60144] Updated weights for policy 1, policy_version 6272 (0.0007) +[2023-10-09 04:23:20,756][60143] Updated weights for policy 0, policy_version 6212 (0.0008) +[2023-10-09 04:23:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 12779520. Throughput: 0: 1711.9, 1: 1720.1. Samples: 3202848. 
Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-09 04:23:21,053][59242] Avg episode reward: [(0, '9.680'), (1, '10.220')] +[2023-10-09 04:23:21,053][60003] Saving new best policy, reward=10.220! +[2023-10-09 04:23:21,127][60143] Updated weights for policy 0, policy_version 6222 (0.0009) +[2023-10-09 04:23:21,499][60143] Updated weights for policy 0, policy_version 6232 (0.0008) +[2023-10-09 04:23:22,476][60144] Updated weights for policy 1, policy_version 6282 (0.0009) +[2023-10-09 04:23:22,839][60144] Updated weights for policy 1, policy_version 6292 (0.0009) +[2023-10-09 04:23:23,206][60144] Updated weights for policy 1, policy_version 6302 (0.0007) +[2023-10-09 04:23:25,260][60143] Updated weights for policy 0, policy_version 6242 (0.0008) +[2023-10-09 04:23:25,621][60143] Updated weights for policy 0, policy_version 6252 (0.0009) +[2023-10-09 04:23:25,994][60143] Updated weights for policy 0, policy_version 6262 (0.0009) +[2023-10-09 04:23:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 12845056. Throughput: 0: 1711.6, 1: 1726.8. Samples: 3224214. Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-09 04:23:26,052][59242] Avg episode reward: [(0, '9.940'), (1, '10.490')] +[2023-10-09 04:23:26,053][60003] Saving new best policy, reward=10.490! 
+[2023-10-09 04:23:26,370][60143] Updated weights for policy 0, policy_version 6272 (0.0007) +[2023-10-09 04:23:27,178][60144] Updated weights for policy 1, policy_version 6312 (0.0008) +[2023-10-09 04:23:27,551][60144] Updated weights for policy 1, policy_version 6322 (0.0008) +[2023-10-09 04:23:27,921][60144] Updated weights for policy 1, policy_version 6332 (0.0009) +[2023-10-09 04:23:30,258][60143] Updated weights for policy 0, policy_version 6282 (0.0009) +[2023-10-09 04:23:30,625][60143] Updated weights for policy 0, policy_version 6292 (0.0009) +[2023-10-09 04:23:30,995][60143] Updated weights for policy 0, policy_version 6302 (0.0009) +[2023-10-09 04:23:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 12910592. Throughput: 0: 1698.3, 1: 1742.8. Samples: 3244792. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-09 04:23:31,052][59242] Avg episode reward: [(0, '10.080'), (1, '11.280')] +[2023-10-09 04:23:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000006336_6488064.pth... +[2023-10-09 04:23:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000006304_6455296.pth... +[2023-10-09 04:23:31,104][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000004736_4849664.pth +[2023-10-09 04:23:31,106][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000004704_4816896.pth +[2023-10-09 04:23:31,109][60003] Saving new best policy, reward=11.280! 
+[2023-10-09 04:23:31,790][60144] Updated weights for policy 1, policy_version 6342 (0.0008) +[2023-10-09 04:23:32,164][60144] Updated weights for policy 1, policy_version 6352 (0.0009) +[2023-10-09 04:23:32,534][60144] Updated weights for policy 1, policy_version 6362 (0.0009) +[2023-10-09 04:23:34,913][60143] Updated weights for policy 0, policy_version 6312 (0.0007) +[2023-10-09 04:23:35,286][60143] Updated weights for policy 0, policy_version 6322 (0.0008) +[2023-10-09 04:23:35,651][60143] Updated weights for policy 0, policy_version 6332 (0.0010) +[2023-10-09 04:23:36,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13008896. Throughput: 0: 1722.9, 1: 1712.5. Samples: 3254900. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:23:36,053][59242] Avg episode reward: [(0, '10.040'), (1, '10.770')] +[2023-10-09 04:23:36,645][60144] Updated weights for policy 1, policy_version 6372 (0.0009) +[2023-10-09 04:23:37,021][60144] Updated weights for policy 1, policy_version 6382 (0.0009) +[2023-10-09 04:23:37,398][60144] Updated weights for policy 1, policy_version 6392 (0.0007) +[2023-10-09 04:23:39,718][60143] Updated weights for policy 0, policy_version 6342 (0.0010) +[2023-10-09 04:23:40,078][60143] Updated weights for policy 0, policy_version 6352 (0.0010) +[2023-10-09 04:23:40,454][60143] Updated weights for policy 0, policy_version 6362 (0.0010) +[2023-10-09 04:23:41,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13074432. Throughput: 0: 1718.6, 1: 1732.0. Samples: 3275804. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:23:41,053][59242] Avg episode reward: [(0, '9.880'), (1, '10.360')] +[2023-10-09 04:23:41,314][60144] Updated weights for policy 1, policy_version 6402 (0.0007) +[2023-10-09 04:23:41,679][60144] Updated weights for policy 1, policy_version 6412 (0.0007) +[2023-10-09 04:23:42,046][60144] Updated weights for policy 1, policy_version 6422 (0.0008) +[2023-10-09 04:23:42,412][60144] Updated weights for policy 1, policy_version 6432 (0.0008) +[2023-10-09 04:23:44,532][60143] Updated weights for policy 0, policy_version 6372 (0.0009) +[2023-10-09 04:23:44,915][60143] Updated weights for policy 0, policy_version 6382 (0.0009) +[2023-10-09 04:23:45,285][60143] Updated weights for policy 0, policy_version 6392 (0.0008) +[2023-10-09 04:23:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13139968. Throughput: 0: 1693.1, 1: 1735.0. Samples: 3295660. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) +[2023-10-09 04:23:46,053][59242] Avg episode reward: [(0, '9.670'), (1, '10.220')] +[2023-10-09 04:23:46,263][60144] Updated weights for policy 1, policy_version 6442 (0.0008) +[2023-10-09 04:23:46,620][60144] Updated weights for policy 1, policy_version 6452 (0.0007) +[2023-10-09 04:23:46,990][60144] Updated weights for policy 1, policy_version 6462 (0.0007) +[2023-10-09 04:23:49,366][60143] Updated weights for policy 0, policy_version 6402 (0.0009) +[2023-10-09 04:23:49,743][60143] Updated weights for policy 0, policy_version 6412 (0.0008) +[2023-10-09 04:23:50,122][60143] Updated weights for policy 0, policy_version 6422 (0.0007) +[2023-10-09 04:23:50,483][60143] Updated weights for policy 0, policy_version 6432 (0.0007) +[2023-10-09 04:23:50,784][60144] Updated weights for policy 1, policy_version 6472 (0.0008) +[2023-10-09 04:23:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13205504. Throughput: 0: 1715.0, 1: 1720.1. 
Samples: 3306104. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) +[2023-10-09 04:23:51,053][59242] Avg episode reward: [(0, '9.880'), (1, '10.670')] +[2023-10-09 04:23:51,159][60144] Updated weights for policy 1, policy_version 6482 (0.0007) +[2023-10-09 04:23:51,519][60144] Updated weights for policy 1, policy_version 6492 (0.0007) +[2023-10-09 04:23:54,438][60143] Updated weights for policy 0, policy_version 6442 (0.0008) +[2023-10-09 04:23:54,815][60143] Updated weights for policy 0, policy_version 6452 (0.0008) +[2023-10-09 04:23:55,185][60143] Updated weights for policy 0, policy_version 6462 (0.0009) +[2023-10-09 04:23:55,387][60144] Updated weights for policy 1, policy_version 6502 (0.0010) +[2023-10-09 04:23:55,765][60144] Updated weights for policy 1, policy_version 6512 (0.0009) +[2023-10-09 04:23:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13271040. Throughput: 0: 1706.4, 1: 1734.5. Samples: 3327024. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:23:56,053][59242] Avg episode reward: [(0, '10.170'), (1, '10.230')] +[2023-10-09 04:23:56,143][60144] Updated weights for policy 1, policy_version 6522 (0.0010) +[2023-10-09 04:23:59,227][60143] Updated weights for policy 0, policy_version 6472 (0.0008) +[2023-10-09 04:23:59,586][60143] Updated weights for policy 0, policy_version 6482 (0.0008) +[2023-10-09 04:23:59,966][60143] Updated weights for policy 0, policy_version 6492 (0.0010) +[2023-10-09 04:24:00,097][60144] Updated weights for policy 1, policy_version 6532 (0.0010) +[2023-10-09 04:24:00,465][60144] Updated weights for policy 1, policy_version 6542 (0.0008) +[2023-10-09 04:24:00,829][60144] Updated weights for policy 1, policy_version 6552 (0.0008) +[2023-10-09 04:24:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13336576. Throughput: 0: 1685.3, 1: 1723.6. Samples: 3346764. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:24:01,053][59242] Avg episode reward: [(0, '10.770'), (1, '10.110')] +[2023-10-09 04:24:01,064][59934] Saving new best policy, reward=10.770! +[2023-10-09 04:24:04,140][60143] Updated weights for policy 0, policy_version 6502 (0.0009) +[2023-10-09 04:24:04,514][60143] Updated weights for policy 0, policy_version 6512 (0.0009) +[2023-10-09 04:24:04,679][60144] Updated weights for policy 1, policy_version 6562 (0.0010) +[2023-10-09 04:24:04,890][60143] Updated weights for policy 0, policy_version 6522 (0.0008) +[2023-10-09 04:24:05,035][60144] Updated weights for policy 1, policy_version 6572 (0.0007) +[2023-10-09 04:24:05,408][60144] Updated weights for policy 1, policy_version 6582 (0.0009) +[2023-10-09 04:24:05,778][60144] Updated weights for policy 1, policy_version 6592 (0.0010) +[2023-10-09 04:24:06,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 13434880. Throughput: 0: 1713.5, 1: 1734.5. Samples: 3358006. Policy #0 lag: (min: 0.0, avg: 23.5, max: 32.0) +[2023-10-09 04:24:06,052][59242] Avg episode reward: [(0, '10.510'), (1, '10.600')] +[2023-10-09 04:24:08,996][60143] Updated weights for policy 0, policy_version 6532 (0.0010) +[2023-10-09 04:24:09,378][60143] Updated weights for policy 0, policy_version 6542 (0.0010) +[2023-10-09 04:24:09,720][60144] Updated weights for policy 1, policy_version 6602 (0.0008) +[2023-10-09 04:24:09,752][60143] Updated weights for policy 0, policy_version 6552 (0.0009) +[2023-10-09 04:24:10,087][60144] Updated weights for policy 1, policy_version 6612 (0.0008) +[2023-10-09 04:24:10,460][60144] Updated weights for policy 1, policy_version 6622 (0.0009) +[2023-10-09 04:24:11,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 13500416. Throughput: 0: 1693.7, 1: 1726.4. Samples: 3378122. 
Policy #0 lag: (min: 0.0, avg: 23.5, max: 32.0) +[2023-10-09 04:24:11,052][59242] Avg episode reward: [(0, '10.840'), (1, '10.400')] +[2023-10-09 04:24:11,053][59934] Saving new best policy, reward=10.840! +[2023-10-09 04:24:13,612][60143] Updated weights for policy 0, policy_version 6562 (0.0007) +[2023-10-09 04:24:13,990][60143] Updated weights for policy 0, policy_version 6572 (0.0008) +[2023-10-09 04:24:14,366][60143] Updated weights for policy 0, policy_version 6582 (0.0007) +[2023-10-09 04:24:14,438][60144] Updated weights for policy 1, policy_version 6632 (0.0009) +[2023-10-09 04:24:14,734][60143] Updated weights for policy 0, policy_version 6592 (0.0008) +[2023-10-09 04:24:14,826][60144] Updated weights for policy 1, policy_version 6642 (0.0007) +[2023-10-09 04:24:15,194][60144] Updated weights for policy 1, policy_version 6652 (0.0007) +[2023-10-09 04:24:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 13565952. Throughput: 0: 1692.8, 1: 1702.3. Samples: 3397574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:16,053][59242] Avg episode reward: [(0, '11.490'), (1, '10.290')] +[2023-10-09 04:24:16,066][59934] Saving new best policy, reward=11.490! +[2023-10-09 04:24:18,731][60143] Updated weights for policy 0, policy_version 6602 (0.0009) +[2023-10-09 04:24:19,083][60144] Updated weights for policy 1, policy_version 6662 (0.0007) +[2023-10-09 04:24:19,113][60143] Updated weights for policy 0, policy_version 6612 (0.0008) +[2023-10-09 04:24:19,453][60144] Updated weights for policy 1, policy_version 6672 (0.0007) +[2023-10-09 04:24:19,469][60143] Updated weights for policy 0, policy_version 6622 (0.0008) +[2023-10-09 04:24:19,816][60144] Updated weights for policy 1, policy_version 6682 (0.0008) +[2023-10-09 04:24:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 13631488. Throughput: 0: 1697.6, 1: 1737.4. Samples: 3409474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:21,053][59242] Avg episode reward: [(0, '11.280'), (1, '11.140')] +[2023-10-09 04:24:23,177][60143] Updated weights for policy 0, policy_version 6632 (0.0009) +[2023-10-09 04:24:23,553][60143] Updated weights for policy 0, policy_version 6642 (0.0007) +[2023-10-09 04:24:23,897][60144] Updated weights for policy 1, policy_version 6692 (0.0007) +[2023-10-09 04:24:23,915][60143] Updated weights for policy 0, policy_version 6652 (0.0010) +[2023-10-09 04:24:24,293][60144] Updated weights for policy 1, policy_version 6702 (0.0008) +[2023-10-09 04:24:24,670][60144] Updated weights for policy 1, policy_version 6712 (0.0010) +[2023-10-09 04:24:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 13697024. Throughput: 0: 1678.0, 1: 1720.3. Samples: 3428728. Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-09 04:24:26,053][59242] Avg episode reward: [(0, '10.670'), (1, '11.240')] +[2023-10-09 04:24:27,852][60143] Updated weights for policy 0, policy_version 6662 (0.0010) +[2023-10-09 04:24:28,218][60143] Updated weights for policy 0, policy_version 6672 (0.0009) +[2023-10-09 04:24:28,598][60143] Updated weights for policy 0, policy_version 6682 (0.0009) +[2023-10-09 04:24:28,613][60144] Updated weights for policy 1, policy_version 6722 (0.0008) +[2023-10-09 04:24:28,981][60144] Updated weights for policy 1, policy_version 6732 (0.0007) +[2023-10-09 04:24:29,342][60144] Updated weights for policy 1, policy_version 6742 (0.0011) +[2023-10-09 04:24:29,720][60144] Updated weights for policy 1, policy_version 6752 (0.0010) +[2023-10-09 04:24:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 13762560. Throughput: 0: 1708.8, 1: 1708.0. Samples: 3449414. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-09 04:24:31,053][59242] Avg episode reward: [(0, '11.160'), (1, '11.590')] +[2023-10-09 04:24:31,064][60003] Saving new best policy, reward=11.590! +[2023-10-09 04:24:32,696][60143] Updated weights for policy 0, policy_version 6692 (0.0008) +[2023-10-09 04:24:33,093][60143] Updated weights for policy 0, policy_version 6702 (0.0009) +[2023-10-09 04:24:33,469][60143] Updated weights for policy 0, policy_version 6712 (0.0007) +[2023-10-09 04:24:33,827][60144] Updated weights for policy 1, policy_version 6762 (0.0008) +[2023-10-09 04:24:34,197][60144] Updated weights for policy 1, policy_version 6772 (0.0010) +[2023-10-09 04:24:34,560][60144] Updated weights for policy 1, policy_version 6782 (0.0007) +[2023-10-09 04:24:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13828096. Throughput: 0: 1689.9, 1: 1731.1. Samples: 3460046. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:36,053][59242] Avg episode reward: [(0, '11.410'), (1, '11.410')] +[2023-10-09 04:24:37,487][60143] Updated weights for policy 0, policy_version 6722 (0.0008) +[2023-10-09 04:24:37,870][60143] Updated weights for policy 0, policy_version 6732 (0.0010) +[2023-10-09 04:24:38,238][60143] Updated weights for policy 0, policy_version 6742 (0.0009) +[2023-10-09 04:24:38,473][60144] Updated weights for policy 1, policy_version 6792 (0.0007) +[2023-10-09 04:24:38,599][60143] Updated weights for policy 0, policy_version 6752 (0.0009) +[2023-10-09 04:24:38,836][60144] Updated weights for policy 1, policy_version 6802 (0.0007) +[2023-10-09 04:24:39,207][60144] Updated weights for policy 1, policy_version 6812 (0.0007) +[2023-10-09 04:24:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 13893632. Throughput: 0: 1692.9, 1: 1704.9. Samples: 3479928. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:41,053][59242] Avg episode reward: [(0, '11.790'), (1, '11.890')] +[2023-10-09 04:24:41,055][60003] Saving new best policy, reward=11.890! +[2023-10-09 04:24:41,055][59934] Saving new best policy, reward=11.790! +[2023-10-09 04:24:42,575][60143] Updated weights for policy 0, policy_version 6762 (0.0007) +[2023-10-09 04:24:42,946][60143] Updated weights for policy 0, policy_version 6772 (0.0009) +[2023-10-09 04:24:43,107][60144] Updated weights for policy 1, policy_version 6822 (0.0008) +[2023-10-09 04:24:43,322][60143] Updated weights for policy 0, policy_version 6782 (0.0009) +[2023-10-09 04:24:43,476][60144] Updated weights for policy 1, policy_version 6832 (0.0010) +[2023-10-09 04:24:43,837][60144] Updated weights for policy 1, policy_version 6842 (0.0007) +[2023-10-09 04:24:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 13959168. Throughput: 0: 1711.8, 1: 1714.4. Samples: 3500942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:46,053][59242] Avg episode reward: [(0, '11.410'), (1, '11.930')] +[2023-10-09 04:24:46,065][60003] Saving new best policy, reward=11.930! +[2023-10-09 04:24:47,365][60143] Updated weights for policy 0, policy_version 6792 (0.0008) +[2023-10-09 04:24:47,744][60143] Updated weights for policy 0, policy_version 6802 (0.0010) +[2023-10-09 04:24:47,920][60144] Updated weights for policy 1, policy_version 6852 (0.0008) +[2023-10-09 04:24:48,110][60143] Updated weights for policy 0, policy_version 6812 (0.0008) +[2023-10-09 04:24:48,290][60144] Updated weights for policy 1, policy_version 6862 (0.0008) +[2023-10-09 04:24:48,659][60144] Updated weights for policy 1, policy_version 6872 (0.0008) +[2023-10-09 04:24:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 14024704. Throughput: 0: 1681.9, 1: 1713.2. Samples: 3510786. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:24:51,053][59242] Avg episode reward: [(0, '12.250'), (1, '11.740')] +[2023-10-09 04:24:51,054][59934] Saving new best policy, reward=12.250! +[2023-10-09 04:24:52,072][60143] Updated weights for policy 0, policy_version 6822 (0.0009) +[2023-10-09 04:24:52,444][60143] Updated weights for policy 0, policy_version 6832 (0.0010) +[2023-10-09 04:24:52,505][60144] Updated weights for policy 1, policy_version 6882 (0.0009) +[2023-10-09 04:24:52,819][60143] Updated weights for policy 0, policy_version 6842 (0.0007) +[2023-10-09 04:24:52,875][60144] Updated weights for policy 1, policy_version 6892 (0.0008) +[2023-10-09 04:24:53,244][60144] Updated weights for policy 1, policy_version 6902 (0.0008) +[2023-10-09 04:24:53,612][60144] Updated weights for policy 1, policy_version 6912 (0.0009) +[2023-10-09 04:24:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 14090240. Throughput: 0: 1699.8, 1: 1708.3. Samples: 3531488. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-09 04:24:56,053][59242] Avg episode reward: [(0, '11.440'), (1, '11.970')] +[2023-10-09 04:24:56,054][60003] Saving new best policy, reward=11.970! +[2023-10-09 04:24:56,924][60143] Updated weights for policy 0, policy_version 6852 (0.0010) +[2023-10-09 04:24:57,292][60143] Updated weights for policy 0, policy_version 6862 (0.0010) +[2023-10-09 04:24:57,614][60144] Updated weights for policy 1, policy_version 6922 (0.0008) +[2023-10-09 04:24:57,669][60143] Updated weights for policy 0, policy_version 6872 (0.0007) +[2023-10-09 04:24:57,981][60144] Updated weights for policy 1, policy_version 6932 (0.0007) +[2023-10-09 04:24:58,346][60144] Updated weights for policy 1, policy_version 6942 (0.0007) +[2023-10-09 04:25:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 14155776. Throughput: 0: 1710.5, 1: 1737.3. Samples: 3552722. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-09 04:25:01,053][59242] Avg episode reward: [(0, '11.300'), (1, '12.670')] +[2023-10-09 04:25:01,063][60003] Saving new best policy, reward=12.670! +[2023-10-09 04:25:01,730][60143] Updated weights for policy 0, policy_version 6882 (0.0009) +[2023-10-09 04:25:02,107][60143] Updated weights for policy 0, policy_version 6892 (0.0009) +[2023-10-09 04:25:02,131][60144] Updated weights for policy 1, policy_version 6952 (0.0007) +[2023-10-09 04:25:02,480][60143] Updated weights for policy 0, policy_version 6902 (0.0009) +[2023-10-09 04:25:02,508][60144] Updated weights for policy 1, policy_version 6962 (0.0009) +[2023-10-09 04:25:02,850][60143] Updated weights for policy 0, policy_version 6912 (0.0011) +[2023-10-09 04:25:02,874][60144] Updated weights for policy 1, policy_version 6972 (0.0007) +[2023-10-09 04:25:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14221312. Throughput: 0: 1685.7, 1: 1705.2. Samples: 3562064. Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-09 04:25:06,053][59242] Avg episode reward: [(0, '11.460'), (1, '12.260')] +[2023-10-09 04:25:06,730][60144] Updated weights for policy 1, policy_version 6982 (0.0008) +[2023-10-09 04:25:06,967][60143] Updated weights for policy 0, policy_version 6922 (0.0007) +[2023-10-09 04:25:07,099][60144] Updated weights for policy 1, policy_version 6992 (0.0008) +[2023-10-09 04:25:07,338][60143] Updated weights for policy 0, policy_version 6932 (0.0008) +[2023-10-09 04:25:07,461][60144] Updated weights for policy 1, policy_version 7002 (0.0008) +[2023-10-09 04:25:07,695][60143] Updated weights for policy 0, policy_version 6942 (0.0007) +[2023-10-09 04:25:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14286848. Throughput: 0: 1707.9, 1: 1731.2. Samples: 3583486. 
Policy #0 lag: (min: 24.0, avg: 47.0, max: 56.0) +[2023-10-09 04:25:11,052][59242] Avg episode reward: [(0, '11.650'), (1, '12.570')] +[2023-10-09 04:25:11,523][60143] Updated weights for policy 0, policy_version 6952 (0.0007) +[2023-10-09 04:25:11,530][60144] Updated weights for policy 1, policy_version 7012 (0.0009) +[2023-10-09 04:25:11,895][60143] Updated weights for policy 0, policy_version 6962 (0.0008) +[2023-10-09 04:25:11,928][60144] Updated weights for policy 1, policy_version 7022 (0.0010) +[2023-10-09 04:25:12,268][60143] Updated weights for policy 0, policy_version 6972 (0.0009) +[2023-10-09 04:25:12,291][60144] Updated weights for policy 1, policy_version 7032 (0.0008) +[2023-10-09 04:25:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14352384. Throughput: 0: 1706.2, 1: 1742.9. Samples: 3604624. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:25:16,053][59242] Avg episode reward: [(0, '12.060'), (1, '12.300')] +[2023-10-09 04:25:16,138][60144] Updated weights for policy 1, policy_version 7042 (0.0010) +[2023-10-09 04:25:16,216][60143] Updated weights for policy 0, policy_version 6982 (0.0008) +[2023-10-09 04:25:16,501][60144] Updated weights for policy 1, policy_version 7052 (0.0009) +[2023-10-09 04:25:16,589][60143] Updated weights for policy 0, policy_version 6992 (0.0010) +[2023-10-09 04:25:16,871][60144] Updated weights for policy 1, policy_version 7062 (0.0008) +[2023-10-09 04:25:16,951][60143] Updated weights for policy 0, policy_version 7002 (0.0010) +[2023-10-09 04:25:17,229][60144] Updated weights for policy 1, policy_version 7072 (0.0008) +[2023-10-09 04:25:20,981][60143] Updated weights for policy 0, policy_version 7012 (0.0009) +[2023-10-09 04:25:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14417920. Throughput: 0: 1704.2, 1: 1716.7. Samples: 3613986. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:25:21,053][59242] Avg episode reward: [(0, '11.940'), (1, '12.460')] +[2023-10-09 04:25:21,361][60143] Updated weights for policy 0, policy_version 7022 (0.0008) +[2023-10-09 04:25:21,418][60144] Updated weights for policy 1, policy_version 7082 (0.0007) +[2023-10-09 04:25:21,738][60143] Updated weights for policy 0, policy_version 7032 (0.0007) +[2023-10-09 04:25:21,777][60144] Updated weights for policy 1, policy_version 7092 (0.0007) +[2023-10-09 04:25:22,144][60144] Updated weights for policy 1, policy_version 7102 (0.0008) +[2023-10-09 04:25:25,720][60143] Updated weights for policy 0, policy_version 7042 (0.0009) +[2023-10-09 04:25:25,867][60144] Updated weights for policy 1, policy_version 7112 (0.0007) +[2023-10-09 04:25:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14483456. Throughput: 0: 1705.7, 1: 1740.6. Samples: 3635012. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:25:26,053][59242] Avg episode reward: [(0, '11.080'), (1, '12.480')] +[2023-10-09 04:25:26,087][60143] Updated weights for policy 0, policy_version 7052 (0.0007) +[2023-10-09 04:25:26,239][60144] Updated weights for policy 1, policy_version 7122 (0.0008) +[2023-10-09 04:25:26,464][60143] Updated weights for policy 0, policy_version 7062 (0.0009) +[2023-10-09 04:25:26,605][60144] Updated weights for policy 1, policy_version 7132 (0.0009) +[2023-10-09 04:25:26,827][60143] Updated weights for policy 0, policy_version 7072 (0.0010) +[2023-10-09 04:25:30,579][60144] Updated weights for policy 1, policy_version 7142 (0.0007) +[2023-10-09 04:25:30,812][60143] Updated weights for policy 0, policy_version 7082 (0.0009) +[2023-10-09 04:25:30,950][60144] Updated weights for policy 1, policy_version 7152 (0.0007) +[2023-10-09 04:25:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 14548992. Throughput: 0: 1705.4, 1: 1737.7. 
Samples: 3655884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:25:31,053][59242] Avg episode reward: [(0, '10.740'), (1, '12.940')] +[2023-10-09 04:25:31,185][60143] Updated weights for policy 0, policy_version 7092 (0.0008) +[2023-10-09 04:25:31,310][60144] Updated weights for policy 1, policy_version 7162 (0.0008) +[2023-10-09 04:25:31,528][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000007168_7340032.pth... +[2023-10-09 04:25:31,555][60143] Updated weights for policy 0, policy_version 7102 (0.0009) +[2023-10-09 04:25:31,557][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000005536_5668864.pth +[2023-10-09 04:25:31,561][60003] Saving new best policy, reward=12.940! +[2023-10-09 04:25:31,623][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000007104_7274496.pth... +[2023-10-09 04:25:31,653][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000005504_5636096.pth +[2023-10-09 04:25:35,313][60144] Updated weights for policy 1, policy_version 7172 (0.0008) +[2023-10-09 04:25:35,506][60143] Updated weights for policy 0, policy_version 7112 (0.0011) +[2023-10-09 04:25:35,683][60144] Updated weights for policy 1, policy_version 7182 (0.0008) +[2023-10-09 04:25:35,875][60143] Updated weights for policy 0, policy_version 7122 (0.0007) +[2023-10-09 04:25:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 14614528. Throughput: 0: 1708.2, 1: 1728.7. Samples: 3665444. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:25:36,052][59242] Avg episode reward: [(0, '9.820'), (1, '13.550')] +[2023-10-09 04:25:36,055][60144] Updated weights for policy 1, policy_version 7192 (0.0007) +[2023-10-09 04:25:36,257][60143] Updated weights for policy 0, policy_version 7132 (0.0008) +[2023-10-09 04:25:36,336][60003] Saving new best policy, reward=13.550! 
+[2023-10-09 04:25:40,012][60144] Updated weights for policy 1, policy_version 7202 (0.0007) +[2023-10-09 04:25:40,259][60143] Updated weights for policy 0, policy_version 7142 (0.0008) +[2023-10-09 04:25:40,379][60144] Updated weights for policy 1, policy_version 7212 (0.0008) +[2023-10-09 04:25:40,630][60143] Updated weights for policy 0, policy_version 7152 (0.0008) +[2023-10-09 04:25:40,746][60144] Updated weights for policy 1, policy_version 7222 (0.0009) +[2023-10-09 04:25:41,011][60143] Updated weights for policy 0, policy_version 7162 (0.0008) +[2023-10-09 04:25:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 14680064. Throughput: 0: 1708.8, 1: 1737.6. Samples: 3686578. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:25:41,052][59242] Avg episode reward: [(0, '10.200'), (1, '13.280')] +[2023-10-09 04:25:41,107][60144] Updated weights for policy 1, policy_version 7232 (0.0009) +[2023-10-09 04:25:44,960][60143] Updated weights for policy 0, policy_version 7172 (0.0008) +[2023-10-09 04:25:45,098][60144] Updated weights for policy 1, policy_version 7242 (0.0008) +[2023-10-09 04:25:45,336][60143] Updated weights for policy 0, policy_version 7182 (0.0009) +[2023-10-09 04:25:45,465][60144] Updated weights for policy 1, policy_version 7252 (0.0007) +[2023-10-09 04:25:45,695][60143] Updated weights for policy 0, policy_version 7192 (0.0008) +[2023-10-09 04:25:45,822][60144] Updated weights for policy 1, policy_version 7262 (0.0008) +[2023-10-09 04:25:46,052][59242] Fps is (10 sec: 19660.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 14811136. Throughput: 0: 1691.2, 1: 1715.0. Samples: 3706004. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 04:25:46,054][59242] Avg episode reward: [(0, '10.800'), (1, '12.640')] +[2023-10-09 04:25:49,806][60143] Updated weights for policy 0, policy_version 7202 (0.0007) +[2023-10-09 04:25:49,868][60144] Updated weights for policy 1, policy_version 7272 (0.0008) +[2023-10-09 04:25:50,170][60143] Updated weights for policy 0, policy_version 7212 (0.0007) +[2023-10-09 04:25:50,230][60144] Updated weights for policy 1, policy_version 7282 (0.0009) +[2023-10-09 04:25:50,535][60143] Updated weights for policy 0, policy_version 7222 (0.0008) +[2023-10-09 04:25:50,591][60144] Updated weights for policy 1, policy_version 7292 (0.0007) +[2023-10-09 04:25:50,910][60143] Updated weights for policy 0, policy_version 7232 (0.0009) +[2023-10-09 04:25:51,052][59242] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 14876672. Throughput: 0: 1704.9, 1: 1733.8. Samples: 3716806. Policy #0 lag: (min: 29.0, avg: 32.0, max: 61.0) +[2023-10-09 04:25:51,052][59242] Avg episode reward: [(0, '10.830'), (1, '12.630')] +[2023-10-09 04:25:54,571][60144] Updated weights for policy 1, policy_version 7302 (0.0007) +[2023-10-09 04:25:54,890][60143] Updated weights for policy 0, policy_version 7242 (0.0007) +[2023-10-09 04:25:54,938][60144] Updated weights for policy 1, policy_version 7312 (0.0008) +[2023-10-09 04:25:55,251][60143] Updated weights for policy 0, policy_version 7252 (0.0007) +[2023-10-09 04:25:55,302][60144] Updated weights for policy 1, policy_version 7322 (0.0008) +[2023-10-09 04:25:55,629][60143] Updated weights for policy 0, policy_version 7262 (0.0008) +[2023-10-09 04:25:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 14942208. Throughput: 0: 1703.3, 1: 1723.1. Samples: 3737674. 
Policy #0 lag: (min: 29.0, avg: 32.0, max: 61.0) +[2023-10-09 04:25:56,053][59242] Avg episode reward: [(0, '10.460'), (1, '12.570')] +[2023-10-09 04:25:59,269][60144] Updated weights for policy 1, policy_version 7332 (0.0009) +[2023-10-09 04:25:59,660][60144] Updated weights for policy 1, policy_version 7342 (0.0007) +[2023-10-09 04:25:59,696][60143] Updated weights for policy 0, policy_version 7272 (0.0008) +[2023-10-09 04:26:00,033][60144] Updated weights for policy 1, policy_version 7352 (0.0008) +[2023-10-09 04:26:00,067][60143] Updated weights for policy 0, policy_version 7282 (0.0009) +[2023-10-09 04:26:00,433][60143] Updated weights for policy 0, policy_version 7292 (0.0009) +[2023-10-09 04:26:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 15007744. Throughput: 0: 1682.0, 1: 1695.6. Samples: 3756614. Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) +[2023-10-09 04:26:01,054][59242] Avg episode reward: [(0, '10.940'), (1, '13.330')] +[2023-10-09 04:26:03,979][60144] Updated weights for policy 1, policy_version 7362 (0.0007) +[2023-10-09 04:26:04,346][60144] Updated weights for policy 1, policy_version 7372 (0.0010) +[2023-10-09 04:26:04,460][60143] Updated weights for policy 0, policy_version 7302 (0.0010) +[2023-10-09 04:26:04,716][60144] Updated weights for policy 1, policy_version 7382 (0.0007) +[2023-10-09 04:26:04,822][60143] Updated weights for policy 0, policy_version 7312 (0.0009) +[2023-10-09 04:26:05,073][60144] Updated weights for policy 1, policy_version 7392 (0.0008) +[2023-10-09 04:26:05,192][60143] Updated weights for policy 0, policy_version 7322 (0.0007) +[2023-10-09 04:26:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 15073280. Throughput: 0: 1704.3, 1: 1728.0. Samples: 3768436. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) +[2023-10-09 04:26:06,053][59242] Avg episode reward: [(0, '11.340'), (1, '13.110')] +[2023-10-09 04:26:08,862][60144] Updated weights for policy 1, policy_version 7402 (0.0008) +[2023-10-09 04:26:09,235][60144] Updated weights for policy 1, policy_version 7412 (0.0008) +[2023-10-09 04:26:09,363][60143] Updated weights for policy 0, policy_version 7332 (0.0009) +[2023-10-09 04:26:09,600][60144] Updated weights for policy 1, policy_version 7422 (0.0009) +[2023-10-09 04:26:09,747][60143] Updated weights for policy 0, policy_version 7342 (0.0008) +[2023-10-09 04:26:10,119][60143] Updated weights for policy 0, policy_version 7352 (0.0009) +[2023-10-09 04:26:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 15138816. Throughput: 0: 1702.0, 1: 1705.7. Samples: 3788358. Policy #0 lag: (min: 13.0, avg: 16.0, max: 45.0) +[2023-10-09 04:26:11,053][59242] Avg episode reward: [(0, '11.840'), (1, '13.060')] +[2023-10-09 04:26:13,492][60144] Updated weights for policy 1, policy_version 7432 (0.0007) +[2023-10-09 04:26:13,862][60144] Updated weights for policy 1, policy_version 7442 (0.0008) +[2023-10-09 04:26:14,022][60143] Updated weights for policy 0, policy_version 7362 (0.0008) +[2023-10-09 04:26:14,233][60144] Updated weights for policy 1, policy_version 7452 (0.0008) +[2023-10-09 04:26:14,385][60143] Updated weights for policy 0, policy_version 7372 (0.0010) +[2023-10-09 04:26:14,761][60143] Updated weights for policy 0, policy_version 7382 (0.0010) +[2023-10-09 04:26:15,136][60143] Updated weights for policy 0, policy_version 7392 (0.0010) +[2023-10-09 04:26:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 15204352. Throughput: 0: 1679.6, 1: 1709.5. Samples: 3808394. 
Policy #0 lag: (min: 13.0, avg: 16.0, max: 45.0) +[2023-10-09 04:26:16,053][59242] Avg episode reward: [(0, '11.520'), (1, '13.250')] +[2023-10-09 04:26:18,108][60144] Updated weights for policy 1, policy_version 7462 (0.0008) +[2023-10-09 04:26:18,480][60144] Updated weights for policy 1, policy_version 7472 (0.0011) +[2023-10-09 04:26:18,840][60144] Updated weights for policy 1, policy_version 7482 (0.0007) +[2023-10-09 04:26:19,197][60143] Updated weights for policy 0, policy_version 7402 (0.0008) +[2023-10-09 04:26:19,576][60143] Updated weights for policy 0, policy_version 7412 (0.0009) +[2023-10-09 04:26:19,936][60143] Updated weights for policy 0, policy_version 7422 (0.0010) +[2023-10-09 04:26:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 15269888. Throughput: 0: 1705.2, 1: 1720.0. Samples: 3819582. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:26:21,052][59242] Avg episode reward: [(0, '11.640'), (1, '13.150')] +[2023-10-09 04:26:22,747][60144] Updated weights for policy 1, policy_version 7492 (0.0008) +[2023-10-09 04:26:23,126][60144] Updated weights for policy 1, policy_version 7502 (0.0008) +[2023-10-09 04:26:23,494][60144] Updated weights for policy 1, policy_version 7512 (0.0007) +[2023-10-09 04:26:23,980][60143] Updated weights for policy 0, policy_version 7432 (0.0008) +[2023-10-09 04:26:24,357][60143] Updated weights for policy 0, policy_version 7442 (0.0007) +[2023-10-09 04:26:24,723][60143] Updated weights for policy 0, policy_version 7452 (0.0007) +[2023-10-09 04:26:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 15335424. Throughput: 0: 1681.9, 1: 1713.8. Samples: 3839384. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:26:26,053][59242] Avg episode reward: [(0, '11.730'), (1, '13.490')] +[2023-10-09 04:26:27,352][60144] Updated weights for policy 1, policy_version 7522 (0.0009) +[2023-10-09 04:26:27,717][60144] Updated weights for policy 1, policy_version 7532 (0.0011) +[2023-10-09 04:26:28,080][60144] Updated weights for policy 1, policy_version 7542 (0.0008) +[2023-10-09 04:26:28,449][60144] Updated weights for policy 1, policy_version 7552 (0.0011) +[2023-10-09 04:26:28,814][60143] Updated weights for policy 0, policy_version 7462 (0.0009) +[2023-10-09 04:26:29,195][60143] Updated weights for policy 0, policy_version 7472 (0.0008) +[2023-10-09 04:26:29,567][60143] Updated weights for policy 0, policy_version 7482 (0.0007) +[2023-10-09 04:26:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 15400960. Throughput: 0: 1688.9, 1: 1737.8. Samples: 3860204. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:26:31,053][59242] Avg episode reward: [(0, '11.060'), (1, '13.930')] +[2023-10-09 04:26:31,062][60003] Saving new best policy, reward=13.930! +[2023-10-09 04:26:32,399][60144] Updated weights for policy 1, policy_version 7562 (0.0008) +[2023-10-09 04:26:32,761][60144] Updated weights for policy 1, policy_version 7572 (0.0007) +[2023-10-09 04:26:33,125][60144] Updated weights for policy 1, policy_version 7582 (0.0009) +[2023-10-09 04:26:33,503][60143] Updated weights for policy 0, policy_version 7492 (0.0009) +[2023-10-09 04:26:33,876][60143] Updated weights for policy 0, policy_version 7502 (0.0008) +[2023-10-09 04:26:34,246][60143] Updated weights for policy 0, policy_version 7512 (0.0008) +[2023-10-09 04:26:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 15466496. Throughput: 0: 1702.9, 1: 1716.1. Samples: 3870660. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:26:36,053][59242] Avg episode reward: [(0, '10.900'), (1, '14.500')] +[2023-10-09 04:26:36,055][60003] Saving new best policy, reward=14.500! +[2023-10-09 04:26:36,993][60144] Updated weights for policy 1, policy_version 7592 (0.0009) +[2023-10-09 04:26:37,365][60144] Updated weights for policy 1, policy_version 7602 (0.0009) +[2023-10-09 04:26:37,728][60144] Updated weights for policy 1, policy_version 7612 (0.0008) +[2023-10-09 04:26:38,124][60143] Updated weights for policy 0, policy_version 7522 (0.0008) +[2023-10-09 04:26:38,492][60143] Updated weights for policy 0, policy_version 7532 (0.0011) +[2023-10-09 04:26:38,859][60143] Updated weights for policy 0, policy_version 7542 (0.0010) +[2023-10-09 04:26:39,234][60143] Updated weights for policy 0, policy_version 7552 (0.0009) +[2023-10-09 04:26:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 15532032. Throughput: 0: 1678.4, 1: 1721.1. Samples: 3890652. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:26:41,053][59242] Avg episode reward: [(0, '10.640'), (1, '14.740')] +[2023-10-09 04:26:41,054][60003] Saving new best policy, reward=14.740! +[2023-10-09 04:26:41,925][60144] Updated weights for policy 1, policy_version 7622 (0.0008) +[2023-10-09 04:26:42,287][60144] Updated weights for policy 1, policy_version 7632 (0.0007) +[2023-10-09 04:26:42,658][60144] Updated weights for policy 1, policy_version 7642 (0.0008) +[2023-10-09 04:26:43,253][60143] Updated weights for policy 0, policy_version 7562 (0.0009) +[2023-10-09 04:26:43,622][60143] Updated weights for policy 0, policy_version 7572 (0.0008) +[2023-10-09 04:26:43,999][60143] Updated weights for policy 0, policy_version 7582 (0.0008) +[2023-10-09 04:26:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 15597568. Throughput: 0: 1709.0, 1: 1751.9. Samples: 3912356. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:26:46,053][59242] Avg episode reward: [(0, '10.980'), (1, '15.120')] +[2023-10-09 04:26:46,066][60003] Saving new best policy, reward=15.120! +[2023-10-09 04:26:46,541][60144] Updated weights for policy 1, policy_version 7652 (0.0007) +[2023-10-09 04:26:46,937][60144] Updated weights for policy 1, policy_version 7662 (0.0008) +[2023-10-09 04:26:47,302][60144] Updated weights for policy 1, policy_version 7672 (0.0007) +[2023-10-09 04:26:47,758][60143] Updated weights for policy 0, policy_version 7592 (0.0008) +[2023-10-09 04:26:48,132][60143] Updated weights for policy 0, policy_version 7602 (0.0008) +[2023-10-09 04:26:48,498][60143] Updated weights for policy 0, policy_version 7612 (0.0007) +[2023-10-09 04:26:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 15663104. Throughput: 0: 1690.5, 1: 1717.5. Samples: 3921796. Policy #0 lag: (min: 28.0, avg: 30.0, max: 59.0) +[2023-10-09 04:26:51,052][59242] Avg episode reward: [(0, '11.060'), (1, '15.120')] +[2023-10-09 04:26:51,174][60144] Updated weights for policy 1, policy_version 7682 (0.0009) +[2023-10-09 04:26:51,537][60144] Updated weights for policy 1, policy_version 7692 (0.0007) +[2023-10-09 04:26:51,904][60144] Updated weights for policy 1, policy_version 7702 (0.0010) +[2023-10-09 04:26:52,266][60144] Updated weights for policy 1, policy_version 7712 (0.0010) +[2023-10-09 04:26:52,609][60143] Updated weights for policy 0, policy_version 7622 (0.0007) +[2023-10-09 04:26:52,978][60143] Updated weights for policy 0, policy_version 7632 (0.0008) +[2023-10-09 04:26:53,364][60143] Updated weights for policy 0, policy_version 7642 (0.0009) +[2023-10-09 04:26:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 15728640. Throughput: 0: 1690.8, 1: 1742.4. Samples: 3942850. 
Policy #0 lag: (min: 28.0, avg: 30.0, max: 59.0) +[2023-10-09 04:26:56,053][60144] Updated weights for policy 1, policy_version 7722 (0.0009) +[2023-10-09 04:26:56,053][59242] Avg episode reward: [(0, '11.680'), (1, '14.900')] +[2023-10-09 04:26:56,426][60144] Updated weights for policy 1, policy_version 7732 (0.0009) +[2023-10-09 04:26:56,799][60144] Updated weights for policy 1, policy_version 7742 (0.0008) +[2023-10-09 04:26:57,268][60143] Updated weights for policy 0, policy_version 7652 (0.0008) +[2023-10-09 04:26:57,644][60143] Updated weights for policy 0, policy_version 7662 (0.0011) +[2023-10-09 04:26:58,031][60143] Updated weights for policy 0, policy_version 7672 (0.0010) +[2023-10-09 04:27:00,867][60144] Updated weights for policy 1, policy_version 7752 (0.0011) +[2023-10-09 04:27:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 15794176. Throughput: 0: 1714.3, 1: 1747.0. Samples: 3964152. Policy #0 lag: (min: 25.0, avg: 32.5, max: 57.0) +[2023-10-09 04:27:01,053][59242] Avg episode reward: [(0, '12.050'), (1, '14.690')] +[2023-10-09 04:27:01,235][60144] Updated weights for policy 1, policy_version 7762 (0.0008) +[2023-10-09 04:27:01,611][60144] Updated weights for policy 1, policy_version 7772 (0.0008) +[2023-10-09 04:27:01,975][60143] Updated weights for policy 0, policy_version 7682 (0.0008) +[2023-10-09 04:27:02,341][60143] Updated weights for policy 0, policy_version 7692 (0.0007) +[2023-10-09 04:27:02,706][60143] Updated weights for policy 0, policy_version 7702 (0.0007) +[2023-10-09 04:27:03,081][60143] Updated weights for policy 0, policy_version 7712 (0.0011) +[2023-10-09 04:27:05,506][60144] Updated weights for policy 1, policy_version 7782 (0.0009) +[2023-10-09 04:27:05,885][60144] Updated weights for policy 1, policy_version 7792 (0.0008) +[2023-10-09 04:27:06,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 15859712. Throughput: 0: 1686.5, 1: 1732.7. 
Samples: 3973446. Policy #0 lag: (min: 25.0, avg: 32.5, max: 57.0) +[2023-10-09 04:27:06,052][59242] Avg episode reward: [(0, '12.520'), (1, '14.910')] +[2023-10-09 04:27:06,053][59934] Saving new best policy, reward=12.520! +[2023-10-09 04:27:06,251][60144] Updated weights for policy 1, policy_version 7802 (0.0007) +[2023-10-09 04:27:07,147][60143] Updated weights for policy 0, policy_version 7722 (0.0007) +[2023-10-09 04:27:07,521][60143] Updated weights for policy 0, policy_version 7732 (0.0009) +[2023-10-09 04:27:07,893][60143] Updated weights for policy 0, policy_version 7742 (0.0009) +[2023-10-09 04:27:10,189][60144] Updated weights for policy 1, policy_version 7812 (0.0008) +[2023-10-09 04:27:10,562][60144] Updated weights for policy 1, policy_version 7822 (0.0008) +[2023-10-09 04:27:10,920][60144] Updated weights for policy 1, policy_version 7832 (0.0010) +[2023-10-09 04:27:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 15925248. Throughput: 0: 1708.9, 1: 1744.5. Samples: 3994786. Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) +[2023-10-09 04:27:11,052][59242] Avg episode reward: [(0, '12.160'), (1, '14.590')] +[2023-10-09 04:27:11,888][60143] Updated weights for policy 0, policy_version 7752 (0.0009) +[2023-10-09 04:27:12,269][60143] Updated weights for policy 0, policy_version 7762 (0.0008) +[2023-10-09 04:27:12,638][60143] Updated weights for policy 0, policy_version 7772 (0.0009) +[2023-10-09 04:27:14,790][60144] Updated weights for policy 1, policy_version 7842 (0.0008) +[2023-10-09 04:27:15,163][60144] Updated weights for policy 1, policy_version 7852 (0.0008) +[2023-10-09 04:27:15,535][60144] Updated weights for policy 1, policy_version 7862 (0.0009) +[2023-10-09 04:27:15,895][60144] Updated weights for policy 1, policy_version 7872 (0.0007) +[2023-10-09 04:27:16,052][59242] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 16023552. Throughput: 0: 1719.6, 1: 1721.6. 
Samples: 4015058. Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) +[2023-10-09 04:27:16,053][59242] Avg episode reward: [(0, '12.990'), (1, '13.740')] +[2023-10-09 04:27:16,066][59934] Saving new best policy, reward=12.990! +[2023-10-09 04:27:16,918][60143] Updated weights for policy 0, policy_version 7782 (0.0009) +[2023-10-09 04:27:17,286][60143] Updated weights for policy 0, policy_version 7792 (0.0008) +[2023-10-09 04:27:17,665][60143] Updated weights for policy 0, policy_version 7802 (0.0007) +[2023-10-09 04:27:19,775][60144] Updated weights for policy 1, policy_version 7882 (0.0010) +[2023-10-09 04:27:20,140][60144] Updated weights for policy 1, policy_version 7892 (0.0007) +[2023-10-09 04:27:20,508][60144] Updated weights for policy 1, policy_version 7902 (0.0008) +[2023-10-09 04:27:21,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 16089088. Throughput: 0: 1690.8, 1: 1748.8. Samples: 4025442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:27:21,053][59242] Avg episode reward: [(0, '13.260'), (1, '13.670')] +[2023-10-09 04:27:21,055][59934] Saving new best policy, reward=13.260! +[2023-10-09 04:27:21,590][60143] Updated weights for policy 0, policy_version 7812 (0.0008) +[2023-10-09 04:27:21,952][60143] Updated weights for policy 0, policy_version 7822 (0.0009) +[2023-10-09 04:27:22,326][60143] Updated weights for policy 0, policy_version 7832 (0.0007) +[2023-10-09 04:27:24,453][60144] Updated weights for policy 1, policy_version 7912 (0.0008) +[2023-10-09 04:27:24,825][60144] Updated weights for policy 1, policy_version 7922 (0.0007) +[2023-10-09 04:27:25,190][60144] Updated weights for policy 1, policy_version 7932 (0.0007) +[2023-10-09 04:27:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 16154624. Throughput: 0: 1710.6, 1: 1742.4. Samples: 4046036. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:27:26,052][59242] Avg episode reward: [(0, '13.250'), (1, '13.540')] +[2023-10-09 04:27:26,320][60143] Updated weights for policy 0, policy_version 7842 (0.0009) +[2023-10-09 04:27:26,683][60143] Updated weights for policy 0, policy_version 7852 (0.0007) +[2023-10-09 04:27:27,052][60143] Updated weights for policy 0, policy_version 7862 (0.0008) +[2023-10-09 04:27:27,419][60143] Updated weights for policy 0, policy_version 7872 (0.0010) +[2023-10-09 04:27:28,930][60144] Updated weights for policy 1, policy_version 7942 (0.0010) +[2023-10-09 04:27:29,300][60144] Updated weights for policy 1, policy_version 7952 (0.0008) +[2023-10-09 04:27:29,663][60144] Updated weights for policy 1, policy_version 7962 (0.0007) +[2023-10-09 04:27:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 16220160. Throughput: 0: 1702.3, 1: 1724.0. Samples: 4066538. Policy #0 lag: (min: 31.0, avg: 40.6, max: 63.0) +[2023-10-09 04:27:31,053][59242] Avg episode reward: [(0, '13.400'), (1, '13.230')] +[2023-10-09 04:27:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000007968_8159232.pth... +[2023-10-09 04:27:31,108][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000006336_6488064.pth +[2023-10-09 04:27:31,114][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000007968_8159232.pth +[2023-10-09 04:27:31,510][60143] Updated weights for policy 0, policy_version 7882 (0.0008) +[2023-10-09 04:27:31,889][60143] Updated weights for policy 0, policy_version 7892 (0.0010) +[2023-10-09 04:27:32,250][60143] Updated weights for policy 0, policy_version 7902 (0.0011) +[2023-10-09 04:27:32,322][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000007904_8093696.pth... 
+[2023-10-09 04:27:32,350][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000006304_6455296.pth +[2023-10-09 04:27:32,354][59934] Saving new best policy, reward=13.400! +[2023-10-09 04:27:32,386][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000007904_8093696.pth +[2023-10-09 04:27:33,695][60144] Updated weights for policy 1, policy_version 7972 (0.0008) +[2023-10-09 04:27:34,101][60144] Updated weights for policy 1, policy_version 7982 (0.0008) +[2023-10-09 04:27:34,473][60144] Updated weights for policy 1, policy_version 7992 (0.0009) +[2023-10-09 04:27:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 16285696. Throughput: 0: 1697.9, 1: 1752.4. Samples: 4077058. Policy #0 lag: (min: 31.0, avg: 40.6, max: 63.0) +[2023-10-09 04:27:36,053][59242] Avg episode reward: [(0, '12.480'), (1, '13.120')] +[2023-10-09 04:27:36,258][60143] Updated weights for policy 0, policy_version 7912 (0.0010) +[2023-10-09 04:27:36,633][60143] Updated weights for policy 0, policy_version 7922 (0.0010) +[2023-10-09 04:27:37,013][60143] Updated weights for policy 0, policy_version 7932 (0.0011) +[2023-10-09 04:27:38,146][60144] Updated weights for policy 1, policy_version 8002 (0.0011) +[2023-10-09 04:27:38,505][60144] Updated weights for policy 1, policy_version 8012 (0.0008) +[2023-10-09 04:27:38,880][60144] Updated weights for policy 1, policy_version 8022 (0.0008) +[2023-10-09 04:27:39,240][60144] Updated weights for policy 1, policy_version 8032 (0.0010) +[2023-10-09 04:27:40,958][60143] Updated weights for policy 0, policy_version 7942 (0.0008) +[2023-10-09 04:27:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 16351232. Throughput: 0: 1706.0, 1: 1718.1. Samples: 4096932. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:27:41,053][59242] Avg episode reward: [(0, '12.260'), (1, '12.560')] +[2023-10-09 04:27:41,324][60143] Updated weights for policy 0, policy_version 7952 (0.0009) +[2023-10-09 04:27:41,702][60143] Updated weights for policy 0, policy_version 7962 (0.0009) +[2023-10-09 04:27:43,095][60144] Updated weights for policy 1, policy_version 8042 (0.0010) +[2023-10-09 04:27:43,472][60144] Updated weights for policy 1, policy_version 8052 (0.0011) +[2023-10-09 04:27:43,829][60144] Updated weights for policy 1, policy_version 8062 (0.0009) +[2023-10-09 04:27:45,605][60143] Updated weights for policy 0, policy_version 7972 (0.0009) +[2023-10-09 04:27:45,995][60143] Updated weights for policy 0, policy_version 7982 (0.0008) +[2023-10-09 04:27:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 16416768. Throughput: 0: 1704.7, 1: 1717.9. Samples: 4118166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:27:46,053][59242] Avg episode reward: [(0, '12.370'), (1, '13.710')] +[2023-10-09 04:27:46,369][60143] Updated weights for policy 0, policy_version 7992 (0.0011) +[2023-10-09 04:27:47,829][60144] Updated weights for policy 1, policy_version 8072 (0.0008) +[2023-10-09 04:27:48,191][60144] Updated weights for policy 1, policy_version 8082 (0.0008) +[2023-10-09 04:27:48,559][60144] Updated weights for policy 1, policy_version 8092 (0.0007) +[2023-10-09 04:27:50,418][60143] Updated weights for policy 0, policy_version 8002 (0.0009) +[2023-10-09 04:27:50,791][60143] Updated weights for policy 0, policy_version 8012 (0.0009) +[2023-10-09 04:27:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 16482304. Throughput: 0: 1703.5, 1: 1725.4. Samples: 4127750. 
Policy #0 lag: (min: 1.0, avg: 6.4, max: 33.0) +[2023-10-09 04:27:51,053][59242] Avg episode reward: [(0, '12.590'), (1, '12.350')] +[2023-10-09 04:27:51,165][60143] Updated weights for policy 0, policy_version 8022 (0.0010) +[2023-10-09 04:27:51,538][60143] Updated weights for policy 0, policy_version 8032 (0.0009) +[2023-10-09 04:27:52,729][60144] Updated weights for policy 1, policy_version 8102 (0.0007) +[2023-10-09 04:27:53,095][60144] Updated weights for policy 1, policy_version 8112 (0.0007) +[2023-10-09 04:27:53,463][60144] Updated weights for policy 1, policy_version 8122 (0.0009) +[2023-10-09 04:27:55,376][60143] Updated weights for policy 0, policy_version 8042 (0.0009) +[2023-10-09 04:27:55,749][60143] Updated weights for policy 0, policy_version 8052 (0.0010) +[2023-10-09 04:27:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 16547840. Throughput: 0: 1709.7, 1: 1714.4. Samples: 4148872. Policy #0 lag: (min: 1.0, avg: 6.4, max: 33.0) +[2023-10-09 04:27:56,053][59242] Avg episode reward: [(0, '12.150'), (1, '12.360')] +[2023-10-09 04:27:56,116][60143] Updated weights for policy 0, policy_version 8062 (0.0007) +[2023-10-09 04:27:57,641][60144] Updated weights for policy 1, policy_version 8132 (0.0009) +[2023-10-09 04:27:58,008][60144] Updated weights for policy 1, policy_version 8142 (0.0008) +[2023-10-09 04:27:58,371][60144] Updated weights for policy 1, policy_version 8152 (0.0009) +[2023-10-09 04:28:00,162][60143] Updated weights for policy 0, policy_version 8072 (0.0008) +[2023-10-09 04:28:00,533][60143] Updated weights for policy 0, policy_version 8082 (0.0007) +[2023-10-09 04:28:00,903][60143] Updated weights for policy 0, policy_version 8092 (0.0007) +[2023-10-09 04:28:01,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 16646144. Throughput: 0: 1695.4, 1: 1740.6. Samples: 4169678. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:28:01,052][59242] Avg episode reward: [(0, '11.530'), (1, '12.700')] +[2023-10-09 04:28:02,107][60144] Updated weights for policy 1, policy_version 8162 (0.0007) +[2023-10-09 04:28:02,464][60144] Updated weights for policy 1, policy_version 8172 (0.0009) +[2023-10-09 04:28:02,843][60144] Updated weights for policy 1, policy_version 8182 (0.0007) +[2023-10-09 04:28:03,205][60144] Updated weights for policy 1, policy_version 8192 (0.0010) +[2023-10-09 04:28:04,896][60143] Updated weights for policy 0, policy_version 8102 (0.0009) +[2023-10-09 04:28:05,271][60143] Updated weights for policy 0, policy_version 8112 (0.0009) +[2023-10-09 04:28:05,641][60143] Updated weights for policy 0, policy_version 8122 (0.0010) +[2023-10-09 04:28:06,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 16711680. Throughput: 0: 1711.5, 1: 1714.8. Samples: 4179628. Policy #0 lag: (min: 2.0, avg: 6.2, max: 34.0) +[2023-10-09 04:28:06,053][59242] Avg episode reward: [(0, '11.910'), (1, '12.550')] +[2023-10-09 04:28:07,190][60144] Updated weights for policy 1, policy_version 8202 (0.0007) +[2023-10-09 04:28:07,557][60144] Updated weights for policy 1, policy_version 8212 (0.0008) +[2023-10-09 04:28:07,928][60144] Updated weights for policy 1, policy_version 8222 (0.0009) +[2023-10-09 04:28:09,464][60143] Updated weights for policy 0, policy_version 8132 (0.0008) +[2023-10-09 04:28:09,843][60143] Updated weights for policy 0, policy_version 8142 (0.0007) +[2023-10-09 04:28:10,209][60143] Updated weights for policy 0, policy_version 8152 (0.0011) +[2023-10-09 04:28:11,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 16777216. Throughput: 0: 1714.9, 1: 1724.6. Samples: 4200814. 
Policy #0 lag: (min: 2.0, avg: 6.2, max: 34.0) +[2023-10-09 04:28:11,053][59242] Avg episode reward: [(0, '12.030'), (1, '13.070')] +[2023-10-09 04:28:11,743][60144] Updated weights for policy 1, policy_version 8232 (0.0009) +[2023-10-09 04:28:12,104][60144] Updated weights for policy 1, policy_version 8242 (0.0009) +[2023-10-09 04:28:12,476][60144] Updated weights for policy 1, policy_version 8252 (0.0009) +[2023-10-09 04:28:14,287][60143] Updated weights for policy 0, policy_version 8162 (0.0009) +[2023-10-09 04:28:14,666][60143] Updated weights for policy 0, policy_version 8172 (0.0007) +[2023-10-09 04:28:15,046][60143] Updated weights for policy 0, policy_version 8182 (0.0009) +[2023-10-09 04:28:15,408][60143] Updated weights for policy 0, policy_version 8192 (0.0009) +[2023-10-09 04:28:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 16842752. Throughput: 0: 1691.6, 1: 1742.6. Samples: 4221080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:28:16,053][59242] Avg episode reward: [(0, '12.040'), (1, '12.390')] +[2023-10-09 04:28:16,394][60144] Updated weights for policy 1, policy_version 8262 (0.0008) +[2023-10-09 04:28:16,760][60144] Updated weights for policy 1, policy_version 8272 (0.0007) +[2023-10-09 04:28:17,130][60144] Updated weights for policy 1, policy_version 8282 (0.0008) +[2023-10-09 04:28:19,426][60143] Updated weights for policy 0, policy_version 8202 (0.0009) +[2023-10-09 04:28:19,805][60143] Updated weights for policy 0, policy_version 8212 (0.0007) +[2023-10-09 04:28:20,179][60143] Updated weights for policy 0, policy_version 8222 (0.0009) +[2023-10-09 04:28:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 16908288. Throughput: 0: 1718.4, 1: 1713.4. Samples: 4231488. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:28:21,053][59242] Avg episode reward: [(0, '12.580'), (1, '12.220')] +[2023-10-09 04:28:21,091][60144] Updated weights for policy 1, policy_version 8292 (0.0008) +[2023-10-09 04:28:21,493][60144] Updated weights for policy 1, policy_version 8302 (0.0007) +[2023-10-09 04:28:21,873][60144] Updated weights for policy 1, policy_version 8312 (0.0010) +[2023-10-09 04:28:24,158][60143] Updated weights for policy 0, policy_version 8232 (0.0009) +[2023-10-09 04:28:24,526][60143] Updated weights for policy 0, policy_version 8242 (0.0011) +[2023-10-09 04:28:24,910][60143] Updated weights for policy 0, policy_version 8252 (0.0011) +[2023-10-09 04:28:25,675][60144] Updated weights for policy 1, policy_version 8322 (0.0009) +[2023-10-09 04:28:26,039][60144] Updated weights for policy 1, policy_version 8332 (0.0008) +[2023-10-09 04:28:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 16973824. Throughput: 0: 1701.9, 1: 1746.8. Samples: 4252122. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 04:28:26,052][59242] Avg episode reward: [(0, '11.940'), (1, '12.610')] +[2023-10-09 04:28:26,408][60144] Updated weights for policy 1, policy_version 8342 (0.0008) +[2023-10-09 04:28:26,772][60144] Updated weights for policy 1, policy_version 8352 (0.0010) +[2023-10-09 04:28:28,953][60143] Updated weights for policy 0, policy_version 8262 (0.0010) +[2023-10-09 04:28:29,323][60143] Updated weights for policy 0, policy_version 8272 (0.0008) +[2023-10-09 04:28:29,698][60143] Updated weights for policy 0, policy_version 8282 (0.0010) +[2023-10-09 04:28:30,709][60144] Updated weights for policy 1, policy_version 8362 (0.0007) +[2023-10-09 04:28:31,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17039360. Throughput: 0: 1688.3, 1: 1738.6. Samples: 4272378. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 04:28:31,053][59242] Avg episode reward: [(0, '12.130'), (1, '12.970')] +[2023-10-09 04:28:31,074][60144] Updated weights for policy 1, policy_version 8372 (0.0007) +[2023-10-09 04:28:31,454][60144] Updated weights for policy 1, policy_version 8382 (0.0008) +[2023-10-09 04:28:33,817][60143] Updated weights for policy 0, policy_version 8292 (0.0007) +[2023-10-09 04:28:34,214][60143] Updated weights for policy 0, policy_version 8302 (0.0009) +[2023-10-09 04:28:34,575][60143] Updated weights for policy 0, policy_version 8312 (0.0009) +[2023-10-09 04:28:35,431][60144] Updated weights for policy 1, policy_version 8392 (0.0008) +[2023-10-09 04:28:35,807][60144] Updated weights for policy 1, policy_version 8402 (0.0009) +[2023-10-09 04:28:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 17104896. Throughput: 0: 1716.0, 1: 1734.5. Samples: 4283020. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:28:36,053][59242] Avg episode reward: [(0, '12.640'), (1, '12.900')] +[2023-10-09 04:28:36,182][60144] Updated weights for policy 1, policy_version 8412 (0.0009) +[2023-10-09 04:28:38,462][60143] Updated weights for policy 0, policy_version 8322 (0.0008) +[2023-10-09 04:28:38,836][60143] Updated weights for policy 0, policy_version 8332 (0.0009) +[2023-10-09 04:28:39,211][60143] Updated weights for policy 0, policy_version 8342 (0.0012) +[2023-10-09 04:28:39,581][60143] Updated weights for policy 0, policy_version 8352 (0.0008) +[2023-10-09 04:28:40,112][60144] Updated weights for policy 1, policy_version 8422 (0.0010) +[2023-10-09 04:28:40,475][60144] Updated weights for policy 1, policy_version 8432 (0.0010) +[2023-10-09 04:28:40,846][60144] Updated weights for policy 1, policy_version 8442 (0.0009) +[2023-10-09 04:28:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 17170432. Throughput: 0: 1682.9, 1: 1745.1. 
Samples: 4303132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:28:41,053][59242] Avg episode reward: [(0, '12.740'), (1, '13.650')] +[2023-10-09 04:28:43,589][60143] Updated weights for policy 0, policy_version 8362 (0.0008) +[2023-10-09 04:28:43,959][60143] Updated weights for policy 0, policy_version 8372 (0.0011) +[2023-10-09 04:28:44,336][60143] Updated weights for policy 0, policy_version 8382 (0.0010) +[2023-10-09 04:28:44,808][60144] Updated weights for policy 1, policy_version 8452 (0.0008) +[2023-10-09 04:28:45,187][60144] Updated weights for policy 1, policy_version 8462 (0.0007) +[2023-10-09 04:28:45,560][60144] Updated weights for policy 1, policy_version 8472 (0.0009) +[2023-10-09 04:28:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 17268736. Throughput: 0: 1695.9, 1: 1714.7. Samples: 4323158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:28:46,053][59242] Avg episode reward: [(0, '13.020'), (1, '13.390')] +[2023-10-09 04:28:48,358][60143] Updated weights for policy 0, policy_version 8392 (0.0009) +[2023-10-09 04:28:48,719][60143] Updated weights for policy 0, policy_version 8402 (0.0008) +[2023-10-09 04:28:49,087][60143] Updated weights for policy 0, policy_version 8412 (0.0008) +[2023-10-09 04:28:49,683][60144] Updated weights for policy 1, policy_version 8482 (0.0009) +[2023-10-09 04:28:50,047][60144] Updated weights for policy 1, policy_version 8492 (0.0008) +[2023-10-09 04:28:50,423][60144] Updated weights for policy 1, policy_version 8502 (0.0010) +[2023-10-09 04:28:50,780][60144] Updated weights for policy 1, policy_version 8512 (0.0010) +[2023-10-09 04:28:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 17334272. Throughput: 0: 1696.2, 1: 1737.6. Samples: 4334148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:28:51,053][59242] Avg episode reward: [(0, '13.320'), (1, '13.600')] +[2023-10-09 04:28:53,094][60143] Updated weights for policy 0, policy_version 8422 (0.0008) +[2023-10-09 04:28:53,464][60143] Updated weights for policy 0, policy_version 8432 (0.0009) +[2023-10-09 04:28:53,844][60143] Updated weights for policy 0, policy_version 8442 (0.0008) +[2023-10-09 04:28:54,681][60144] Updated weights for policy 1, policy_version 8522 (0.0008) +[2023-10-09 04:28:55,053][60144] Updated weights for policy 1, policy_version 8532 (0.0010) +[2023-10-09 04:28:55,415][60144] Updated weights for policy 1, policy_version 8542 (0.0009) +[2023-10-09 04:28:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 17399808. Throughput: 0: 1678.8, 1: 1729.0. Samples: 4354164. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:28:56,053][59242] Avg episode reward: [(0, '13.140'), (1, '13.350')] +[2023-10-09 04:28:57,754][60143] Updated weights for policy 0, policy_version 8452 (0.0008) +[2023-10-09 04:28:58,117][60143] Updated weights for policy 0, policy_version 8462 (0.0009) +[2023-10-09 04:28:58,487][60143] Updated weights for policy 0, policy_version 8472 (0.0009) +[2023-10-09 04:28:59,310][60144] Updated weights for policy 1, policy_version 8552 (0.0009) +[2023-10-09 04:28:59,680][60144] Updated weights for policy 1, policy_version 8562 (0.0009) +[2023-10-09 04:29:00,059][60144] Updated weights for policy 1, policy_version 8572 (0.0009) +[2023-10-09 04:29:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17465344. Throughput: 0: 1710.1, 1: 1699.2. Samples: 4374502. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:29:01,053][59242] Avg episode reward: [(0, '13.130'), (1, '14.240')] +[2023-10-09 04:29:02,504][60143] Updated weights for policy 0, policy_version 8482 (0.0008) +[2023-10-09 04:29:02,881][60143] Updated weights for policy 0, policy_version 8492 (0.0008) +[2023-10-09 04:29:03,249][60143] Updated weights for policy 0, policy_version 8502 (0.0009) +[2023-10-09 04:29:03,622][60143] Updated weights for policy 0, policy_version 8512 (0.0009) +[2023-10-09 04:29:04,040][60144] Updated weights for policy 1, policy_version 8582 (0.0009) +[2023-10-09 04:29:04,404][60144] Updated weights for policy 1, policy_version 8592 (0.0010) +[2023-10-09 04:29:04,777][60144] Updated weights for policy 1, policy_version 8602 (0.0010) +[2023-10-09 04:29:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17530880. Throughput: 0: 1686.1, 1: 1730.9. Samples: 4385254. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-09 04:29:06,053][59242] Avg episode reward: [(0, '13.080'), (1, '13.810')] +[2023-10-09 04:29:07,713][60143] Updated weights for policy 0, policy_version 8522 (0.0008) +[2023-10-09 04:29:08,082][60143] Updated weights for policy 0, policy_version 8532 (0.0009) +[2023-10-09 04:29:08,457][60143] Updated weights for policy 0, policy_version 8542 (0.0011) +[2023-10-09 04:29:08,845][60144] Updated weights for policy 1, policy_version 8612 (0.0010) +[2023-10-09 04:29:09,247][60144] Updated weights for policy 1, policy_version 8622 (0.0011) +[2023-10-09 04:29:09,622][60144] Updated weights for policy 1, policy_version 8632 (0.0010) +[2023-10-09 04:29:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17596416. Throughput: 0: 1701.4, 1: 1700.7. Samples: 4405218. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-09 04:29:11,053][59242] Avg episode reward: [(0, '12.670'), (1, '14.160')] +[2023-10-09 04:29:12,170][60143] Updated weights for policy 0, policy_version 8552 (0.0008) +[2023-10-09 04:29:12,531][60143] Updated weights for policy 0, policy_version 8562 (0.0008) +[2023-10-09 04:29:12,906][60143] Updated weights for policy 0, policy_version 8572 (0.0008) +[2023-10-09 04:29:13,644][60144] Updated weights for policy 1, policy_version 8642 (0.0010) +[2023-10-09 04:29:14,015][60144] Updated weights for policy 1, policy_version 8652 (0.0009) +[2023-10-09 04:29:14,385][60144] Updated weights for policy 1, policy_version 8662 (0.0008) +[2023-10-09 04:29:14,759][60144] Updated weights for policy 1, policy_version 8672 (0.0008) +[2023-10-09 04:29:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17661952. Throughput: 0: 1721.2, 1: 1693.2. Samples: 4426026. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:29:16,053][59242] Avg episode reward: [(0, '13.040'), (1, '14.090')] +[2023-10-09 04:29:16,848][60143] Updated weights for policy 0, policy_version 8582 (0.0010) +[2023-10-09 04:29:17,222][60143] Updated weights for policy 0, policy_version 8592 (0.0008) +[2023-10-09 04:29:17,602][60143] Updated weights for policy 0, policy_version 8602 (0.0008) +[2023-10-09 04:29:18,826][60144] Updated weights for policy 1, policy_version 8682 (0.0009) +[2023-10-09 04:29:19,191][60144] Updated weights for policy 1, policy_version 8692 (0.0008) +[2023-10-09 04:29:19,563][60144] Updated weights for policy 1, policy_version 8702 (0.0008) +[2023-10-09 04:29:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17727488. Throughput: 0: 1693.4, 1: 1717.5. Samples: 4436508. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:29:21,053][59242] Avg episode reward: [(0, '13.640'), (1, '13.840')] +[2023-10-09 04:29:21,054][59934] Saving new best policy, reward=13.640! +[2023-10-09 04:29:21,722][60143] Updated weights for policy 0, policy_version 8612 (0.0009) +[2023-10-09 04:29:22,114][60143] Updated weights for policy 0, policy_version 8622 (0.0010) +[2023-10-09 04:29:22,482][60143] Updated weights for policy 0, policy_version 8632 (0.0009) +[2023-10-09 04:29:23,316][60144] Updated weights for policy 1, policy_version 8712 (0.0009) +[2023-10-09 04:29:23,700][60144] Updated weights for policy 1, policy_version 8722 (0.0009) +[2023-10-09 04:29:24,067][60144] Updated weights for policy 1, policy_version 8732 (0.0008) +[2023-10-09 04:29:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17793024. Throughput: 0: 1718.6, 1: 1689.9. Samples: 4456512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:29:26,053][59242] Avg episode reward: [(0, '13.090'), (1, '13.940')] +[2023-10-09 04:29:26,519][60143] Updated weights for policy 0, policy_version 8642 (0.0009) +[2023-10-09 04:29:26,890][60143] Updated weights for policy 0, policy_version 8652 (0.0008) +[2023-10-09 04:29:27,261][60143] Updated weights for policy 0, policy_version 8662 (0.0007) +[2023-10-09 04:29:27,636][60143] Updated weights for policy 0, policy_version 8672 (0.0007) +[2023-10-09 04:29:27,978][60144] Updated weights for policy 1, policy_version 8742 (0.0011) +[2023-10-09 04:29:28,346][60144] Updated weights for policy 1, policy_version 8752 (0.0010) +[2023-10-09 04:29:28,709][60144] Updated weights for policy 1, policy_version 8762 (0.0010) +[2023-10-09 04:29:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17858560. Throughput: 0: 1718.9, 1: 1713.9. Samples: 4477634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:29:31,053][59242] Avg episode reward: [(0, '13.520'), (1, '14.050')] +[2023-10-09 04:29:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000008672_8880128.pth... +[2023-10-09 04:29:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000008768_8978432.pth... +[2023-10-09 04:29:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000007104_7274496.pth +[2023-10-09 04:29:31,106][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000007168_7340032.pth +[2023-10-09 04:29:31,617][60143] Updated weights for policy 0, policy_version 8682 (0.0007) +[2023-10-09 04:29:31,985][60143] Updated weights for policy 0, policy_version 8692 (0.0007) +[2023-10-09 04:29:32,361][60143] Updated weights for policy 0, policy_version 8702 (0.0008) +[2023-10-09 04:29:32,768][60144] Updated weights for policy 1, policy_version 8772 (0.0009) +[2023-10-09 04:29:33,138][60144] Updated weights for policy 1, policy_version 8782 (0.0009) +[2023-10-09 04:29:33,513][60144] Updated weights for policy 1, policy_version 8792 (0.0010) +[2023-10-09 04:29:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17924096. Throughput: 0: 1704.0, 1: 1697.6. Samples: 4487218. Policy #0 lag: (min: 15.0, avg: 22.8, max: 47.0) +[2023-10-09 04:29:36,053][59242] Avg episode reward: [(0, '14.270'), (1, '14.090')] +[2023-10-09 04:29:36,438][60143] Updated weights for policy 0, policy_version 8712 (0.0009) +[2023-10-09 04:29:36,814][60143] Updated weights for policy 0, policy_version 8722 (0.0008) +[2023-10-09 04:29:37,189][60143] Updated weights for policy 0, policy_version 8732 (0.0011) +[2023-10-09 04:29:37,324][59934] Saving new best policy, reward=14.270! 
+[2023-10-09 04:29:37,523][60144] Updated weights for policy 1, policy_version 8802 (0.0010) +[2023-10-09 04:29:37,889][60144] Updated weights for policy 1, policy_version 8812 (0.0007) +[2023-10-09 04:29:38,256][60144] Updated weights for policy 1, policy_version 8822 (0.0007) +[2023-10-09 04:29:38,624][60144] Updated weights for policy 1, policy_version 8832 (0.0007) +[2023-10-09 04:29:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 17989632. Throughput: 0: 1725.3, 1: 1698.0. Samples: 4508214. Policy #0 lag: (min: 15.0, avg: 22.8, max: 47.0) +[2023-10-09 04:29:41,053][59242] Avg episode reward: [(0, '13.690'), (1, '14.420')] +[2023-10-09 04:29:41,334][60143] Updated weights for policy 0, policy_version 8742 (0.0008) +[2023-10-09 04:29:41,707][60143] Updated weights for policy 0, policy_version 8752 (0.0008) +[2023-10-09 04:29:42,074][60143] Updated weights for policy 0, policy_version 8762 (0.0007) +[2023-10-09 04:29:42,613][60144] Updated weights for policy 1, policy_version 8842 (0.0009) +[2023-10-09 04:29:42,978][60144] Updated weights for policy 1, policy_version 8852 (0.0008) +[2023-10-09 04:29:43,346][60144] Updated weights for policy 1, policy_version 8862 (0.0009) +[2023-10-09 04:29:45,917][60143] Updated weights for policy 0, policy_version 8772 (0.0008) +[2023-10-09 04:29:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 18055168. Throughput: 0: 1718.9, 1: 1725.0. Samples: 4529478. Policy #0 lag: (min: 9.0, avg: 22.0, max: 41.0) +[2023-10-09 04:29:46,053][59242] Avg episode reward: [(0, '14.350'), (1, '15.110')] +[2023-10-09 04:29:46,302][60143] Updated weights for policy 0, policy_version 8782 (0.0009) +[2023-10-09 04:29:46,675][60143] Updated weights for policy 0, policy_version 8792 (0.0007) +[2023-10-09 04:29:46,972][59934] Saving new best policy, reward=14.350! 
+[2023-10-09 04:29:47,199][60144] Updated weights for policy 1, policy_version 8872 (0.0008) +[2023-10-09 04:29:47,570][60144] Updated weights for policy 1, policy_version 8882 (0.0009) +[2023-10-09 04:29:47,944][60144] Updated weights for policy 1, policy_version 8892 (0.0008) +[2023-10-09 04:29:50,419][60143] Updated weights for policy 0, policy_version 8802 (0.0010) +[2023-10-09 04:29:50,791][60143] Updated weights for policy 0, policy_version 8812 (0.0010) +[2023-10-09 04:29:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 18120704. Throughput: 0: 1714.1, 1: 1698.3. Samples: 4538812. Policy #0 lag: (min: 9.0, avg: 22.0, max: 41.0) +[2023-10-09 04:29:51,052][59242] Avg episode reward: [(0, '14.600'), (1, '15.060')] +[2023-10-09 04:29:51,167][60143] Updated weights for policy 0, policy_version 8822 (0.0011) +[2023-10-09 04:29:51,537][60143] Updated weights for policy 0, policy_version 8832 (0.0010) +[2023-10-09 04:29:51,537][59934] Saving new best policy, reward=14.600! +[2023-10-09 04:29:51,864][60144] Updated weights for policy 1, policy_version 8902 (0.0008) +[2023-10-09 04:29:52,239][60144] Updated weights for policy 1, policy_version 8912 (0.0008) +[2023-10-09 04:29:52,604][60144] Updated weights for policy 1, policy_version 8922 (0.0010) +[2023-10-09 04:29:55,254][60143] Updated weights for policy 0, policy_version 8842 (0.0008) +[2023-10-09 04:29:55,624][60143] Updated weights for policy 0, policy_version 8852 (0.0009) +[2023-10-09 04:29:56,008][60143] Updated weights for policy 0, policy_version 8862 (0.0007) +[2023-10-09 04:29:56,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 18186240. Throughput: 0: 1717.2, 1: 1728.2. Samples: 4560262. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-09 04:29:56,054][59242] Avg episode reward: [(0, '15.150'), (1, '15.120')] +[2023-10-09 04:29:56,077][59934] Saving new best policy, reward=15.150! 
+[2023-10-09 04:29:56,443][60144] Updated weights for policy 1, policy_version 8932 (0.0008) +[2023-10-09 04:29:56,848][60144] Updated weights for policy 1, policy_version 8942 (0.0007) +[2023-10-09 04:29:57,219][60144] Updated weights for policy 1, policy_version 8952 (0.0008) +[2023-10-09 04:30:00,085][60143] Updated weights for policy 0, policy_version 8872 (0.0008) +[2023-10-09 04:30:00,458][60143] Updated weights for policy 0, policy_version 8882 (0.0008) +[2023-10-09 04:30:00,835][60143] Updated weights for policy 0, policy_version 8892 (0.0007) +[2023-10-09 04:30:01,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 18284544. Throughput: 0: 1697.4, 1: 1742.1. Samples: 4580806. Policy #0 lag: (min: 17.0, avg: 17.7, max: 34.0) +[2023-10-09 04:30:01,053][59242] Avg episode reward: [(0, '15.510'), (1, '15.480')] +[2023-10-09 04:30:01,063][59934] Saving new best policy, reward=15.510! +[2023-10-09 04:30:01,064][60003] Saving new best policy, reward=15.480! +[2023-10-09 04:30:01,323][60144] Updated weights for policy 1, policy_version 8962 (0.0008) +[2023-10-09 04:30:01,683][60144] Updated weights for policy 1, policy_version 8972 (0.0009) +[2023-10-09 04:30:02,051][60144] Updated weights for policy 1, policy_version 8982 (0.0009) +[2023-10-09 04:30:02,420][60144] Updated weights for policy 1, policy_version 8992 (0.0007) +[2023-10-09 04:30:04,829][60143] Updated weights for policy 0, policy_version 8902 (0.0007) +[2023-10-09 04:30:05,201][60143] Updated weights for policy 0, policy_version 8912 (0.0009) +[2023-10-09 04:30:05,560][60143] Updated weights for policy 0, policy_version 8922 (0.0008) +[2023-10-09 04:30:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 18350080. Throughput: 0: 1718.1, 1: 1711.8. Samples: 4590854. 
Policy #0 lag: (min: 17.0, avg: 17.7, max: 34.0) +[2023-10-09 04:30:06,053][59242] Avg episode reward: [(0, '15.170'), (1, '15.980')] +[2023-10-09 04:30:06,337][60144] Updated weights for policy 1, policy_version 9002 (0.0008) +[2023-10-09 04:30:06,709][60144] Updated weights for policy 1, policy_version 9012 (0.0008) +[2023-10-09 04:30:07,080][60144] Updated weights for policy 1, policy_version 9022 (0.0009) +[2023-10-09 04:30:07,145][60003] Saving new best policy, reward=15.980! +[2023-10-09 04:30:09,624][60143] Updated weights for policy 0, policy_version 8932 (0.0010) +[2023-10-09 04:30:10,019][60143] Updated weights for policy 0, policy_version 8942 (0.0009) +[2023-10-09 04:30:10,390][60143] Updated weights for policy 0, policy_version 8952 (0.0008) +[2023-10-09 04:30:11,011][60144] Updated weights for policy 1, policy_version 9032 (0.0007) +[2023-10-09 04:30:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 18415616. Throughput: 0: 1720.8, 1: 1735.3. Samples: 4612038. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:30:11,053][59242] Avg episode reward: [(0, '14.440'), (1, '16.150')] +[2023-10-09 04:30:11,380][60144] Updated weights for policy 1, policy_version 9042 (0.0007) +[2023-10-09 04:30:11,742][60144] Updated weights for policy 1, policy_version 9052 (0.0009) +[2023-10-09 04:30:11,886][60003] Saving new best policy, reward=16.150! 
+[2023-10-09 04:30:14,325][60143] Updated weights for policy 0, policy_version 8962 (0.0009) +[2023-10-09 04:30:14,707][60143] Updated weights for policy 0, policy_version 8972 (0.0008) +[2023-10-09 04:30:15,076][60143] Updated weights for policy 0, policy_version 8982 (0.0007) +[2023-10-09 04:30:15,448][60143] Updated weights for policy 0, policy_version 8992 (0.0008) +[2023-10-09 04:30:15,660][60144] Updated weights for policy 1, policy_version 9062 (0.0008) +[2023-10-09 04:30:16,028][60144] Updated weights for policy 1, policy_version 9072 (0.0007) +[2023-10-09 04:30:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 18481152. Throughput: 0: 1696.1, 1: 1732.8. Samples: 4631934. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 04:30:16,053][59242] Avg episode reward: [(0, '14.650'), (1, '15.980')] +[2023-10-09 04:30:16,398][60144] Updated weights for policy 1, policy_version 9082 (0.0007) +[2023-10-09 04:30:19,466][60143] Updated weights for policy 0, policy_version 9002 (0.0010) +[2023-10-09 04:30:19,846][60143] Updated weights for policy 0, policy_version 9012 (0.0008) +[2023-10-09 04:30:20,183][60144] Updated weights for policy 1, policy_version 9092 (0.0007) +[2023-10-09 04:30:20,214][60143] Updated weights for policy 0, policy_version 9022 (0.0008) +[2023-10-09 04:30:20,556][60144] Updated weights for policy 1, policy_version 9102 (0.0009) +[2023-10-09 04:30:20,922][60144] Updated weights for policy 1, policy_version 9112 (0.0008) +[2023-10-09 04:30:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 18546688. Throughput: 0: 1726.2, 1: 1731.2. Samples: 4642804. 
Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) +[2023-10-09 04:30:21,053][59242] Avg episode reward: [(0, '14.290'), (1, '15.630')] +[2023-10-09 04:30:24,103][60143] Updated weights for policy 0, policy_version 9032 (0.0008) +[2023-10-09 04:30:24,463][60143] Updated weights for policy 0, policy_version 9042 (0.0008) +[2023-10-09 04:30:24,830][60143] Updated weights for policy 0, policy_version 9052 (0.0008) +[2023-10-09 04:30:24,951][60144] Updated weights for policy 1, policy_version 9122 (0.0009) +[2023-10-09 04:30:25,320][60144] Updated weights for policy 1, policy_version 9132 (0.0009) +[2023-10-09 04:30:25,681][60144] Updated weights for policy 1, policy_version 9142 (0.0008) +[2023-10-09 04:30:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 18612224. Throughput: 0: 1709.2, 1: 1739.6. Samples: 4663410. Policy #0 lag: (min: 31.0, avg: 34.9, max: 63.0) +[2023-10-09 04:30:26,053][59242] Avg episode reward: [(0, '14.140'), (1, '16.360')] +[2023-10-09 04:30:26,060][60144] Updated weights for policy 1, policy_version 9152 (0.0010) +[2023-10-09 04:30:26,060][60003] Saving new best policy, reward=16.360! +[2023-10-09 04:30:28,863][60143] Updated weights for policy 0, policy_version 9062 (0.0009) +[2023-10-09 04:30:29,242][60143] Updated weights for policy 0, policy_version 9072 (0.0010) +[2023-10-09 04:30:29,616][60143] Updated weights for policy 0, policy_version 9082 (0.0010) +[2023-10-09 04:30:30,152][60144] Updated weights for policy 1, policy_version 9162 (0.0011) +[2023-10-09 04:30:30,519][60144] Updated weights for policy 1, policy_version 9172 (0.0010) +[2023-10-09 04:30:30,892][60144] Updated weights for policy 1, policy_version 9182 (0.0007) +[2023-10-09 04:30:31,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 18710528. Throughput: 0: 1694.8, 1: 1717.5. Samples: 4683032. 
Policy #0 lag: (min: 26.0, avg: 28.8, max: 58.0) +[2023-10-09 04:30:31,053][59242] Avg episode reward: [(0, '13.560'), (1, '15.780')] +[2023-10-09 04:30:33,514][60143] Updated weights for policy 0, policy_version 9092 (0.0009) +[2023-10-09 04:30:33,884][60143] Updated weights for policy 0, policy_version 9102 (0.0009) +[2023-10-09 04:30:34,259][60143] Updated weights for policy 0, policy_version 9112 (0.0007) +[2023-10-09 04:30:34,830][60144] Updated weights for policy 1, policy_version 9192 (0.0008) +[2023-10-09 04:30:35,197][60144] Updated weights for policy 1, policy_version 9202 (0.0008) +[2023-10-09 04:30:35,559][60144] Updated weights for policy 1, policy_version 9212 (0.0009) +[2023-10-09 04:30:36,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 18776064. Throughput: 0: 1721.8, 1: 1735.1. Samples: 4694372. Policy #0 lag: (min: 26.0, avg: 28.8, max: 58.0) +[2023-10-09 04:30:36,053][59242] Avg episode reward: [(0, '12.680'), (1, '15.750')] +[2023-10-09 04:30:38,249][60143] Updated weights for policy 0, policy_version 9122 (0.0008) +[2023-10-09 04:30:38,625][60143] Updated weights for policy 0, policy_version 9132 (0.0008) +[2023-10-09 04:30:38,995][60143] Updated weights for policy 0, policy_version 9142 (0.0009) +[2023-10-09 04:30:39,370][60143] Updated weights for policy 0, policy_version 9152 (0.0008) +[2023-10-09 04:30:39,398][60144] Updated weights for policy 1, policy_version 9222 (0.0008) +[2023-10-09 04:30:39,772][60144] Updated weights for policy 1, policy_version 9232 (0.0008) +[2023-10-09 04:30:40,140][60144] Updated weights for policy 1, policy_version 9242 (0.0011) +[2023-10-09 04:30:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 18841600. Throughput: 0: 1694.5, 1: 1727.3. Samples: 4714244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:30:41,052][59242] Avg episode reward: [(0, '12.690'), (1, '16.050')] +[2023-10-09 04:30:43,437][60143] Updated weights for policy 0, policy_version 9162 (0.0008) +[2023-10-09 04:30:43,805][60143] Updated weights for policy 0, policy_version 9172 (0.0008) +[2023-10-09 04:30:44,130][60144] Updated weights for policy 1, policy_version 9252 (0.0009) +[2023-10-09 04:30:44,172][60143] Updated weights for policy 0, policy_version 9182 (0.0008) +[2023-10-09 04:30:44,523][60144] Updated weights for policy 1, policy_version 9262 (0.0009) +[2023-10-09 04:30:44,893][60144] Updated weights for policy 1, policy_version 9272 (0.0008) +[2023-10-09 04:30:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 18907136. Throughput: 0: 1712.7, 1: 1705.1. Samples: 4734604. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:30:46,053][59242] Avg episode reward: [(0, '12.960'), (1, '15.920')] +[2023-10-09 04:30:48,123][60143] Updated weights for policy 0, policy_version 9192 (0.0008) +[2023-10-09 04:30:48,498][60143] Updated weights for policy 0, policy_version 9202 (0.0008) +[2023-10-09 04:30:48,724][60144] Updated weights for policy 1, policy_version 9282 (0.0007) +[2023-10-09 04:30:48,866][60143] Updated weights for policy 0, policy_version 9212 (0.0008) +[2023-10-09 04:30:49,086][60144] Updated weights for policy 1, policy_version 9292 (0.0010) +[2023-10-09 04:30:49,449][60144] Updated weights for policy 1, policy_version 9302 (0.0009) +[2023-10-09 04:30:49,815][60144] Updated weights for policy 1, policy_version 9312 (0.0008) +[2023-10-09 04:30:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 18972672. Throughput: 0: 1703.6, 1: 1737.9. Samples: 4745718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:30:51,053][59242] Avg episode reward: [(0, '12.960'), (1, '15.990')] +[2023-10-09 04:30:52,737][60143] Updated weights for policy 0, policy_version 9222 (0.0008) +[2023-10-09 04:30:53,117][60143] Updated weights for policy 0, policy_version 9232 (0.0007) +[2023-10-09 04:30:53,487][60143] Updated weights for policy 0, policy_version 9242 (0.0009) +[2023-10-09 04:30:53,843][60144] Updated weights for policy 1, policy_version 9322 (0.0009) +[2023-10-09 04:30:54,217][60144] Updated weights for policy 1, policy_version 9332 (0.0008) +[2023-10-09 04:30:54,583][60144] Updated weights for policy 1, policy_version 9342 (0.0008) +[2023-10-09 04:30:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 19038208. Throughput: 0: 1690.8, 1: 1709.8. Samples: 4765064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:30:56,053][59242] Avg episode reward: [(0, '12.990'), (1, '15.920')] +[2023-10-09 04:30:57,394][60143] Updated weights for policy 0, policy_version 9252 (0.0010) +[2023-10-09 04:30:57,792][60143] Updated weights for policy 0, policy_version 9262 (0.0009) +[2023-10-09 04:30:58,164][60143] Updated weights for policy 0, policy_version 9272 (0.0010) +[2023-10-09 04:30:58,340][60144] Updated weights for policy 1, policy_version 9352 (0.0009) +[2023-10-09 04:30:58,702][60144] Updated weights for policy 1, policy_version 9362 (0.0010) +[2023-10-09 04:30:59,079][60144] Updated weights for policy 1, policy_version 9372 (0.0009) +[2023-10-09 04:31:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 19103744. Throughput: 0: 1714.8, 1: 1712.6. Samples: 4786168. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:01,053][59242] Avg episode reward: [(0, '13.700'), (1, '15.930')] +[2023-10-09 04:31:02,142][60143] Updated weights for policy 0, policy_version 9282 (0.0008) +[2023-10-09 04:31:02,503][60143] Updated weights for policy 0, policy_version 9292 (0.0007) +[2023-10-09 04:31:02,873][60143] Updated weights for policy 0, policy_version 9302 (0.0007) +[2023-10-09 04:31:03,198][60144] Updated weights for policy 1, policy_version 9382 (0.0008) +[2023-10-09 04:31:03,245][60143] Updated weights for policy 0, policy_version 9312 (0.0007) +[2023-10-09 04:31:03,575][60144] Updated weights for policy 1, policy_version 9392 (0.0010) +[2023-10-09 04:31:03,939][60144] Updated weights for policy 1, policy_version 9402 (0.0009) +[2023-10-09 04:31:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 19169280. Throughput: 0: 1688.4, 1: 1718.5. Samples: 4796116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:06,053][59242] Avg episode reward: [(0, '14.200'), (1, '15.880')] +[2023-10-09 04:31:07,439][60143] Updated weights for policy 0, policy_version 9322 (0.0008) +[2023-10-09 04:31:07,817][60143] Updated weights for policy 0, policy_version 9332 (0.0010) +[2023-10-09 04:31:08,011][60144] Updated weights for policy 1, policy_version 9412 (0.0009) +[2023-10-09 04:31:08,179][60143] Updated weights for policy 0, policy_version 9342 (0.0007) +[2023-10-09 04:31:08,386][60144] Updated weights for policy 1, policy_version 9422 (0.0009) +[2023-10-09 04:31:08,751][60144] Updated weights for policy 1, policy_version 9432 (0.0009) +[2023-10-09 04:31:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 19234816. Throughput: 0: 1699.5, 1: 1705.6. Samples: 4816640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:11,053][59242] Avg episode reward: [(0, '13.930'), (1, '15.740')] +[2023-10-09 04:31:12,319][60143] Updated weights for policy 0, policy_version 9352 (0.0010) +[2023-10-09 04:31:12,596][60144] Updated weights for policy 1, policy_version 9442 (0.0010) +[2023-10-09 04:31:12,692][60143] Updated weights for policy 0, policy_version 9362 (0.0009) +[2023-10-09 04:31:12,963][60144] Updated weights for policy 1, policy_version 9452 (0.0008) +[2023-10-09 04:31:13,067][60143] Updated weights for policy 0, policy_version 9372 (0.0009) +[2023-10-09 04:31:13,335][60144] Updated weights for policy 1, policy_version 9462 (0.0009) +[2023-10-09 04:31:13,704][60144] Updated weights for policy 1, policy_version 9472 (0.0008) +[2023-10-09 04:31:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 19300352. Throughput: 0: 1713.8, 1: 1726.5. Samples: 4837844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:16,053][59242] Avg episode reward: [(0, '14.200'), (1, '16.280')] +[2023-10-09 04:31:16,889][60143] Updated weights for policy 0, policy_version 9382 (0.0008) +[2023-10-09 04:31:17,257][60143] Updated weights for policy 0, policy_version 9392 (0.0007) +[2023-10-09 04:31:17,620][60144] Updated weights for policy 1, policy_version 9482 (0.0009) +[2023-10-09 04:31:17,633][60143] Updated weights for policy 0, policy_version 9402 (0.0007) +[2023-10-09 04:31:17,994][60144] Updated weights for policy 1, policy_version 9492 (0.0009) +[2023-10-09 04:31:18,374][60144] Updated weights for policy 1, policy_version 9502 (0.0008) +[2023-10-09 04:31:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 19365888. Throughput: 0: 1684.8, 1: 1706.7. Samples: 4846990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:21,053][59242] Avg episode reward: [(0, '14.850'), (1, '16.220')] +[2023-10-09 04:31:21,662][60143] Updated weights for policy 0, policy_version 9412 (0.0008) +[2023-10-09 04:31:22,023][60143] Updated weights for policy 0, policy_version 9422 (0.0011) +[2023-10-09 04:31:22,206][60144] Updated weights for policy 1, policy_version 9512 (0.0008) +[2023-10-09 04:31:22,396][60143] Updated weights for policy 0, policy_version 9432 (0.0009) +[2023-10-09 04:31:22,577][60144] Updated weights for policy 1, policy_version 9522 (0.0008) +[2023-10-09 04:31:22,936][60144] Updated weights for policy 1, policy_version 9532 (0.0009) +[2023-10-09 04:31:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 19431424. Throughput: 0: 1707.1, 1: 1714.9. Samples: 4868236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:26,053][59242] Avg episode reward: [(0, '14.750'), (1, '16.270')] +[2023-10-09 04:31:26,529][60143] Updated weights for policy 0, policy_version 9442 (0.0008) +[2023-10-09 04:31:26,895][60144] Updated weights for policy 1, policy_version 9542 (0.0007) +[2023-10-09 04:31:26,899][60143] Updated weights for policy 0, policy_version 9452 (0.0007) +[2023-10-09 04:31:27,261][60144] Updated weights for policy 1, policy_version 9552 (0.0007) +[2023-10-09 04:31:27,263][60143] Updated weights for policy 0, policy_version 9462 (0.0009) +[2023-10-09 04:31:27,618][60144] Updated weights for policy 1, policy_version 9562 (0.0007) +[2023-10-09 04:31:27,636][60143] Updated weights for policy 0, policy_version 9472 (0.0008) +[2023-10-09 04:31:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19496960. Throughput: 0: 1700.1, 1: 1738.3. Samples: 4889332. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:31,053][59242] Avg episode reward: [(0, '15.550'), (1, '16.860')] +[2023-10-09 04:31:31,069][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000009568_9797632.pth... +[2023-10-09 04:31:31,069][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000009472_9699328.pth... +[2023-10-09 04:31:31,104][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000007968_8159232.pth +[2023-10-09 04:31:31,104][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000007904_8093696.pth +[2023-10-09 04:31:31,108][60003] Saving new best policy, reward=16.860! +[2023-10-09 04:31:31,108][59934] Saving new best policy, reward=15.550! +[2023-10-09 04:31:31,636][60143] Updated weights for policy 0, policy_version 9482 (0.0008) +[2023-10-09 04:31:31,768][60144] Updated weights for policy 1, policy_version 9572 (0.0009) +[2023-10-09 04:31:32,009][60143] Updated weights for policy 0, policy_version 9492 (0.0008) +[2023-10-09 04:31:32,175][60144] Updated weights for policy 1, policy_version 9582 (0.0007) +[2023-10-09 04:31:32,380][60143] Updated weights for policy 0, policy_version 9502 (0.0007) +[2023-10-09 04:31:32,537][60144] Updated weights for policy 1, policy_version 9592 (0.0008) +[2023-10-09 04:31:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19562496. Throughput: 0: 1691.8, 1: 1702.7. Samples: 4898468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:36,052][59242] Avg episode reward: [(0, '15.760'), (1, '16.150')] +[2023-10-09 04:31:36,330][60143] Updated weights for policy 0, policy_version 9512 (0.0008) +[2023-10-09 04:31:36,359][60144] Updated weights for policy 1, policy_version 9602 (0.0008) +[2023-10-09 04:31:36,705][60143] Updated weights for policy 0, policy_version 9522 (0.0009) +[2023-10-09 04:31:36,725][60144] Updated weights for policy 1, policy_version 9612 (0.0007) +[2023-10-09 04:31:37,074][60143] Updated weights for policy 0, policy_version 9532 (0.0008) +[2023-10-09 04:31:37,094][60144] Updated weights for policy 1, policy_version 9622 (0.0007) +[2023-10-09 04:31:37,224][59934] Saving new best policy, reward=15.760! +[2023-10-09 04:31:37,461][60144] Updated weights for policy 1, policy_version 9632 (0.0008) +[2023-10-09 04:31:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19628032. Throughput: 0: 1699.9, 1: 1731.2. Samples: 4919460. Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-09 04:31:41,053][59242] Avg episode reward: [(0, '15.630'), (1, '15.980')] +[2023-10-09 04:31:41,174][60143] Updated weights for policy 0, policy_version 9542 (0.0009) +[2023-10-09 04:31:41,454][60144] Updated weights for policy 1, policy_version 9642 (0.0007) +[2023-10-09 04:31:41,545][60143] Updated weights for policy 0, policy_version 9552 (0.0007) +[2023-10-09 04:31:41,821][60144] Updated weights for policy 1, policy_version 9652 (0.0007) +[2023-10-09 04:31:41,913][60143] Updated weights for policy 0, policy_version 9562 (0.0008) +[2023-10-09 04:31:42,193][60144] Updated weights for policy 1, policy_version 9662 (0.0008) +[2023-10-09 04:31:45,981][60143] Updated weights for policy 0, policy_version 9572 (0.0007) +[2023-10-09 04:31:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19693568. Throughput: 0: 1700.3, 1: 1727.6. Samples: 4940424. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) +[2023-10-09 04:31:46,053][59242] Avg episode reward: [(0, '15.870'), (1, '16.360')] +[2023-10-09 04:31:46,149][60144] Updated weights for policy 1, policy_version 9672 (0.0007) +[2023-10-09 04:31:46,370][60143] Updated weights for policy 0, policy_version 9582 (0.0007) +[2023-10-09 04:31:46,510][60144] Updated weights for policy 1, policy_version 9682 (0.0007) +[2023-10-09 04:31:46,744][60143] Updated weights for policy 0, policy_version 9592 (0.0007) +[2023-10-09 04:31:46,876][60144] Updated weights for policy 1, policy_version 9692 (0.0007) +[2023-10-09 04:31:47,041][59934] Saving new best policy, reward=15.870! +[2023-10-09 04:31:50,854][60143] Updated weights for policy 0, policy_version 9602 (0.0008) +[2023-10-09 04:31:50,943][60144] Updated weights for policy 1, policy_version 9702 (0.0008) +[2023-10-09 04:31:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19759104. Throughput: 0: 1690.9, 1: 1716.6. Samples: 4949456. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:51,053][59242] Avg episode reward: [(0, '15.130'), (1, '17.160')] +[2023-10-09 04:31:51,224][60143] Updated weights for policy 0, policy_version 9612 (0.0009) +[2023-10-09 04:31:51,305][60144] Updated weights for policy 1, policy_version 9712 (0.0008) +[2023-10-09 04:31:51,588][60143] Updated weights for policy 0, policy_version 9622 (0.0008) +[2023-10-09 04:31:51,672][60144] Updated weights for policy 1, policy_version 9722 (0.0009) +[2023-10-09 04:31:51,888][60003] Saving new best policy, reward=17.160! 
+[2023-10-09 04:31:51,961][60143] Updated weights for policy 0, policy_version 9632 (0.0008) +[2023-10-09 04:31:55,558][60144] Updated weights for policy 1, policy_version 9732 (0.0009) +[2023-10-09 04:31:55,918][60144] Updated weights for policy 1, policy_version 9742 (0.0009) +[2023-10-09 04:31:56,043][60143] Updated weights for policy 0, policy_version 9642 (0.0008) +[2023-10-09 04:31:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19824640. Throughput: 0: 1692.5, 1: 1728.1. Samples: 4970570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:31:56,053][59242] Avg episode reward: [(0, '14.990'), (1, '17.050')] +[2023-10-09 04:31:56,285][60144] Updated weights for policy 1, policy_version 9752 (0.0007) +[2023-10-09 04:31:56,415][60143] Updated weights for policy 0, policy_version 9652 (0.0007) +[2023-10-09 04:31:56,797][60143] Updated weights for policy 0, policy_version 9662 (0.0009) +[2023-10-09 04:32:00,180][60144] Updated weights for policy 1, policy_version 9762 (0.0008) +[2023-10-09 04:32:00,549][60144] Updated weights for policy 1, policy_version 9772 (0.0007) +[2023-10-09 04:32:00,743][60143] Updated weights for policy 0, policy_version 9672 (0.0010) +[2023-10-09 04:32:00,915][60144] Updated weights for policy 1, policy_version 9782 (0.0008) +[2023-10-09 04:32:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19890176. Throughput: 0: 1691.0, 1: 1715.1. Samples: 4991118. Policy #0 lag: (min: 25.0, avg: 28.1, max: 57.0) +[2023-10-09 04:32:01,053][59242] Avg episode reward: [(0, '14.330'), (1, '17.750')] +[2023-10-09 04:32:01,116][60143] Updated weights for policy 0, policy_version 9682 (0.0010) +[2023-10-09 04:32:01,292][60144] Updated weights for policy 1, policy_version 9792 (0.0008) +[2023-10-09 04:32:01,292][60003] Saving new best policy, reward=17.750! 
+[2023-10-09 04:32:01,502][60143] Updated weights for policy 0, policy_version 9692 (0.0009) +[2023-10-09 04:32:05,339][60144] Updated weights for policy 1, policy_version 9802 (0.0010) +[2023-10-09 04:32:05,437][60143] Updated weights for policy 0, policy_version 9702 (0.0009) +[2023-10-09 04:32:05,709][60144] Updated weights for policy 1, policy_version 9812 (0.0009) +[2023-10-09 04:32:05,806][60143] Updated weights for policy 0, policy_version 9712 (0.0008) +[2023-10-09 04:32:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 19955712. Throughput: 0: 1696.4, 1: 1727.6. Samples: 5001070. Policy #0 lag: (min: 25.0, avg: 28.1, max: 57.0) +[2023-10-09 04:32:06,053][59242] Avg episode reward: [(0, '14.690'), (1, '18.330')] +[2023-10-09 04:32:06,072][60144] Updated weights for policy 1, policy_version 9822 (0.0008) +[2023-10-09 04:32:06,137][60003] Saving new best policy, reward=18.330! +[2023-10-09 04:32:06,176][60143] Updated weights for policy 0, policy_version 9722 (0.0008) +[2023-10-09 04:32:09,915][60144] Updated weights for policy 1, policy_version 9832 (0.0007) +[2023-10-09 04:32:10,272][60144] Updated weights for policy 1, policy_version 9842 (0.0008) +[2023-10-09 04:32:10,285][60143] Updated weights for policy 0, policy_version 9732 (0.0009) +[2023-10-09 04:32:10,642][60144] Updated weights for policy 1, policy_version 9852 (0.0009) +[2023-10-09 04:32:10,657][60143] Updated weights for policy 0, policy_version 9742 (0.0009) +[2023-10-09 04:32:11,024][60143] Updated weights for policy 0, policy_version 9752 (0.0010) +[2023-10-09 04:32:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 20054016. Throughput: 0: 1699.5, 1: 1725.4. Samples: 5022354. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:32:11,053][59242] Avg episode reward: [(0, '14.730'), (1, '19.090')] +[2023-10-09 04:32:11,054][60003] Saving new best policy, reward=19.090! 
+[2023-10-09 04:32:14,651][60144] Updated weights for policy 1, policy_version 9862 (0.0008) +[2023-10-09 04:32:14,884][60143] Updated weights for policy 0, policy_version 9762 (0.0010) +[2023-10-09 04:32:15,018][60144] Updated weights for policy 1, policy_version 9872 (0.0008) +[2023-10-09 04:32:15,255][60143] Updated weights for policy 0, policy_version 9772 (0.0009) +[2023-10-09 04:32:15,381][60144] Updated weights for policy 1, policy_version 9882 (0.0007) +[2023-10-09 04:32:15,633][60143] Updated weights for policy 0, policy_version 9782 (0.0007) +[2023-10-09 04:32:15,999][60143] Updated weights for policy 0, policy_version 9792 (0.0008) +[2023-10-09 04:32:16,052][59242] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20152320. Throughput: 0: 1691.9, 1: 1691.4. Samples: 5041580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:32:16,052][59242] Avg episode reward: [(0, '15.580'), (1, '18.800')] +[2023-10-09 04:32:19,479][60144] Updated weights for policy 1, policy_version 9892 (0.0008) +[2023-10-09 04:32:19,883][60144] Updated weights for policy 1, policy_version 9902 (0.0009) +[2023-10-09 04:32:19,981][60143] Updated weights for policy 0, policy_version 9802 (0.0007) +[2023-10-09 04:32:20,248][60144] Updated weights for policy 1, policy_version 9912 (0.0009) +[2023-10-09 04:32:20,347][60143] Updated weights for policy 0, policy_version 9812 (0.0008) +[2023-10-09 04:32:20,719][60143] Updated weights for policy 0, policy_version 9822 (0.0009) +[2023-10-09 04:32:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20217856. Throughput: 0: 1705.2, 1: 1725.6. Samples: 5052856. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:32:21,053][59242] Avg episode reward: [(0, '15.430'), (1, '18.740')] +[2023-10-09 04:32:24,133][60144] Updated weights for policy 1, policy_version 9922 (0.0009) +[2023-10-09 04:32:24,504][60144] Updated weights for policy 1, policy_version 9932 (0.0008) +[2023-10-09 04:32:24,835][60143] Updated weights for policy 0, policy_version 9832 (0.0009) +[2023-10-09 04:32:24,867][60144] Updated weights for policy 1, policy_version 9942 (0.0007) +[2023-10-09 04:32:25,204][60143] Updated weights for policy 0, policy_version 9842 (0.0008) +[2023-10-09 04:32:25,229][60144] Updated weights for policy 1, policy_version 9952 (0.0008) +[2023-10-09 04:32:25,566][60143] Updated weights for policy 0, policy_version 9852 (0.0008) +[2023-10-09 04:32:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20283392. Throughput: 0: 1709.4, 1: 1713.5. Samples: 5073492. Policy #0 lag: (min: 3.0, avg: 6.0, max: 35.0) +[2023-10-09 04:32:26,052][59242] Avg episode reward: [(0, '15.420'), (1, '18.830')] +[2023-10-09 04:32:29,376][60144] Updated weights for policy 1, policy_version 9962 (0.0007) +[2023-10-09 04:32:29,480][60143] Updated weights for policy 0, policy_version 9862 (0.0010) +[2023-10-09 04:32:29,742][60144] Updated weights for policy 1, policy_version 9972 (0.0008) +[2023-10-09 04:32:29,859][60143] Updated weights for policy 0, policy_version 9872 (0.0008) +[2023-10-09 04:32:30,106][60144] Updated weights for policy 1, policy_version 9982 (0.0009) +[2023-10-09 04:32:30,228][60143] Updated weights for policy 0, policy_version 9882 (0.0009) +[2023-10-09 04:32:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20348928. Throughput: 0: 1683.7, 1: 1694.7. Samples: 5092450. 
Policy #0 lag: (min: 3.0, avg: 6.0, max: 35.0) +[2023-10-09 04:32:31,053][59242] Avg episode reward: [(0, '15.020'), (1, '18.770')] +[2023-10-09 04:32:34,026][60143] Updated weights for policy 0, policy_version 9892 (0.0010) +[2023-10-09 04:32:34,218][60144] Updated weights for policy 1, policy_version 9992 (0.0007) +[2023-10-09 04:32:34,427][60143] Updated weights for policy 0, policy_version 9902 (0.0009) +[2023-10-09 04:32:34,593][60144] Updated weights for policy 1, policy_version 10002 (0.0007) +[2023-10-09 04:32:34,793][60143] Updated weights for policy 0, policy_version 9912 (0.0009) +[2023-10-09 04:32:34,958][60144] Updated weights for policy 1, policy_version 10012 (0.0008) +[2023-10-09 04:32:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20414464. Throughput: 0: 1720.6, 1: 1724.7. Samples: 5104496. Policy #0 lag: (min: 35.0, avg: 54.5, max: 56.0) +[2023-10-09 04:32:36,053][59242] Avg episode reward: [(0, '15.440'), (1, '18.600')] +[2023-10-09 04:32:38,898][60143] Updated weights for policy 0, policy_version 9922 (0.0007) +[2023-10-09 04:32:39,012][60144] Updated weights for policy 1, policy_version 10022 (0.0008) +[2023-10-09 04:32:39,268][60143] Updated weights for policy 0, policy_version 9932 (0.0008) +[2023-10-09 04:32:39,371][60144] Updated weights for policy 1, policy_version 10032 (0.0009) +[2023-10-09 04:32:39,633][60143] Updated weights for policy 0, policy_version 9942 (0.0008) +[2023-10-09 04:32:39,739][60144] Updated weights for policy 1, policy_version 10042 (0.0008) +[2023-10-09 04:32:40,003][60143] Updated weights for policy 0, policy_version 9952 (0.0009) +[2023-10-09 04:32:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 20480000. Throughput: 0: 1702.6, 1: 1700.9. Samples: 5123728. 
Policy #0 lag: (min: 35.0, avg: 54.5, max: 56.0) +[2023-10-09 04:32:41,053][59242] Avg episode reward: [(0, '16.680'), (1, '18.250')] +[2023-10-09 04:32:41,054][59934] Saving new best policy, reward=16.680! +[2023-10-09 04:32:43,511][60144] Updated weights for policy 1, policy_version 10052 (0.0008) +[2023-10-09 04:32:43,881][60144] Updated weights for policy 1, policy_version 10062 (0.0008) +[2023-10-09 04:32:43,890][60143] Updated weights for policy 0, policy_version 9962 (0.0009) +[2023-10-09 04:32:44,242][60144] Updated weights for policy 1, policy_version 10072 (0.0008) +[2023-10-09 04:32:44,262][60143] Updated weights for policy 0, policy_version 9972 (0.0008) +[2023-10-09 04:32:44,637][60143] Updated weights for policy 0, policy_version 9982 (0.0007) +[2023-10-09 04:32:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20545536. Throughput: 0: 1688.8, 1: 1709.4. Samples: 5144036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:32:46,052][59242] Avg episode reward: [(0, '15.990'), (1, '18.380')] +[2023-10-09 04:32:48,178][60144] Updated weights for policy 1, policy_version 10082 (0.0008) +[2023-10-09 04:32:48,550][60144] Updated weights for policy 1, policy_version 10092 (0.0009) +[2023-10-09 04:32:48,715][60143] Updated weights for policy 0, policy_version 9992 (0.0008) +[2023-10-09 04:32:48,918][60144] Updated weights for policy 1, policy_version 10102 (0.0008) +[2023-10-09 04:32:49,090][60143] Updated weights for policy 0, policy_version 10002 (0.0009) +[2023-10-09 04:32:49,283][60144] Updated weights for policy 1, policy_version 10112 (0.0008) +[2023-10-09 04:32:49,464][60143] Updated weights for policy 0, policy_version 10012 (0.0007) +[2023-10-09 04:32:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 20611072. Throughput: 0: 1714.0, 1: 1715.2. Samples: 5155384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:32:51,052][59242] Avg episode reward: [(0, '15.500'), (1, '17.110')] +[2023-10-09 04:32:53,060][60144] Updated weights for policy 1, policy_version 10122 (0.0008) +[2023-10-09 04:32:53,427][60144] Updated weights for policy 1, policy_version 10132 (0.0007) +[2023-10-09 04:32:53,530][60143] Updated weights for policy 0, policy_version 10022 (0.0007) +[2023-10-09 04:32:53,792][60144] Updated weights for policy 1, policy_version 10142 (0.0007) +[2023-10-09 04:32:53,894][60143] Updated weights for policy 0, policy_version 10032 (0.0007) +[2023-10-09 04:32:54,259][60143] Updated weights for policy 0, policy_version 10042 (0.0009) +[2023-10-09 04:32:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 20676608. Throughput: 0: 1687.6, 1: 1696.6. Samples: 5174646. Policy #0 lag: (min: 31.0, avg: 31.7, max: 48.0) +[2023-10-09 04:32:56,053][59242] Avg episode reward: [(0, '16.040'), (1, '17.740')] +[2023-10-09 04:32:57,786][60144] Updated weights for policy 1, policy_version 10152 (0.0007) +[2023-10-09 04:32:58,145][60144] Updated weights for policy 1, policy_version 10162 (0.0008) +[2023-10-09 04:32:58,297][60143] Updated weights for policy 0, policy_version 10052 (0.0009) +[2023-10-09 04:32:58,519][60144] Updated weights for policy 1, policy_version 10172 (0.0008) +[2023-10-09 04:32:58,663][60143] Updated weights for policy 0, policy_version 10062 (0.0009) +[2023-10-09 04:32:59,034][60143] Updated weights for policy 0, policy_version 10072 (0.0009) +[2023-10-09 04:33:01,052][59242] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 20742144. Throughput: 0: 1698.7, 1: 1728.5. Samples: 5195802. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 48.0) +[2023-10-09 04:33:01,053][59242] Avg episode reward: [(0, '15.990'), (1, '17.090')] +[2023-10-09 04:33:02,677][60144] Updated weights for policy 1, policy_version 10182 (0.0007) +[2023-10-09 04:33:02,984][60143] Updated weights for policy 0, policy_version 10082 (0.0009) +[2023-10-09 04:33:03,037][60144] Updated weights for policy 1, policy_version 10192 (0.0008) +[2023-10-09 04:33:03,344][60143] Updated weights for policy 0, policy_version 10092 (0.0008) +[2023-10-09 04:33:03,411][60144] Updated weights for policy 1, policy_version 10202 (0.0008) +[2023-10-09 04:33:03,717][60143] Updated weights for policy 0, policy_version 10102 (0.0010) +[2023-10-09 04:33:04,087][60143] Updated weights for policy 0, policy_version 10112 (0.0009) +[2023-10-09 04:33:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 20807680. Throughput: 0: 1702.4, 1: 1701.5. Samples: 5206032. Policy #0 lag: (min: 31.0, avg: 33.4, max: 62.0) +[2023-10-09 04:33:06,053][59242] Avg episode reward: [(0, '15.410'), (1, '17.390')] +[2023-10-09 04:33:07,488][60144] Updated weights for policy 1, policy_version 10212 (0.0009) +[2023-10-09 04:33:07,853][60144] Updated weights for policy 1, policy_version 10222 (0.0008) +[2023-10-09 04:33:07,907][60143] Updated weights for policy 0, policy_version 10122 (0.0007) +[2023-10-09 04:33:08,228][60144] Updated weights for policy 1, policy_version 10232 (0.0008) +[2023-10-09 04:33:08,273][60143] Updated weights for policy 0, policy_version 10132 (0.0009) +[2023-10-09 04:33:08,643][60143] Updated weights for policy 0, policy_version 10142 (0.0009) +[2023-10-09 04:33:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 20873216. Throughput: 0: 1688.3, 1: 1710.8. Samples: 5226454. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 62.0) +[2023-10-09 04:33:11,053][59242] Avg episode reward: [(0, '15.420'), (1, '17.540')] +[2023-10-09 04:33:12,192][60144] Updated weights for policy 1, policy_version 10242 (0.0007) +[2023-10-09 04:33:12,603][60144] Updated weights for policy 1, policy_version 10252 (0.0008) +[2023-10-09 04:33:12,780][60143] Updated weights for policy 0, policy_version 10152 (0.0008) +[2023-10-09 04:33:12,967][60144] Updated weights for policy 1, policy_version 10262 (0.0007) +[2023-10-09 04:33:13,154][60143] Updated weights for policy 0, policy_version 10162 (0.0007) +[2023-10-09 04:33:13,332][60144] Updated weights for policy 1, policy_version 10272 (0.0007) +[2023-10-09 04:33:13,522][60143] Updated weights for policy 0, policy_version 10172 (0.0007) +[2023-10-09 04:33:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 20938752. Throughput: 0: 1716.0, 1: 1734.5. Samples: 5247724. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 04:33:16,053][59242] Avg episode reward: [(0, '15.520'), (1, '17.720')] +[2023-10-09 04:33:17,075][60144] Updated weights for policy 1, policy_version 10282 (0.0008) +[2023-10-09 04:33:17,305][60143] Updated weights for policy 0, policy_version 10182 (0.0007) +[2023-10-09 04:33:17,454][60144] Updated weights for policy 1, policy_version 10292 (0.0007) +[2023-10-09 04:33:17,685][60143] Updated weights for policy 0, policy_version 10192 (0.0008) +[2023-10-09 04:33:17,824][60144] Updated weights for policy 1, policy_version 10302 (0.0007) +[2023-10-09 04:33:18,049][60143] Updated weights for policy 0, policy_version 10202 (0.0010) +[2023-10-09 04:33:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 21004288. Throughput: 0: 1685.6, 1: 1705.0. Samples: 5257074. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 04:33:21,053][59242] Avg episode reward: [(0, '15.530'), (1, '16.750')] +[2023-10-09 04:33:21,624][60144] Updated weights for policy 1, policy_version 10312 (0.0008) +[2023-10-09 04:33:21,982][60144] Updated weights for policy 1, policy_version 10322 (0.0010) +[2023-10-09 04:33:22,065][60143] Updated weights for policy 0, policy_version 10212 (0.0010) +[2023-10-09 04:33:22,345][60144] Updated weights for policy 1, policy_version 10332 (0.0007) +[2023-10-09 04:33:22,452][60143] Updated weights for policy 0, policy_version 10222 (0.0009) +[2023-10-09 04:33:22,828][60143] Updated weights for policy 0, policy_version 10232 (0.0010) +[2023-10-09 04:33:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 21069824. Throughput: 0: 1701.5, 1: 1730.6. Samples: 5278172. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:33:26,052][59242] Avg episode reward: [(0, '15.260'), (1, '16.400')] +[2023-10-09 04:33:26,219][60144] Updated weights for policy 1, policy_version 10342 (0.0008) +[2023-10-09 04:33:26,581][60144] Updated weights for policy 1, policy_version 10352 (0.0007) +[2023-10-09 04:33:26,791][60143] Updated weights for policy 0, policy_version 10242 (0.0009) +[2023-10-09 04:33:26,958][60144] Updated weights for policy 1, policy_version 10362 (0.0007) +[2023-10-09 04:33:27,154][60143] Updated weights for policy 0, policy_version 10252 (0.0008) +[2023-10-09 04:33:27,530][60143] Updated weights for policy 0, policy_version 10262 (0.0008) +[2023-10-09 04:33:27,893][60143] Updated weights for policy 0, policy_version 10272 (0.0009) +[2023-10-09 04:33:30,788][60144] Updated weights for policy 1, policy_version 10372 (0.0008) +[2023-10-09 04:33:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 21135360. Throughput: 0: 1718.1, 1: 1738.1. Samples: 5299564. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:33:31,053][59242] Avg episode reward: [(0, '15.030'), (1, '16.460')] +[2023-10-09 04:33:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000010272_10518528.pth... +[2023-10-09 04:33:31,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000008672_8880128.pth +[2023-10-09 04:33:31,152][60144] Updated weights for policy 1, policy_version 10382 (0.0008) +[2023-10-09 04:33:31,526][60144] Updated weights for policy 1, policy_version 10392 (0.0010) +[2023-10-09 04:33:31,817][60143] Updated weights for policy 0, policy_version 10282 (0.0009) +[2023-10-09 04:33:31,818][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000010400_10649600.pth... +[2023-10-09 04:33:31,849][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000008768_8978432.pth +[2023-10-09 04:33:32,196][60143] Updated weights for policy 0, policy_version 10292 (0.0007) +[2023-10-09 04:33:32,564][60143] Updated weights for policy 0, policy_version 10302 (0.0007) +[2023-10-09 04:33:35,711][60144] Updated weights for policy 1, policy_version 10402 (0.0007) +[2023-10-09 04:33:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 21200896. Throughput: 0: 1691.8, 1: 1721.7. Samples: 5308992. 
Policy #0 lag: (min: 19.0, avg: 22.4, max: 51.0) +[2023-10-09 04:33:36,053][59242] Avg episode reward: [(0, '15.130'), (1, '17.160')] +[2023-10-09 04:33:36,087][60144] Updated weights for policy 1, policy_version 10412 (0.0007) +[2023-10-09 04:33:36,366][60143] Updated weights for policy 0, policy_version 10312 (0.0008) +[2023-10-09 04:33:36,450][60144] Updated weights for policy 1, policy_version 10422 (0.0007) +[2023-10-09 04:33:36,734][60143] Updated weights for policy 0, policy_version 10322 (0.0007) +[2023-10-09 04:33:36,822][60144] Updated weights for policy 1, policy_version 10432 (0.0007) +[2023-10-09 04:33:37,101][60143] Updated weights for policy 0, policy_version 10332 (0.0008) +[2023-10-09 04:33:40,752][60144] Updated weights for policy 1, policy_version 10442 (0.0009) +[2023-10-09 04:33:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 21266432. Throughput: 0: 1724.3, 1: 1733.4. Samples: 5330240. Policy #0 lag: (min: 19.0, avg: 22.4, max: 51.0) +[2023-10-09 04:33:41,053][59242] Avg episode reward: [(0, '15.060'), (1, '17.470')] +[2023-10-09 04:33:41,087][60143] Updated weights for policy 0, policy_version 10342 (0.0008) +[2023-10-09 04:33:41,115][60144] Updated weights for policy 1, policy_version 10452 (0.0008) +[2023-10-09 04:33:41,471][60143] Updated weights for policy 0, policy_version 10352 (0.0007) +[2023-10-09 04:33:41,489][60144] Updated weights for policy 1, policy_version 10462 (0.0007) +[2023-10-09 04:33:41,839][60143] Updated weights for policy 0, policy_version 10362 (0.0008) +[2023-10-09 04:33:45,476][60144] Updated weights for policy 1, policy_version 10472 (0.0007) +[2023-10-09 04:33:45,846][60144] Updated weights for policy 1, policy_version 10482 (0.0008) +[2023-10-09 04:33:45,849][60143] Updated weights for policy 0, policy_version 10372 (0.0009) +[2023-10-09 04:33:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 21331968. 
Throughput: 0: 1729.6, 1: 1722.1. Samples: 5351126. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 04:33:46,053][59242] Avg episode reward: [(0, '15.500'), (1, '16.420')] +[2023-10-09 04:33:46,209][60144] Updated weights for policy 1, policy_version 10492 (0.0007) +[2023-10-09 04:33:46,219][60143] Updated weights for policy 0, policy_version 10382 (0.0008) +[2023-10-09 04:33:46,588][60143] Updated weights for policy 0, policy_version 10392 (0.0007) +[2023-10-09 04:33:50,244][60144] Updated weights for policy 1, policy_version 10502 (0.0009) +[2023-10-09 04:33:50,610][60144] Updated weights for policy 1, policy_version 10512 (0.0007) +[2023-10-09 04:33:50,612][60143] Updated weights for policy 0, policy_version 10402 (0.0007) +[2023-10-09 04:33:50,979][60144] Updated weights for policy 1, policy_version 10522 (0.0008) +[2023-10-09 04:33:50,981][60143] Updated weights for policy 0, policy_version 10412 (0.0008) +[2023-10-09 04:33:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 21397504. Throughput: 0: 1710.0, 1: 1729.4. Samples: 5360804. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 04:33:51,052][59242] Avg episode reward: [(0, '15.140'), (1, '15.820')] +[2023-10-09 04:33:51,354][60143] Updated weights for policy 0, policy_version 10422 (0.0008) +[2023-10-09 04:33:51,728][60143] Updated weights for policy 0, policy_version 10432 (0.0008) +[2023-10-09 04:33:54,835][60144] Updated weights for policy 1, policy_version 10532 (0.0009) +[2023-10-09 04:33:55,206][60144] Updated weights for policy 1, policy_version 10542 (0.0008) +[2023-10-09 04:33:55,581][60144] Updated weights for policy 1, policy_version 10552 (0.0010) +[2023-10-09 04:33:55,764][60143] Updated weights for policy 0, policy_version 10442 (0.0009) +[2023-10-09 04:33:56,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 21495808. Throughput: 0: 1726.1, 1: 1738.9. Samples: 5382378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:33:56,052][59242] Avg episode reward: [(0, '15.480'), (1, '16.600')] +[2023-10-09 04:33:56,130][60143] Updated weights for policy 0, policy_version 10452 (0.0010) +[2023-10-09 04:33:56,504][60143] Updated weights for policy 0, policy_version 10462 (0.0008) +[2023-10-09 04:33:59,530][60144] Updated weights for policy 1, policy_version 10562 (0.0007) +[2023-10-09 04:33:59,944][60144] Updated weights for policy 1, policy_version 10572 (0.0008) +[2023-10-09 04:34:00,310][60144] Updated weights for policy 1, policy_version 10582 (0.0008) +[2023-10-09 04:34:00,562][60143] Updated weights for policy 0, policy_version 10472 (0.0009) +[2023-10-09 04:34:00,680][60144] Updated weights for policy 1, policy_version 10592 (0.0009) +[2023-10-09 04:34:00,935][60143] Updated weights for policy 0, policy_version 10482 (0.0008) +[2023-10-09 04:34:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 21561344. Throughput: 0: 1720.7, 1: 1709.2. Samples: 5402070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:34:01,053][59242] Avg episode reward: [(0, '15.560'), (1, '16.080')] +[2023-10-09 04:34:01,307][60143] Updated weights for policy 0, policy_version 10492 (0.0009) +[2023-10-09 04:34:04,616][60144] Updated weights for policy 1, policy_version 10602 (0.0008) +[2023-10-09 04:34:04,987][60144] Updated weights for policy 1, policy_version 10612 (0.0007) +[2023-10-09 04:34:05,329][60143] Updated weights for policy 0, policy_version 10502 (0.0008) +[2023-10-09 04:34:05,350][60144] Updated weights for policy 1, policy_version 10622 (0.0008) +[2023-10-09 04:34:05,687][60143] Updated weights for policy 0, policy_version 10512 (0.0010) +[2023-10-09 04:34:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 21626880. Throughput: 0: 1724.6, 1: 1736.1. Samples: 5412808. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:34:06,053][59242] Avg episode reward: [(0, '15.610'), (1, '15.810')] +[2023-10-09 04:34:06,065][60143] Updated weights for policy 0, policy_version 10522 (0.0008) +[2023-10-09 04:34:09,339][60144] Updated weights for policy 1, policy_version 10632 (0.0008) +[2023-10-09 04:34:09,702][60144] Updated weights for policy 1, policy_version 10642 (0.0008) +[2023-10-09 04:34:10,075][60144] Updated weights for policy 1, policy_version 10652 (0.0009) +[2023-10-09 04:34:10,087][60143] Updated weights for policy 0, policy_version 10532 (0.0009) +[2023-10-09 04:34:10,491][60143] Updated weights for policy 0, policy_version 10542 (0.0010) +[2023-10-09 04:34:10,863][60143] Updated weights for policy 0, policy_version 10552 (0.0010) +[2023-10-09 04:34:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 21692416. Throughput: 0: 1730.4, 1: 1718.4. Samples: 5433370. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:34:11,053][59242] Avg episode reward: [(0, '15.230'), (1, '15.300')] +[2023-10-09 04:34:14,024][60144] Updated weights for policy 1, policy_version 10662 (0.0009) +[2023-10-09 04:34:14,389][60144] Updated weights for policy 1, policy_version 10672 (0.0010) +[2023-10-09 04:34:14,769][60144] Updated weights for policy 1, policy_version 10682 (0.0010) +[2023-10-09 04:34:14,890][60143] Updated weights for policy 0, policy_version 10562 (0.0009) +[2023-10-09 04:34:15,264][60143] Updated weights for policy 0, policy_version 10572 (0.0010) +[2023-10-09 04:34:15,626][60143] Updated weights for policy 0, policy_version 10582 (0.0008) +[2023-10-09 04:34:15,996][60143] Updated weights for policy 0, policy_version 10592 (0.0009) +[2023-10-09 04:34:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 21790720. Throughput: 0: 1708.6, 1: 1697.5. Samples: 5452840. 
Policy #0 lag: (min: 25.0, avg: 39.9, max: 57.0) +[2023-10-09 04:34:16,053][59242] Avg episode reward: [(0, '14.610'), (1, '16.340')] +[2023-10-09 04:34:18,720][60144] Updated weights for policy 1, policy_version 10692 (0.0009) +[2023-10-09 04:34:19,089][60144] Updated weights for policy 1, policy_version 10702 (0.0010) +[2023-10-09 04:34:19,459][60144] Updated weights for policy 1, policy_version 10712 (0.0009) +[2023-10-09 04:34:19,851][60143] Updated weights for policy 0, policy_version 10602 (0.0007) +[2023-10-09 04:34:20,208][60143] Updated weights for policy 0, policy_version 10612 (0.0009) +[2023-10-09 04:34:20,579][60143] Updated weights for policy 0, policy_version 10622 (0.0007) +[2023-10-09 04:34:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 21856256. Throughput: 0: 1721.7, 1: 1731.4. Samples: 5464384. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 04:34:21,053][59242] Avg episode reward: [(0, '14.210'), (1, '16.320')] +[2023-10-09 04:34:23,392][60144] Updated weights for policy 1, policy_version 10722 (0.0008) +[2023-10-09 04:34:23,756][60144] Updated weights for policy 1, policy_version 10732 (0.0010) +[2023-10-09 04:34:24,128][60144] Updated weights for policy 1, policy_version 10742 (0.0008) +[2023-10-09 04:34:24,497][60144] Updated weights for policy 1, policy_version 10752 (0.0007) +[2023-10-09 04:34:24,543][60143] Updated weights for policy 0, policy_version 10632 (0.0009) +[2023-10-09 04:34:24,911][60143] Updated weights for policy 0, policy_version 10642 (0.0011) +[2023-10-09 04:34:25,285][60143] Updated weights for policy 0, policy_version 10652 (0.0008) +[2023-10-09 04:34:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 21921792. Throughput: 0: 1712.6, 1: 1708.1. Samples: 5484170. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 04:34:26,053][59242] Avg episode reward: [(0, '13.830'), (1, '16.270')] +[2023-10-09 04:34:28,345][60144] Updated weights for policy 1, policy_version 10762 (0.0009) +[2023-10-09 04:34:28,700][60144] Updated weights for policy 1, policy_version 10772 (0.0009) +[2023-10-09 04:34:29,079][60144] Updated weights for policy 1, policy_version 10782 (0.0009) +[2023-10-09 04:34:29,274][60143] Updated weights for policy 0, policy_version 10662 (0.0011) +[2023-10-09 04:34:29,648][60143] Updated weights for policy 0, policy_version 10672 (0.0007) +[2023-10-09 04:34:30,026][60143] Updated weights for policy 0, policy_version 10682 (0.0008) +[2023-10-09 04:34:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 21987328. Throughput: 0: 1684.7, 1: 1718.1. Samples: 5504254. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 04:34:31,053][59242] Avg episode reward: [(0, '13.320'), (1, '16.840')] +[2023-10-09 04:34:33,091][60144] Updated weights for policy 1, policy_version 10792 (0.0009) +[2023-10-09 04:34:33,451][60144] Updated weights for policy 1, policy_version 10802 (0.0010) +[2023-10-09 04:34:33,814][60144] Updated weights for policy 1, policy_version 10812 (0.0010) +[2023-10-09 04:34:33,894][60143] Updated weights for policy 0, policy_version 10692 (0.0010) +[2023-10-09 04:34:34,261][60143] Updated weights for policy 0, policy_version 10702 (0.0009) +[2023-10-09 04:34:34,621][60143] Updated weights for policy 0, policy_version 10712 (0.0009) +[2023-10-09 04:34:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 22052864. Throughput: 0: 1718.6, 1: 1716.8. Samples: 5515396. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 04:34:36,053][59242] Avg episode reward: [(0, '13.890'), (1, '16.720')] +[2023-10-09 04:34:37,773][60144] Updated weights for policy 1, policy_version 10822 (0.0007) +[2023-10-09 04:34:38,132][60144] Updated weights for policy 1, policy_version 10832 (0.0010) +[2023-10-09 04:34:38,509][60144] Updated weights for policy 1, policy_version 10842 (0.0009) +[2023-10-09 04:34:38,789][60143] Updated weights for policy 0, policy_version 10722 (0.0010) +[2023-10-09 04:34:39,165][60143] Updated weights for policy 0, policy_version 10732 (0.0011) +[2023-10-09 04:34:39,537][60143] Updated weights for policy 0, policy_version 10742 (0.0011) +[2023-10-09 04:34:39,916][60143] Updated weights for policy 0, policy_version 10752 (0.0009) +[2023-10-09 04:34:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 22118400. Throughput: 0: 1695.8, 1: 1701.4. Samples: 5535250. Policy #0 lag: (min: 2.0, avg: 3.3, max: 26.0) +[2023-10-09 04:34:41,053][59242] Avg episode reward: [(0, '12.830'), (1, '17.040')] +[2023-10-09 04:34:42,364][60144] Updated weights for policy 1, policy_version 10852 (0.0007) +[2023-10-09 04:34:42,739][60144] Updated weights for policy 1, policy_version 10862 (0.0009) +[2023-10-09 04:34:43,109][60144] Updated weights for policy 1, policy_version 10872 (0.0010) +[2023-10-09 04:34:43,757][60143] Updated weights for policy 0, policy_version 10762 (0.0009) +[2023-10-09 04:34:44,125][60143] Updated weights for policy 0, policy_version 10772 (0.0009) +[2023-10-09 04:34:44,497][60143] Updated weights for policy 0, policy_version 10782 (0.0008) +[2023-10-09 04:34:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 22183936. Throughput: 0: 1689.6, 1: 1732.9. Samples: 5556082. 
Policy #0 lag: (min: 2.0, avg: 3.3, max: 26.0) +[2023-10-09 04:34:46,053][59242] Avg episode reward: [(0, '12.780'), (1, '17.150')] +[2023-10-09 04:34:47,189][60144] Updated weights for policy 1, policy_version 10882 (0.0009) +[2023-10-09 04:34:47,607][60144] Updated weights for policy 1, policy_version 10892 (0.0007) +[2023-10-09 04:34:47,975][60144] Updated weights for policy 1, policy_version 10902 (0.0007) +[2023-10-09 04:34:48,341][60144] Updated weights for policy 1, policy_version 10912 (0.0008) +[2023-10-09 04:34:48,615][60143] Updated weights for policy 0, policy_version 10792 (0.0007) +[2023-10-09 04:34:48,982][60143] Updated weights for policy 0, policy_version 10802 (0.0007) +[2023-10-09 04:34:49,347][60143] Updated weights for policy 0, policy_version 10812 (0.0009) +[2023-10-09 04:34:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 22249472. Throughput: 0: 1710.4, 1: 1701.4. Samples: 5566338. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:34:51,052][59242] Avg episode reward: [(0, '12.650'), (1, '16.800')] +[2023-10-09 04:34:52,314][60144] Updated weights for policy 1, policy_version 10922 (0.0007) +[2023-10-09 04:34:52,678][60144] Updated weights for policy 1, policy_version 10932 (0.0007) +[2023-10-09 04:34:53,042][60144] Updated weights for policy 1, policy_version 10942 (0.0007) +[2023-10-09 04:34:53,379][60143] Updated weights for policy 0, policy_version 10822 (0.0008) +[2023-10-09 04:34:53,753][60143] Updated weights for policy 0, policy_version 10832 (0.0007) +[2023-10-09 04:34:54,125][60143] Updated weights for policy 0, policy_version 10842 (0.0007) +[2023-10-09 04:34:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 22315008. Throughput: 0: 1684.9, 1: 1719.8. Samples: 5586580. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 04:34:56,053][59242] Avg episode reward: [(0, '12.390'), (1, '16.810')] +[2023-10-09 04:34:56,916][60144] Updated weights for policy 1, policy_version 10952 (0.0008) +[2023-10-09 04:34:57,299][60144] Updated weights for policy 1, policy_version 10962 (0.0007) +[2023-10-09 04:34:57,671][60144] Updated weights for policy 1, policy_version 10972 (0.0008) +[2023-10-09 04:34:57,949][60143] Updated weights for policy 0, policy_version 10852 (0.0007) +[2023-10-09 04:34:58,326][60143] Updated weights for policy 0, policy_version 10862 (0.0009) +[2023-10-09 04:34:58,686][60143] Updated weights for policy 0, policy_version 10872 (0.0008) +[2023-10-09 04:35:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 22380544. Throughput: 0: 1710.8, 1: 1733.2. Samples: 5607818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:35:01,052][59242] Avg episode reward: [(0, '12.550'), (1, '17.310')] +[2023-10-09 04:35:01,601][60144] Updated weights for policy 1, policy_version 10982 (0.0007) +[2023-10-09 04:35:01,971][60144] Updated weights for policy 1, policy_version 10992 (0.0007) +[2023-10-09 04:35:02,343][60144] Updated weights for policy 1, policy_version 11002 (0.0007) +[2023-10-09 04:35:02,526][60143] Updated weights for policy 0, policy_version 10882 (0.0007) +[2023-10-09 04:35:02,895][60143] Updated weights for policy 0, policy_version 10892 (0.0009) +[2023-10-09 04:35:03,271][60143] Updated weights for policy 0, policy_version 10902 (0.0011) +[2023-10-09 04:35:03,640][60143] Updated weights for policy 0, policy_version 10912 (0.0011) +[2023-10-09 04:35:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 22446080. Throughput: 0: 1698.7, 1: 1698.6. Samples: 5617262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:35:06,052][59242] Avg episode reward: [(0, '13.440'), (1, '16.960')] +[2023-10-09 04:35:06,401][60144] Updated weights for policy 1, policy_version 11012 (0.0009) +[2023-10-09 04:35:06,773][60144] Updated weights for policy 1, policy_version 11022 (0.0010) +[2023-10-09 04:35:07,155][60144] Updated weights for policy 1, policy_version 11032 (0.0008) +[2023-10-09 04:35:07,795][60143] Updated weights for policy 0, policy_version 10922 (0.0010) +[2023-10-09 04:35:08,170][60143] Updated weights for policy 0, policy_version 10932 (0.0009) +[2023-10-09 04:35:08,554][60143] Updated weights for policy 0, policy_version 10942 (0.0009) +[2023-10-09 04:35:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 22511616. Throughput: 0: 1694.2, 1: 1726.9. Samples: 5638116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:35:11,053][59242] Avg episode reward: [(0, '14.020'), (1, '17.540')] +[2023-10-09 04:35:11,131][60144] Updated weights for policy 1, policy_version 11042 (0.0009) +[2023-10-09 04:35:11,500][60144] Updated weights for policy 1, policy_version 11052 (0.0007) +[2023-10-09 04:35:11,865][60144] Updated weights for policy 1, policy_version 11062 (0.0008) +[2023-10-09 04:35:12,228][60144] Updated weights for policy 1, policy_version 11072 (0.0009) +[2023-10-09 04:35:12,364][60143] Updated weights for policy 0, policy_version 10952 (0.0007) +[2023-10-09 04:35:12,739][60143] Updated weights for policy 0, policy_version 10962 (0.0010) +[2023-10-09 04:35:13,115][60143] Updated weights for policy 0, policy_version 10972 (0.0010) +[2023-10-09 04:35:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 22577152. Throughput: 0: 1719.0, 1: 1730.4. Samples: 5659476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:35:16,053][59242] Avg episode reward: [(0, '13.820'), (1, '17.910')] +[2023-10-09 04:35:16,102][60144] Updated weights for policy 1, policy_version 11082 (0.0007) +[2023-10-09 04:35:16,472][60144] Updated weights for policy 1, policy_version 11092 (0.0007) +[2023-10-09 04:35:16,839][60144] Updated weights for policy 1, policy_version 11102 (0.0009) +[2023-10-09 04:35:17,101][60143] Updated weights for policy 0, policy_version 10982 (0.0008) +[2023-10-09 04:35:17,477][60143] Updated weights for policy 0, policy_version 10992 (0.0007) +[2023-10-09 04:35:17,858][60143] Updated weights for policy 0, policy_version 11002 (0.0007) +[2023-10-09 04:35:20,754][60144] Updated weights for policy 1, policy_version 11112 (0.0009) +[2023-10-09 04:35:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 22642688. Throughput: 0: 1687.9, 1: 1721.4. Samples: 5668814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:35:21,053][59242] Avg episode reward: [(0, '14.020'), (1, '16.900')] +[2023-10-09 04:35:21,129][60144] Updated weights for policy 1, policy_version 11122 (0.0007) +[2023-10-09 04:35:21,498][60144] Updated weights for policy 1, policy_version 11132 (0.0010) +[2023-10-09 04:35:21,921][60143] Updated weights for policy 0, policy_version 11012 (0.0009) +[2023-10-09 04:35:22,298][60143] Updated weights for policy 0, policy_version 11022 (0.0008) +[2023-10-09 04:35:22,677][60143] Updated weights for policy 0, policy_version 11032 (0.0009) +[2023-10-09 04:35:25,528][60144] Updated weights for policy 1, policy_version 11142 (0.0009) +[2023-10-09 04:35:25,894][60144] Updated weights for policy 1, policy_version 11152 (0.0009) +[2023-10-09 04:35:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 22708224. Throughput: 0: 1705.0, 1: 1730.7. Samples: 5689856. 
Policy #0 lag: (min: 26.0, avg: 31.6, max: 58.0) +[2023-10-09 04:35:26,052][59242] Avg episode reward: [(0, '14.450'), (1, '16.810')] +[2023-10-09 04:35:26,262][60144] Updated weights for policy 1, policy_version 11162 (0.0010) +[2023-10-09 04:35:26,626][60143] Updated weights for policy 0, policy_version 11042 (0.0009) +[2023-10-09 04:35:26,997][60143] Updated weights for policy 0, policy_version 11052 (0.0008) +[2023-10-09 04:35:27,364][60143] Updated weights for policy 0, policy_version 11062 (0.0007) +[2023-10-09 04:35:27,732][60143] Updated weights for policy 0, policy_version 11072 (0.0008) +[2023-10-09 04:35:30,267][60144] Updated weights for policy 1, policy_version 11172 (0.0009) +[2023-10-09 04:35:30,640][60144] Updated weights for policy 1, policy_version 11182 (0.0011) +[2023-10-09 04:35:31,001][60144] Updated weights for policy 1, policy_version 11192 (0.0011) +[2023-10-09 04:35:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 22773760. Throughput: 0: 1718.7, 1: 1710.4. Samples: 5710392. Policy #0 lag: (min: 26.0, avg: 31.6, max: 58.0) +[2023-10-09 04:35:31,053][59242] Avg episode reward: [(0, '15.070'), (1, '17.340')] +[2023-10-09 04:35:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000011072_11337728.pth... +[2023-10-09 04:35:31,092][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000009472_9699328.pth +[2023-10-09 04:35:31,292][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000011200_11468800.pth... 
+[2023-10-09 04:35:31,321][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000009568_9797632.pth +[2023-10-09 04:35:31,869][60143] Updated weights for policy 0, policy_version 11082 (0.0011) +[2023-10-09 04:35:32,243][60143] Updated weights for policy 0, policy_version 11092 (0.0010) +[2023-10-09 04:35:32,619][60143] Updated weights for policy 0, policy_version 11102 (0.0008) +[2023-10-09 04:35:35,202][60144] Updated weights for policy 1, policy_version 11202 (0.0011) +[2023-10-09 04:35:35,599][60144] Updated weights for policy 1, policy_version 11212 (0.0009) +[2023-10-09 04:35:35,972][60144] Updated weights for policy 1, policy_version 11222 (0.0008) +[2023-10-09 04:35:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 22839296. Throughput: 0: 1692.1, 1: 1724.0. Samples: 5720062. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) +[2023-10-09 04:35:36,053][59242] Avg episode reward: [(0, '14.970'), (1, '17.160')] +[2023-10-09 04:35:36,338][60144] Updated weights for policy 1, policy_version 11232 (0.0007) +[2023-10-09 04:35:36,769][60143] Updated weights for policy 0, policy_version 11112 (0.0009) +[2023-10-09 04:35:37,136][60143] Updated weights for policy 0, policy_version 11122 (0.0010) +[2023-10-09 04:35:37,510][60143] Updated weights for policy 0, policy_version 11132 (0.0009) +[2023-10-09 04:35:40,255][60144] Updated weights for policy 1, policy_version 11242 (0.0009) +[2023-10-09 04:35:40,619][60144] Updated weights for policy 1, policy_version 11252 (0.0009) +[2023-10-09 04:35:40,983][60144] Updated weights for policy 1, policy_version 11262 (0.0008) +[2023-10-09 04:35:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 22904832. Throughput: 0: 1712.4, 1: 1717.5. Samples: 5740926. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) +[2023-10-09 04:35:41,053][59242] Avg episode reward: [(0, '15.490'), (1, '17.520')] +[2023-10-09 04:35:41,429][60143] Updated weights for policy 0, policy_version 11142 (0.0008) +[2023-10-09 04:35:41,801][60143] Updated weights for policy 0, policy_version 11152 (0.0008) +[2023-10-09 04:35:42,174][60143] Updated weights for policy 0, policy_version 11162 (0.0010) +[2023-10-09 04:35:44,816][60144] Updated weights for policy 1, policy_version 11272 (0.0009) +[2023-10-09 04:35:45,185][60144] Updated weights for policy 1, policy_version 11282 (0.0007) +[2023-10-09 04:35:45,542][60144] Updated weights for policy 1, policy_version 11292 (0.0008) +[2023-10-09 04:35:45,985][60143] Updated weights for policy 0, policy_version 11172 (0.0008) +[2023-10-09 04:35:46,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 23003136. Throughput: 0: 1711.7, 1: 1700.3. Samples: 5761358. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) +[2023-10-09 04:35:46,052][59242] Avg episode reward: [(0, '16.270'), (1, '17.440')] +[2023-10-09 04:35:46,368][60143] Updated weights for policy 0, policy_version 11182 (0.0009) +[2023-10-09 04:35:46,747][60143] Updated weights for policy 0, policy_version 11192 (0.0009) +[2023-10-09 04:35:49,427][60144] Updated weights for policy 1, policy_version 11302 (0.0008) +[2023-10-09 04:35:49,789][60144] Updated weights for policy 1, policy_version 11312 (0.0008) +[2023-10-09 04:35:50,153][60144] Updated weights for policy 1, policy_version 11322 (0.0007) +[2023-10-09 04:35:50,776][60143] Updated weights for policy 0, policy_version 11202 (0.0009) +[2023-10-09 04:35:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 23068672. Throughput: 0: 1706.6, 1: 1727.0. Samples: 5771774. 
Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) +[2023-10-09 04:35:51,052][59242] Avg episode reward: [(0, '16.510'), (1, '18.090')] +[2023-10-09 04:35:51,152][60143] Updated weights for policy 0, policy_version 11212 (0.0010) +[2023-10-09 04:35:51,513][60143] Updated weights for policy 0, policy_version 11222 (0.0010) +[2023-10-09 04:35:51,888][60143] Updated weights for policy 0, policy_version 11232 (0.0008) +[2023-10-09 04:35:54,057][60144] Updated weights for policy 1, policy_version 11332 (0.0008) +[2023-10-09 04:35:54,425][60144] Updated weights for policy 1, policy_version 11342 (0.0008) +[2023-10-09 04:35:54,794][60144] Updated weights for policy 1, policy_version 11352 (0.0008) +[2023-10-09 04:35:56,035][60143] Updated weights for policy 0, policy_version 11242 (0.0009) +[2023-10-09 04:35:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 23134208. Throughput: 0: 1714.9, 1: 1714.2. Samples: 5792424. Policy #0 lag: (min: 29.0, avg: 36.0, max: 61.0) +[2023-10-09 04:35:56,052][59242] Avg episode reward: [(0, '17.090'), (1, '17.960')] +[2023-10-09 04:35:56,396][60143] Updated weights for policy 0, policy_version 11252 (0.0008) +[2023-10-09 04:35:56,783][60143] Updated weights for policy 0, policy_version 11262 (0.0008) +[2023-10-09 04:35:56,850][59934] Saving new best policy, reward=17.090! 
+[2023-10-09 04:35:58,569][60144] Updated weights for policy 1, policy_version 11362 (0.0008) +[2023-10-09 04:35:58,940][60144] Updated weights for policy 1, policy_version 11372 (0.0010) +[2023-10-09 04:35:59,311][60144] Updated weights for policy 1, policy_version 11382 (0.0007) +[2023-10-09 04:35:59,677][60144] Updated weights for policy 1, policy_version 11392 (0.0009) +[2023-10-09 04:36:00,613][60143] Updated weights for policy 0, policy_version 11272 (0.0009) +[2023-10-09 04:36:00,979][60143] Updated weights for policy 0, policy_version 11282 (0.0009) +[2023-10-09 04:36:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 23199744. Throughput: 0: 1708.6, 1: 1703.6. Samples: 5813028. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:36:01,053][59242] Avg episode reward: [(0, '17.090'), (1, '17.810')] +[2023-10-09 04:36:01,361][60143] Updated weights for policy 0, policy_version 11292 (0.0009) +[2023-10-09 04:36:03,564][60144] Updated weights for policy 1, policy_version 11402 (0.0007) +[2023-10-09 04:36:03,927][60144] Updated weights for policy 1, policy_version 11412 (0.0009) +[2023-10-09 04:36:04,297][60144] Updated weights for policy 1, policy_version 11422 (0.0008) +[2023-10-09 04:36:05,452][60143] Updated weights for policy 0, policy_version 11302 (0.0009) +[2023-10-09 04:36:05,823][60143] Updated weights for policy 0, policy_version 11312 (0.0010) +[2023-10-09 04:36:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 23265280. Throughput: 0: 1710.1, 1: 1728.8. Samples: 5823566. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:36:06,053][59242] Avg episode reward: [(0, '16.210'), (1, '16.950')] +[2023-10-09 04:36:06,191][60143] Updated weights for policy 0, policy_version 11322 (0.0009) +[2023-10-09 04:36:08,262][60144] Updated weights for policy 1, policy_version 11432 (0.0008) +[2023-10-09 04:36:08,631][60144] Updated weights for policy 1, policy_version 11442 (0.0010) +[2023-10-09 04:36:09,003][60144] Updated weights for policy 1, policy_version 11452 (0.0011) +[2023-10-09 04:36:10,250][60143] Updated weights for policy 0, policy_version 11332 (0.0008) +[2023-10-09 04:36:10,628][60143] Updated weights for policy 0, policy_version 11342 (0.0010) +[2023-10-09 04:36:10,997][60143] Updated weights for policy 0, policy_version 11352 (0.0010) +[2023-10-09 04:36:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 23330816. Throughput: 0: 1713.5, 1: 1708.9. Samples: 5843864. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 04:36:11,052][59242] Avg episode reward: [(0, '15.790'), (1, '17.150')] +[2023-10-09 04:36:12,762][60144] Updated weights for policy 1, policy_version 11462 (0.0008) +[2023-10-09 04:36:13,122][60144] Updated weights for policy 1, policy_version 11472 (0.0008) +[2023-10-09 04:36:13,480][60144] Updated weights for policy 1, policy_version 11482 (0.0008) +[2023-10-09 04:36:14,953][60143] Updated weights for policy 0, policy_version 11362 (0.0008) +[2023-10-09 04:36:15,319][60143] Updated weights for policy 0, policy_version 11372 (0.0009) +[2023-10-09 04:36:15,686][60143] Updated weights for policy 0, policy_version 11382 (0.0008) +[2023-10-09 04:36:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 23396352. Throughput: 0: 1694.8, 1: 1730.4. Samples: 5864528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:16,053][59242] Avg episode reward: [(0, '16.490'), (1, '17.160')] +[2023-10-09 04:36:16,060][60143] Updated weights for policy 0, policy_version 11392 (0.0008) +[2023-10-09 04:36:17,308][60144] Updated weights for policy 1, policy_version 11492 (0.0009) +[2023-10-09 04:36:17,682][60144] Updated weights for policy 1, policy_version 11502 (0.0009) +[2023-10-09 04:36:18,043][60144] Updated weights for policy 1, policy_version 11512 (0.0010) +[2023-10-09 04:36:19,907][60143] Updated weights for policy 0, policy_version 11402 (0.0009) +[2023-10-09 04:36:20,279][60143] Updated weights for policy 0, policy_version 11412 (0.0009) +[2023-10-09 04:36:20,646][60143] Updated weights for policy 0, policy_version 11422 (0.0008) +[2023-10-09 04:36:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 23494656. Throughput: 0: 1712.8, 1: 1722.1. Samples: 5874634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:21,052][59242] Avg episode reward: [(0, '16.340'), (1, '16.790')] +[2023-10-09 04:36:22,045][60144] Updated weights for policy 1, policy_version 11522 (0.0009) +[2023-10-09 04:36:22,458][60144] Updated weights for policy 1, policy_version 11532 (0.0009) +[2023-10-09 04:36:22,815][60144] Updated weights for policy 1, policy_version 11542 (0.0010) +[2023-10-09 04:36:23,177][60144] Updated weights for policy 1, policy_version 11552 (0.0009) +[2023-10-09 04:36:24,557][60143] Updated weights for policy 0, policy_version 11432 (0.0008) +[2023-10-09 04:36:24,941][60143] Updated weights for policy 0, policy_version 11442 (0.0010) +[2023-10-09 04:36:25,309][60143] Updated weights for policy 0, policy_version 11452 (0.0008) +[2023-10-09 04:36:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 23560192. Throughput: 0: 1716.1, 1: 1724.0. Samples: 5895734. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-09 04:36:26,053][59242] Avg episode reward: [(0, '15.670'), (1, '16.450')] +[2023-10-09 04:36:27,001][60144] Updated weights for policy 1, policy_version 11562 (0.0008) +[2023-10-09 04:36:27,371][60144] Updated weights for policy 1, policy_version 11572 (0.0008) +[2023-10-09 04:36:27,738][60144] Updated weights for policy 1, policy_version 11582 (0.0008) +[2023-10-09 04:36:29,477][60143] Updated weights for policy 0, policy_version 11462 (0.0009) +[2023-10-09 04:36:29,855][60143] Updated weights for policy 0, policy_version 11472 (0.0008) +[2023-10-09 04:36:30,232][60143] Updated weights for policy 0, policy_version 11482 (0.0008) +[2023-10-09 04:36:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 23625728. Throughput: 0: 1691.1, 1: 1755.2. Samples: 5916446. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-09 04:36:31,053][59242] Avg episode reward: [(0, '15.800'), (1, '16.080')] +[2023-10-09 04:36:31,701][60144] Updated weights for policy 1, policy_version 11592 (0.0009) +[2023-10-09 04:36:32,066][60144] Updated weights for policy 1, policy_version 11602 (0.0007) +[2023-10-09 04:36:32,434][60144] Updated weights for policy 1, policy_version 11612 (0.0008) +[2023-10-09 04:36:34,169][60143] Updated weights for policy 0, policy_version 11492 (0.0008) +[2023-10-09 04:36:34,552][60143] Updated weights for policy 0, policy_version 11502 (0.0009) +[2023-10-09 04:36:34,913][60143] Updated weights for policy 0, policy_version 11512 (0.0010) +[2023-10-09 04:36:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 23691264. Throughput: 0: 1725.7, 1: 1725.8. Samples: 5927094. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-09 04:36:36,053][59242] Avg episode reward: [(0, '15.720'), (1, '15.550')] +[2023-10-09 04:36:36,303][60144] Updated weights for policy 1, policy_version 11622 (0.0008) +[2023-10-09 04:36:36,669][60144] Updated weights for policy 1, policy_version 11632 (0.0007) +[2023-10-09 04:36:37,036][60144] Updated weights for policy 1, policy_version 11642 (0.0007) +[2023-10-09 04:36:38,876][60143] Updated weights for policy 0, policy_version 11522 (0.0011) +[2023-10-09 04:36:39,248][60143] Updated weights for policy 0, policy_version 11532 (0.0007) +[2023-10-09 04:36:39,615][60143] Updated weights for policy 0, policy_version 11542 (0.0007) +[2023-10-09 04:36:39,986][60143] Updated weights for policy 0, policy_version 11552 (0.0010) +[2023-10-09 04:36:40,904][60144] Updated weights for policy 1, policy_version 11652 (0.0007) +[2023-10-09 04:36:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 23756800. Throughput: 0: 1706.2, 1: 1746.0. Samples: 5947776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:41,052][59242] Avg episode reward: [(0, '16.010'), (1, '14.900')] +[2023-10-09 04:36:41,265][60144] Updated weights for policy 1, policy_version 11662 (0.0008) +[2023-10-09 04:36:41,630][60144] Updated weights for policy 1, policy_version 11672 (0.0008) +[2023-10-09 04:36:43,926][60143] Updated weights for policy 0, policy_version 11562 (0.0008) +[2023-10-09 04:36:44,296][60143] Updated weights for policy 0, policy_version 11572 (0.0010) +[2023-10-09 04:36:44,663][60143] Updated weights for policy 0, policy_version 11582 (0.0007) +[2023-10-09 04:36:45,506][60144] Updated weights for policy 1, policy_version 11682 (0.0009) +[2023-10-09 04:36:45,882][60144] Updated weights for policy 1, policy_version 11692 (0.0010) +[2023-10-09 04:36:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 23822336. 
Throughput: 0: 1697.6, 1: 1755.6. Samples: 5968422. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:46,053][59242] Avg episode reward: [(0, '15.420'), (1, '15.310')] +[2023-10-09 04:36:46,239][60144] Updated weights for policy 1, policy_version 11702 (0.0009) +[2023-10-09 04:36:46,603][60144] Updated weights for policy 1, policy_version 11712 (0.0010) +[2023-10-09 04:36:48,549][60143] Updated weights for policy 0, policy_version 11592 (0.0007) +[2023-10-09 04:36:48,924][60143] Updated weights for policy 0, policy_version 11602 (0.0008) +[2023-10-09 04:36:49,303][60143] Updated weights for policy 0, policy_version 11612 (0.0011) +[2023-10-09 04:36:50,681][60144] Updated weights for policy 1, policy_version 11722 (0.0009) +[2023-10-09 04:36:51,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 23887872. Throughput: 0: 1717.8, 1: 1732.0. Samples: 5978808. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:51,053][59242] Avg episode reward: [(0, '15.610'), (1, '15.500')] +[2023-10-09 04:36:51,058][60144] Updated weights for policy 1, policy_version 11732 (0.0008) +[2023-10-09 04:36:51,424][60144] Updated weights for policy 1, policy_version 11742 (0.0007) +[2023-10-09 04:36:53,249][60143] Updated weights for policy 0, policy_version 11622 (0.0009) +[2023-10-09 04:36:53,620][60143] Updated weights for policy 0, policy_version 11632 (0.0007) +[2023-10-09 04:36:53,994][60143] Updated weights for policy 0, policy_version 11642 (0.0007) +[2023-10-09 04:36:55,441][60144] Updated weights for policy 1, policy_version 11752 (0.0007) +[2023-10-09 04:36:55,809][60144] Updated weights for policy 1, policy_version 11762 (0.0008) +[2023-10-09 04:36:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 23953408. Throughput: 0: 1699.2, 1: 1753.7. Samples: 5999248. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:36:56,053][59242] Avg episode reward: [(0, '15.470'), (1, '15.100')] +[2023-10-09 04:36:56,177][60144] Updated weights for policy 1, policy_version 11772 (0.0008) +[2023-10-09 04:36:57,884][60143] Updated weights for policy 0, policy_version 11652 (0.0008) +[2023-10-09 04:36:58,246][60143] Updated weights for policy 0, policy_version 11662 (0.0009) +[2023-10-09 04:36:58,627][60143] Updated weights for policy 0, policy_version 11672 (0.0009) +[2023-10-09 04:37:00,074][60144] Updated weights for policy 1, policy_version 11782 (0.0008) +[2023-10-09 04:37:00,444][60144] Updated weights for policy 1, policy_version 11792 (0.0008) +[2023-10-09 04:37:00,817][60144] Updated weights for policy 1, policy_version 11802 (0.0007) +[2023-10-09 04:37:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 24051712. Throughput: 0: 1719.1, 1: 1733.1. Samples: 6019874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:37:01,053][59242] Avg episode reward: [(0, '16.440'), (1, '14.580')] +[2023-10-09 04:37:02,597][60143] Updated weights for policy 0, policy_version 11682 (0.0008) +[2023-10-09 04:37:02,981][60143] Updated weights for policy 0, policy_version 11692 (0.0008) +[2023-10-09 04:37:03,343][60143] Updated weights for policy 0, policy_version 11702 (0.0007) +[2023-10-09 04:37:03,724][60143] Updated weights for policy 0, policy_version 11712 (0.0008) +[2023-10-09 04:37:04,708][60144] Updated weights for policy 1, policy_version 11812 (0.0009) +[2023-10-09 04:37:05,070][60144] Updated weights for policy 1, policy_version 11822 (0.0011) +[2023-10-09 04:37:05,435][60144] Updated weights for policy 1, policy_version 11832 (0.0009) +[2023-10-09 04:37:06,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 24117248. Throughput: 0: 1705.2, 1: 1750.2. Samples: 6030128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:37:06,053][59242] Avg episode reward: [(0, '17.060'), (1, '15.150')] +[2023-10-09 04:37:07,750][60143] Updated weights for policy 0, policy_version 11722 (0.0010) +[2023-10-09 04:37:08,116][60143] Updated weights for policy 0, policy_version 11732 (0.0009) +[2023-10-09 04:37:08,493][60143] Updated weights for policy 0, policy_version 11742 (0.0008) +[2023-10-09 04:37:09,388][60144] Updated weights for policy 1, policy_version 11842 (0.0010) +[2023-10-09 04:37:09,757][60144] Updated weights for policy 1, policy_version 11852 (0.0010) +[2023-10-09 04:37:10,124][60144] Updated weights for policy 1, policy_version 11862 (0.0009) +[2023-10-09 04:37:10,488][60144] Updated weights for policy 1, policy_version 11872 (0.0007) +[2023-10-09 04:37:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 24182784. Throughput: 0: 1697.6, 1: 1747.6. Samples: 6050772. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-09 04:37:11,053][59242] Avg episode reward: [(0, '17.710'), (1, '15.140')] +[2023-10-09 04:37:11,054][59934] Saving new best policy, reward=17.710! +[2023-10-09 04:37:12,432][60143] Updated weights for policy 0, policy_version 11752 (0.0010) +[2023-10-09 04:37:12,802][60143] Updated weights for policy 0, policy_version 11762 (0.0009) +[2023-10-09 04:37:13,180][60143] Updated weights for policy 0, policy_version 11772 (0.0011) +[2023-10-09 04:37:14,554][60144] Updated weights for policy 1, policy_version 11882 (0.0008) +[2023-10-09 04:37:14,924][60144] Updated weights for policy 1, policy_version 11892 (0.0007) +[2023-10-09 04:37:15,285][60144] Updated weights for policy 1, policy_version 11902 (0.0007) +[2023-10-09 04:37:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 24248320. Throughput: 0: 1718.4, 1: 1713.2. Samples: 6070868. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-09 04:37:16,053][59242] Avg episode reward: [(0, '17.680'), (1, '15.810')] +[2023-10-09 04:37:17,150][60143] Updated weights for policy 0, policy_version 11782 (0.0008) +[2023-10-09 04:37:17,524][60143] Updated weights for policy 0, policy_version 11792 (0.0008) +[2023-10-09 04:37:17,902][60143] Updated weights for policy 0, policy_version 11802 (0.0008) +[2023-10-09 04:37:19,198][60144] Updated weights for policy 1, policy_version 11912 (0.0007) +[2023-10-09 04:37:19,564][60144] Updated weights for policy 1, policy_version 11922 (0.0007) +[2023-10-09 04:37:19,937][60144] Updated weights for policy 1, policy_version 11932 (0.0008) +[2023-10-09 04:37:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24313856. Throughput: 0: 1685.9, 1: 1748.2. Samples: 6081628. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-09 04:37:21,053][59242] Avg episode reward: [(0, '17.970'), (1, '15.660')] +[2023-10-09 04:37:21,054][59934] Saving new best policy, reward=17.970! +[2023-10-09 04:37:21,910][60143] Updated weights for policy 0, policy_version 11812 (0.0009) +[2023-10-09 04:37:22,302][60143] Updated weights for policy 0, policy_version 11822 (0.0008) +[2023-10-09 04:37:22,674][60143] Updated weights for policy 0, policy_version 11832 (0.0008) +[2023-10-09 04:37:23,849][60144] Updated weights for policy 1, policy_version 11942 (0.0011) +[2023-10-09 04:37:24,213][60144] Updated weights for policy 1, policy_version 11952 (0.0009) +[2023-10-09 04:37:24,575][60144] Updated weights for policy 1, policy_version 11962 (0.0010) +[2023-10-09 04:37:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24379392. Throughput: 0: 1706.5, 1: 1716.0. Samples: 6101790. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 04:37:26,053][59242] Avg episode reward: [(0, '18.550'), (1, '17.300')] +[2023-10-09 04:37:26,055][59934] Saving new best policy, reward=18.550! +[2023-10-09 04:37:26,638][60143] Updated weights for policy 0, policy_version 11842 (0.0009) +[2023-10-09 04:37:27,013][60143] Updated weights for policy 0, policy_version 11852 (0.0009) +[2023-10-09 04:37:27,385][60143] Updated weights for policy 0, policy_version 11862 (0.0008) +[2023-10-09 04:37:27,754][60143] Updated weights for policy 0, policy_version 11872 (0.0007) +[2023-10-09 04:37:28,668][60144] Updated weights for policy 1, policy_version 11972 (0.0009) +[2023-10-09 04:37:29,031][60144] Updated weights for policy 1, policy_version 11982 (0.0008) +[2023-10-09 04:37:29,399][60144] Updated weights for policy 1, policy_version 11992 (0.0009) +[2023-10-09 04:37:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24444928. Throughput: 0: 1722.8, 1: 1706.4. Samples: 6122738. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 04:37:31,053][59242] Avg episode reward: [(0, '17.730'), (1, '17.070')] +[2023-10-09 04:37:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000011872_12156928.pth... +[2023-10-09 04:37:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000012000_12288000.pth... 
+[2023-10-09 04:37:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000010400_10649600.pth +[2023-10-09 04:37:31,108][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000010272_10518528.pth +[2023-10-09 04:37:31,921][60143] Updated weights for policy 0, policy_version 11882 (0.0007) +[2023-10-09 04:37:32,296][60143] Updated weights for policy 0, policy_version 11892 (0.0008) +[2023-10-09 04:37:32,659][60143] Updated weights for policy 0, policy_version 11902 (0.0011) +[2023-10-09 04:37:33,351][60144] Updated weights for policy 1, policy_version 12002 (0.0007) +[2023-10-09 04:37:33,709][60144] Updated weights for policy 1, policy_version 12012 (0.0010) +[2023-10-09 04:37:34,083][60144] Updated weights for policy 1, policy_version 12022 (0.0010) +[2023-10-09 04:37:34,438][60144] Updated weights for policy 1, policy_version 12032 (0.0008) +[2023-10-09 04:37:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24510464. Throughput: 0: 1698.5, 1: 1730.5. Samples: 6133108. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 04:37:36,053][59242] Avg episode reward: [(0, '17.850'), (1, '17.190')] +[2023-10-09 04:37:36,525][60143] Updated weights for policy 0, policy_version 11912 (0.0008) +[2023-10-09 04:37:36,896][60143] Updated weights for policy 0, policy_version 11922 (0.0008) +[2023-10-09 04:37:37,265][60143] Updated weights for policy 0, policy_version 11932 (0.0010) +[2023-10-09 04:37:38,330][60144] Updated weights for policy 1, policy_version 12042 (0.0009) +[2023-10-09 04:37:38,700][60144] Updated weights for policy 1, policy_version 12052 (0.0009) +[2023-10-09 04:37:39,069][60144] Updated weights for policy 1, policy_version 12062 (0.0008) +[2023-10-09 04:37:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24576000. Throughput: 0: 1721.0, 1: 1702.9. Samples: 6153324. 
Policy #0 lag: (min: 1.0, avg: 19.8, max: 33.0) +[2023-10-09 04:37:41,053][59242] Avg episode reward: [(0, '16.150'), (1, '17.980')] +[2023-10-09 04:37:41,315][60143] Updated weights for policy 0, policy_version 11942 (0.0010) +[2023-10-09 04:37:41,700][60143] Updated weights for policy 0, policy_version 11952 (0.0010) +[2023-10-09 04:37:42,062][60143] Updated weights for policy 0, policy_version 11962 (0.0011) +[2023-10-09 04:37:42,962][60144] Updated weights for policy 1, policy_version 12072 (0.0008) +[2023-10-09 04:37:43,335][60144] Updated weights for policy 1, policy_version 12082 (0.0007) +[2023-10-09 04:37:43,708][60144] Updated weights for policy 1, policy_version 12092 (0.0007) +[2023-10-09 04:37:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24641536. Throughput: 0: 1714.0, 1: 1719.9. Samples: 6174398. Policy #0 lag: (min: 1.0, avg: 19.8, max: 33.0) +[2023-10-09 04:37:46,053][59242] Avg episode reward: [(0, '15.670'), (1, '17.450')] +[2023-10-09 04:37:46,093][60143] Updated weights for policy 0, policy_version 11972 (0.0010) +[2023-10-09 04:37:46,463][60143] Updated weights for policy 0, policy_version 11982 (0.0007) +[2023-10-09 04:37:46,839][60143] Updated weights for policy 0, policy_version 11992 (0.0007) +[2023-10-09 04:37:47,517][60144] Updated weights for policy 1, policy_version 12102 (0.0008) +[2023-10-09 04:37:47,878][60144] Updated weights for policy 1, policy_version 12112 (0.0009) +[2023-10-09 04:37:48,257][60144] Updated weights for policy 1, policy_version 12122 (0.0008) +[2023-10-09 04:37:50,816][60143] Updated weights for policy 0, policy_version 12002 (0.0009) +[2023-10-09 04:37:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 24707072. Throughput: 0: 1709.2, 1: 1706.1. Samples: 6183816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:37:51,052][59242] Avg episode reward: [(0, '15.700'), (1, '18.340')] +[2023-10-09 04:37:51,184][60143] Updated weights for policy 0, policy_version 12012 (0.0010) +[2023-10-09 04:37:51,559][60143] Updated weights for policy 0, policy_version 12022 (0.0010) +[2023-10-09 04:37:51,925][60143] Updated weights for policy 0, policy_version 12032 (0.0009) +[2023-10-09 04:37:52,182][60144] Updated weights for policy 1, policy_version 12132 (0.0009) +[2023-10-09 04:37:52,549][60144] Updated weights for policy 1, policy_version 12142 (0.0008) +[2023-10-09 04:37:52,923][60144] Updated weights for policy 1, policy_version 12152 (0.0008) +[2023-10-09 04:37:55,905][60143] Updated weights for policy 0, policy_version 12042 (0.0010) +[2023-10-09 04:37:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 24772608. Throughput: 0: 1719.4, 1: 1712.3. Samples: 6205200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:37:56,053][59242] Avg episode reward: [(0, '15.750'), (1, '18.540')] +[2023-10-09 04:37:56,277][60143] Updated weights for policy 0, policy_version 12052 (0.0007) +[2023-10-09 04:37:56,646][60143] Updated weights for policy 0, policy_version 12062 (0.0009) +[2023-10-09 04:37:57,009][60144] Updated weights for policy 1, policy_version 12162 (0.0009) +[2023-10-09 04:37:57,439][60144] Updated weights for policy 1, policy_version 12172 (0.0010) +[2023-10-09 04:37:57,808][60144] Updated weights for policy 1, policy_version 12182 (0.0008) +[2023-10-09 04:37:58,180][60144] Updated weights for policy 1, policy_version 12192 (0.0010) +[2023-10-09 04:38:00,448][60143] Updated weights for policy 0, policy_version 12072 (0.0008) +[2023-10-09 04:38:00,819][60143] Updated weights for policy 0, policy_version 12082 (0.0009) +[2023-10-09 04:38:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 24838144. 
Throughput: 0: 1712.4, 1: 1737.2. Samples: 6226098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:01,053][59242] Avg episode reward: [(0, '15.330'), (1, '19.270')] +[2023-10-09 04:38:01,062][60003] Saving new best policy, reward=19.270! +[2023-10-09 04:38:01,192][60143] Updated weights for policy 0, policy_version 12092 (0.0010) +[2023-10-09 04:38:01,902][60144] Updated weights for policy 1, policy_version 12202 (0.0008) +[2023-10-09 04:38:02,271][60144] Updated weights for policy 1, policy_version 12212 (0.0008) +[2023-10-09 04:38:02,648][60144] Updated weights for policy 1, policy_version 12222 (0.0009) +[2023-10-09 04:38:04,998][60143] Updated weights for policy 0, policy_version 12102 (0.0009) +[2023-10-09 04:38:05,363][60143] Updated weights for policy 0, policy_version 12112 (0.0008) +[2023-10-09 04:38:05,726][60143] Updated weights for policy 0, policy_version 12122 (0.0009) +[2023-10-09 04:38:06,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 24936448. Throughput: 0: 1725.9, 1: 1705.9. Samples: 6236058. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 04:38:06,053][59242] Avg episode reward: [(0, '15.260'), (1, '18.860')] +[2023-10-09 04:38:06,715][60144] Updated weights for policy 1, policy_version 12232 (0.0010) +[2023-10-09 04:38:07,084][60144] Updated weights for policy 1, policy_version 12242 (0.0009) +[2023-10-09 04:38:07,465][60144] Updated weights for policy 1, policy_version 12252 (0.0008) +[2023-10-09 04:38:09,896][60143] Updated weights for policy 0, policy_version 12132 (0.0010) +[2023-10-09 04:38:10,288][60143] Updated weights for policy 0, policy_version 12142 (0.0009) +[2023-10-09 04:38:10,661][60143] Updated weights for policy 0, policy_version 12152 (0.0010) +[2023-10-09 04:38:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 25001984. Throughput: 0: 1721.2, 1: 1728.5. Samples: 6257028. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 04:38:11,052][59242] Avg episode reward: [(0, '15.180'), (1, '18.720')] +[2023-10-09 04:38:11,512][60144] Updated weights for policy 1, policy_version 12262 (0.0007) +[2023-10-09 04:38:11,874][60144] Updated weights for policy 1, policy_version 12272 (0.0009) +[2023-10-09 04:38:12,256][60144] Updated weights for policy 1, policy_version 12282 (0.0010) +[2023-10-09 04:38:14,542][60143] Updated weights for policy 0, policy_version 12162 (0.0011) +[2023-10-09 04:38:14,906][60143] Updated weights for policy 0, policy_version 12172 (0.0007) +[2023-10-09 04:38:15,286][60143] Updated weights for policy 0, policy_version 12182 (0.0008) +[2023-10-09 04:38:15,663][60143] Updated weights for policy 0, policy_version 12192 (0.0009) +[2023-10-09 04:38:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 25067520. Throughput: 0: 1691.0, 1: 1736.2. Samples: 6276964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:16,053][59242] Avg episode reward: [(0, '15.270'), (1, '18.720')] +[2023-10-09 04:38:16,266][60144] Updated weights for policy 1, policy_version 12292 (0.0008) +[2023-10-09 04:38:16,640][60144] Updated weights for policy 1, policy_version 12302 (0.0007) +[2023-10-09 04:38:16,998][60144] Updated weights for policy 1, policy_version 12312 (0.0009) +[2023-10-09 04:38:19,620][60143] Updated weights for policy 0, policy_version 12202 (0.0010) +[2023-10-09 04:38:19,999][60143] Updated weights for policy 0, policy_version 12212 (0.0010) +[2023-10-09 04:38:20,366][60143] Updated weights for policy 0, policy_version 12222 (0.0011) +[2023-10-09 04:38:20,937][60144] Updated weights for policy 1, policy_version 12322 (0.0008) +[2023-10-09 04:38:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 25133056. Throughput: 0: 1720.0, 1: 1712.5. Samples: 6287570. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:21,053][59242] Avg episode reward: [(0, '15.520'), (1, '20.370')] +[2023-10-09 04:38:21,302][60144] Updated weights for policy 1, policy_version 12332 (0.0008) +[2023-10-09 04:38:21,673][60144] Updated weights for policy 1, policy_version 12342 (0.0008) +[2023-10-09 04:38:22,034][60003] Saving new best policy, reward=20.370! +[2023-10-09 04:38:22,040][60144] Updated weights for policy 1, policy_version 12352 (0.0010) +[2023-10-09 04:38:24,413][60143] Updated weights for policy 0, policy_version 12232 (0.0009) +[2023-10-09 04:38:24,781][60143] Updated weights for policy 0, policy_version 12242 (0.0009) +[2023-10-09 04:38:25,145][60143] Updated weights for policy 0, policy_version 12252 (0.0007) +[2023-10-09 04:38:25,939][60144] Updated weights for policy 1, policy_version 12362 (0.0007) +[2023-10-09 04:38:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 25198592. Throughput: 0: 1704.9, 1: 1737.5. Samples: 6308232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:26,053][59242] Avg episode reward: [(0, '15.860'), (1, '20.410')] +[2023-10-09 04:38:26,304][60144] Updated weights for policy 1, policy_version 12372 (0.0008) +[2023-10-09 04:38:26,671][60144] Updated weights for policy 1, policy_version 12382 (0.0007) +[2023-10-09 04:38:26,750][60003] Saving new best policy, reward=20.410! +[2023-10-09 04:38:29,095][60143] Updated weights for policy 0, policy_version 12262 (0.0008) +[2023-10-09 04:38:29,469][60143] Updated weights for policy 0, policy_version 12272 (0.0008) +[2023-10-09 04:38:29,835][60143] Updated weights for policy 0, policy_version 12282 (0.0009) +[2023-10-09 04:38:30,627][60144] Updated weights for policy 1, policy_version 12392 (0.0007) +[2023-10-09 04:38:30,983][60144] Updated weights for policy 1, policy_version 12402 (0.0009) +[2023-10-09 04:38:31,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). 
Total num frames: 25264128. Throughput: 0: 1686.2, 1: 1733.7. Samples: 6328294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:31,053][59242] Avg episode reward: [(0, '16.650'), (1, '20.660')] +[2023-10-09 04:38:31,353][60144] Updated weights for policy 1, policy_version 12412 (0.0009) +[2023-10-09 04:38:31,501][60003] Saving new best policy, reward=20.660! +[2023-10-09 04:38:33,797][60143] Updated weights for policy 0, policy_version 12292 (0.0009) +[2023-10-09 04:38:34,168][60143] Updated weights for policy 0, policy_version 12302 (0.0010) +[2023-10-09 04:38:34,547][60143] Updated weights for policy 0, policy_version 12312 (0.0011) +[2023-10-09 04:38:35,323][60144] Updated weights for policy 1, policy_version 12422 (0.0011) +[2023-10-09 04:38:35,689][60144] Updated weights for policy 1, policy_version 12432 (0.0009) +[2023-10-09 04:38:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 25329664. Throughput: 0: 1720.3, 1: 1730.7. Samples: 6339110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:36,053][60144] Updated weights for policy 1, policy_version 12442 (0.0009) +[2023-10-09 04:38:36,053][59242] Avg episode reward: [(0, '17.090'), (1, '20.150')] +[2023-10-09 04:38:38,640][60143] Updated weights for policy 0, policy_version 12322 (0.0009) +[2023-10-09 04:38:39,019][60143] Updated weights for policy 0, policy_version 12332 (0.0007) +[2023-10-09 04:38:39,400][60143] Updated weights for policy 0, policy_version 12342 (0.0008) +[2023-10-09 04:38:39,764][60143] Updated weights for policy 0, policy_version 12352 (0.0008) +[2023-10-09 04:38:40,089][60144] Updated weights for policy 1, policy_version 12452 (0.0008) +[2023-10-09 04:38:40,451][60144] Updated weights for policy 1, policy_version 12462 (0.0009) +[2023-10-09 04:38:40,828][60144] Updated weights for policy 1, policy_version 12472 (0.0008) +[2023-10-09 04:38:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 25395200. Throughput: 0: 1694.3, 1: 1728.9. Samples: 6359242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:38:41,053][59242] Avg episode reward: [(0, '18.120'), (1, '18.940')] +[2023-10-09 04:38:43,673][60143] Updated weights for policy 0, policy_version 12362 (0.0008) +[2023-10-09 04:38:44,043][60143] Updated weights for policy 0, policy_version 12372 (0.0007) +[2023-10-09 04:38:44,410][60143] Updated weights for policy 0, policy_version 12382 (0.0008) +[2023-10-09 04:38:44,824][60144] Updated weights for policy 1, policy_version 12482 (0.0008) +[2023-10-09 04:38:45,233][60144] Updated weights for policy 1, policy_version 12492 (0.0008) +[2023-10-09 04:38:45,596][60144] Updated weights for policy 1, policy_version 12502 (0.0009) +[2023-10-09 04:38:45,970][60144] Updated weights for policy 1, policy_version 12512 (0.0010) +[2023-10-09 04:38:46,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 25493504. 
Throughput: 0: 1700.6, 1: 1706.8. Samples: 6379432. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 04:38:46,053][59242] Avg episode reward: [(0, '18.960'), (1, '19.650')] +[2023-10-09 04:38:46,065][59934] Saving new best policy, reward=18.960! +[2023-10-09 04:38:48,428][60143] Updated weights for policy 0, policy_version 12392 (0.0008) +[2023-10-09 04:38:48,800][60143] Updated weights for policy 0, policy_version 12402 (0.0008) +[2023-10-09 04:38:49,168][60143] Updated weights for policy 0, policy_version 12412 (0.0008) +[2023-10-09 04:38:49,970][60144] Updated weights for policy 1, policy_version 12522 (0.0007) +[2023-10-09 04:38:50,342][60144] Updated weights for policy 1, policy_version 12532 (0.0009) +[2023-10-09 04:38:50,709][60144] Updated weights for policy 1, policy_version 12542 (0.0009) +[2023-10-09 04:38:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 25559040. Throughput: 0: 1704.6, 1: 1723.1. Samples: 6390306. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 04:38:51,053][59242] Avg episode reward: [(0, '19.780'), (1, '19.550')] +[2023-10-09 04:38:51,054][59934] Saving new best policy, reward=19.780! +[2023-10-09 04:38:53,170][60143] Updated weights for policy 0, policy_version 12422 (0.0009) +[2023-10-09 04:38:53,543][60143] Updated weights for policy 0, policy_version 12432 (0.0009) +[2023-10-09 04:38:53,920][60143] Updated weights for policy 0, policy_version 12442 (0.0009) +[2023-10-09 04:38:54,522][60144] Updated weights for policy 1, policy_version 12552 (0.0009) +[2023-10-09 04:38:54,900][60144] Updated weights for policy 1, policy_version 12562 (0.0009) +[2023-10-09 04:38:55,255][60144] Updated weights for policy 1, policy_version 12572 (0.0007) +[2023-10-09 04:38:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 25624576. Throughput: 0: 1687.9, 1: 1717.2. Samples: 6410260. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 04:38:56,053][59242] Avg episode reward: [(0, '19.710'), (1, '19.460')] +[2023-10-09 04:38:57,814][60143] Updated weights for policy 0, policy_version 12452 (0.0008) +[2023-10-09 04:38:58,202][60143] Updated weights for policy 0, policy_version 12462 (0.0010) +[2023-10-09 04:38:58,565][60143] Updated weights for policy 0, policy_version 12472 (0.0009) +[2023-10-09 04:38:59,111][60144] Updated weights for policy 1, policy_version 12582 (0.0008) +[2023-10-09 04:38:59,472][60144] Updated weights for policy 1, policy_version 12592 (0.0009) +[2023-10-09 04:38:59,843][60144] Updated weights for policy 1, policy_version 12602 (0.0007) +[2023-10-09 04:39:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 25690112. Throughput: 0: 1719.8, 1: 1698.1. Samples: 6430772. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:01,052][59242] Avg episode reward: [(0, '19.940'), (1, '19.500')] +[2023-10-09 04:39:01,061][59934] Saving new best policy, reward=19.940! +[2023-10-09 04:39:02,707][60143] Updated weights for policy 0, policy_version 12482 (0.0008) +[2023-10-09 04:39:03,079][60143] Updated weights for policy 0, policy_version 12492 (0.0007) +[2023-10-09 04:39:03,452][60143] Updated weights for policy 0, policy_version 12502 (0.0010) +[2023-10-09 04:39:03,703][60144] Updated weights for policy 1, policy_version 12612 (0.0007) +[2023-10-09 04:39:03,825][60143] Updated weights for policy 0, policy_version 12512 (0.0007) +[2023-10-09 04:39:04,070][60144] Updated weights for policy 1, policy_version 12622 (0.0011) +[2023-10-09 04:39:04,441][60144] Updated weights for policy 1, policy_version 12632 (0.0010) +[2023-10-09 04:39:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 25755648. Throughput: 0: 1700.8, 1: 1729.6. Samples: 6441934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:06,053][59242] Avg episode reward: [(0, '19.620'), (1, '19.440')] +[2023-10-09 04:39:07,664][60143] Updated weights for policy 0, policy_version 12522 (0.0007) +[2023-10-09 04:39:08,037][60143] Updated weights for policy 0, policy_version 12532 (0.0009) +[2023-10-09 04:39:08,407][60143] Updated weights for policy 0, policy_version 12542 (0.0010) +[2023-10-09 04:39:08,491][60144] Updated weights for policy 1, policy_version 12642 (0.0007) +[2023-10-09 04:39:08,855][60144] Updated weights for policy 1, policy_version 12652 (0.0008) +[2023-10-09 04:39:09,232][60144] Updated weights for policy 1, policy_version 12662 (0.0010) +[2023-10-09 04:39:09,600][60144] Updated weights for policy 1, policy_version 12672 (0.0008) +[2023-10-09 04:39:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 25821184. Throughput: 0: 1704.8, 1: 1700.0. Samples: 6461446. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:11,053][59242] Avg episode reward: [(0, '19.590'), (1, '19.440')] +[2023-10-09 04:39:12,506][60143] Updated weights for policy 0, policy_version 12552 (0.0008) +[2023-10-09 04:39:12,876][60143] Updated weights for policy 0, policy_version 12562 (0.0010) +[2023-10-09 04:39:13,251][60143] Updated weights for policy 0, policy_version 12572 (0.0009) +[2023-10-09 04:39:13,572][60144] Updated weights for policy 1, policy_version 12682 (0.0008) +[2023-10-09 04:39:13,934][60144] Updated weights for policy 1, policy_version 12692 (0.0008) +[2023-10-09 04:39:14,310][60144] Updated weights for policy 1, policy_version 12702 (0.0009) +[2023-10-09 04:39:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 25886720. Throughput: 0: 1725.2, 1: 1700.0. Samples: 6482432. 
Policy #0 lag: (min: 12.0, avg: 27.5, max: 44.0) +[2023-10-09 04:39:16,053][59242] Avg episode reward: [(0, '19.650'), (1, '18.720')] +[2023-10-09 04:39:17,341][60143] Updated weights for policy 0, policy_version 12582 (0.0009) +[2023-10-09 04:39:17,708][60143] Updated weights for policy 0, policy_version 12592 (0.0010) +[2023-10-09 04:39:18,076][60143] Updated weights for policy 0, policy_version 12602 (0.0008) +[2023-10-09 04:39:18,235][60144] Updated weights for policy 1, policy_version 12712 (0.0008) +[2023-10-09 04:39:18,599][60144] Updated weights for policy 1, policy_version 12722 (0.0009) +[2023-10-09 04:39:18,971][60144] Updated weights for policy 1, policy_version 12732 (0.0010) +[2023-10-09 04:39:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 25952256. Throughput: 0: 1688.6, 1: 1716.4. Samples: 6492336. Policy #0 lag: (min: 12.0, avg: 27.5, max: 44.0) +[2023-10-09 04:39:21,052][59242] Avg episode reward: [(0, '19.280'), (1, '18.270')] +[2023-10-09 04:39:22,116][60143] Updated weights for policy 0, policy_version 12612 (0.0007) +[2023-10-09 04:39:22,495][60143] Updated weights for policy 0, policy_version 12622 (0.0007) +[2023-10-09 04:39:22,864][60143] Updated weights for policy 0, policy_version 12632 (0.0009) +[2023-10-09 04:39:22,951][60144] Updated weights for policy 1, policy_version 12742 (0.0008) +[2023-10-09 04:39:23,322][60144] Updated weights for policy 1, policy_version 12752 (0.0008) +[2023-10-09 04:39:23,686][60144] Updated weights for policy 1, policy_version 12762 (0.0010) +[2023-10-09 04:39:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 26017792. Throughput: 0: 1707.6, 1: 1703.3. Samples: 6512734. 
Policy #0 lag: (min: 12.0, avg: 27.5, max: 44.0) +[2023-10-09 04:39:26,053][59242] Avg episode reward: [(0, '19.640'), (1, '17.480')] +[2023-10-09 04:39:26,791][60143] Updated weights for policy 0, policy_version 12642 (0.0008) +[2023-10-09 04:39:27,169][60143] Updated weights for policy 0, policy_version 12652 (0.0007) +[2023-10-09 04:39:27,543][60143] Updated weights for policy 0, policy_version 12662 (0.0007) +[2023-10-09 04:39:27,564][60144] Updated weights for policy 1, policy_version 12772 (0.0008) +[2023-10-09 04:39:27,916][60143] Updated weights for policy 0, policy_version 12672 (0.0009) +[2023-10-09 04:39:27,929][60144] Updated weights for policy 1, policy_version 12782 (0.0008) +[2023-10-09 04:39:28,299][60144] Updated weights for policy 1, policy_version 12792 (0.0009) +[2023-10-09 04:39:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 26083328. Throughput: 0: 1705.7, 1: 1726.1. Samples: 6533864. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:31,053][59242] Avg episode reward: [(0, '20.480'), (1, '18.070')] +[2023-10-09 04:39:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000012672_12976128.pth... +[2023-10-09 04:39:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000012800_13107200.pth... +[2023-10-09 04:39:31,100][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000011072_11337728.pth +[2023-10-09 04:39:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000011200_11468800.pth +[2023-10-09 04:39:31,106][59934] Saving new best policy, reward=20.480! 
+[2023-10-09 04:39:31,963][60143] Updated weights for policy 0, policy_version 12682 (0.0008) +[2023-10-09 04:39:32,332][60143] Updated weights for policy 0, policy_version 12692 (0.0008) +[2023-10-09 04:39:32,362][60144] Updated weights for policy 1, policy_version 12802 (0.0009) +[2023-10-09 04:39:32,709][60143] Updated weights for policy 0, policy_version 12702 (0.0009) +[2023-10-09 04:39:32,766][60144] Updated weights for policy 1, policy_version 12812 (0.0007) +[2023-10-09 04:39:33,139][60144] Updated weights for policy 1, policy_version 12822 (0.0010) +[2023-10-09 04:39:33,505][60144] Updated weights for policy 1, policy_version 12832 (0.0008) +[2023-10-09 04:39:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 26148864. Throughput: 0: 1687.5, 1: 1703.3. Samples: 6542892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:36,053][59242] Avg episode reward: [(0, '20.520'), (1, '17.930')] +[2023-10-09 04:39:36,054][59934] Saving new best policy, reward=20.520! +[2023-10-09 04:39:36,822][60143] Updated weights for policy 0, policy_version 12712 (0.0007) +[2023-10-09 04:39:37,193][60143] Updated weights for policy 0, policy_version 12722 (0.0010) +[2023-10-09 04:39:37,427][60144] Updated weights for policy 1, policy_version 12842 (0.0007) +[2023-10-09 04:39:37,560][60143] Updated weights for policy 0, policy_version 12732 (0.0007) +[2023-10-09 04:39:37,797][60144] Updated weights for policy 1, policy_version 12852 (0.0008) +[2023-10-09 04:39:38,161][60144] Updated weights for policy 1, policy_version 12862 (0.0011) +[2023-10-09 04:39:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 26214400. Throughput: 0: 1702.6, 1: 1708.9. Samples: 6563776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:39:41,053][59242] Avg episode reward: [(0, '20.090'), (1, '16.770')] +[2023-10-09 04:39:41,510][60143] Updated weights for policy 0, policy_version 12742 (0.0008) +[2023-10-09 04:39:41,886][60143] Updated weights for policy 0, policy_version 12752 (0.0010) +[2023-10-09 04:39:42,128][60144] Updated weights for policy 1, policy_version 12872 (0.0007) +[2023-10-09 04:39:42,257][60143] Updated weights for policy 0, policy_version 12762 (0.0010) +[2023-10-09 04:39:42,502][60144] Updated weights for policy 1, policy_version 12882 (0.0007) +[2023-10-09 04:39:42,869][60144] Updated weights for policy 1, policy_version 12892 (0.0008) +[2023-10-09 04:39:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26279936. Throughput: 0: 1701.6, 1: 1732.3. Samples: 6585298. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 04:39:46,053][59242] Avg episode reward: [(0, '19.150'), (1, '16.620')] +[2023-10-09 04:39:46,278][60143] Updated weights for policy 0, policy_version 12772 (0.0010) +[2023-10-09 04:39:46,670][60143] Updated weights for policy 0, policy_version 12782 (0.0007) +[2023-10-09 04:39:46,767][60144] Updated weights for policy 1, policy_version 12902 (0.0008) +[2023-10-09 04:39:47,046][60143] Updated weights for policy 0, policy_version 12792 (0.0007) +[2023-10-09 04:39:47,133][60144] Updated weights for policy 1, policy_version 12912 (0.0007) +[2023-10-09 04:39:47,502][60144] Updated weights for policy 1, policy_version 12922 (0.0007) +[2023-10-09 04:39:51,029][60143] Updated weights for policy 0, policy_version 12802 (0.0007) +[2023-10-09 04:39:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26345472. Throughput: 0: 1691.5, 1: 1701.2. Samples: 6594608. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 04:39:51,053][59242] Avg episode reward: [(0, '19.530'), (1, '17.360')] +[2023-10-09 04:39:51,402][60143] Updated weights for policy 0, policy_version 12812 (0.0007) +[2023-10-09 04:39:51,486][60144] Updated weights for policy 1, policy_version 12932 (0.0008) +[2023-10-09 04:39:51,769][60143] Updated weights for policy 0, policy_version 12822 (0.0007) +[2023-10-09 04:39:51,849][60144] Updated weights for policy 1, policy_version 12942 (0.0008) +[2023-10-09 04:39:52,141][60143] Updated weights for policy 0, policy_version 12832 (0.0008) +[2023-10-09 04:39:52,218][60144] Updated weights for policy 1, policy_version 12952 (0.0010) +[2023-10-09 04:39:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26411008. Throughput: 0: 1700.7, 1: 1728.8. Samples: 6615776. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 04:39:56,053][59242] Avg episode reward: [(0, '19.200'), (1, '17.250')] +[2023-10-09 04:39:56,056][60143] Updated weights for policy 0, policy_version 12842 (0.0008) +[2023-10-09 04:39:56,184][60144] Updated weights for policy 1, policy_version 12962 (0.0008) +[2023-10-09 04:39:56,420][60143] Updated weights for policy 0, policy_version 12852 (0.0008) +[2023-10-09 04:39:56,541][60144] Updated weights for policy 1, policy_version 12972 (0.0007) +[2023-10-09 04:39:56,795][60143] Updated weights for policy 0, policy_version 12862 (0.0007) +[2023-10-09 04:39:56,911][60144] Updated weights for policy 1, policy_version 12982 (0.0008) +[2023-10-09 04:39:57,282][60144] Updated weights for policy 1, policy_version 12992 (0.0008) +[2023-10-09 04:40:00,777][60143] Updated weights for policy 0, policy_version 12872 (0.0008) +[2023-10-09 04:40:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26476544. Throughput: 0: 1699.5, 1: 1736.7. Samples: 6637062. 
Policy #0 lag: (min: 46.0, avg: 55.8, max: 56.0) +[2023-10-09 04:40:01,053][59242] Avg episode reward: [(0, '19.120'), (1, '16.980')] +[2023-10-09 04:40:01,141][60143] Updated weights for policy 0, policy_version 12882 (0.0009) +[2023-10-09 04:40:01,241][60144] Updated weights for policy 1, policy_version 13002 (0.0008) +[2023-10-09 04:40:01,519][60143] Updated weights for policy 0, policy_version 12892 (0.0009) +[2023-10-09 04:40:01,614][60144] Updated weights for policy 1, policy_version 13012 (0.0008) +[2023-10-09 04:40:01,986][60144] Updated weights for policy 1, policy_version 13022 (0.0009) +[2023-10-09 04:40:05,485][60143] Updated weights for policy 0, policy_version 12902 (0.0008) +[2023-10-09 04:40:05,765][60144] Updated weights for policy 1, policy_version 13032 (0.0007) +[2023-10-09 04:40:05,860][60143] Updated weights for policy 0, policy_version 12912 (0.0007) +[2023-10-09 04:40:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26542080. Throughput: 0: 1702.2, 1: 1717.2. Samples: 6646206. 
Policy #0 lag: (min: 46.0, avg: 55.8, max: 56.0) +[2023-10-09 04:40:06,053][59242] Avg episode reward: [(0, '18.940'), (1, '17.300')] +[2023-10-09 04:40:06,128][60144] Updated weights for policy 1, policy_version 13042 (0.0007) +[2023-10-09 04:40:06,228][60143] Updated weights for policy 0, policy_version 12922 (0.0008) +[2023-10-09 04:40:06,492][60144] Updated weights for policy 1, policy_version 13052 (0.0009) +[2023-10-09 04:40:10,249][60143] Updated weights for policy 0, policy_version 12932 (0.0008) +[2023-10-09 04:40:10,617][60144] Updated weights for policy 1, policy_version 13062 (0.0008) +[2023-10-09 04:40:10,623][60143] Updated weights for policy 0, policy_version 12942 (0.0009) +[2023-10-09 04:40:10,984][60144] Updated weights for policy 1, policy_version 13072 (0.0010) +[2023-10-09 04:40:10,988][60143] Updated weights for policy 0, policy_version 12952 (0.0008) +[2023-10-09 04:40:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26607616. Throughput: 0: 1705.6, 1: 1731.9. Samples: 6667418. Policy #0 lag: (min: 46.0, avg: 55.8, max: 56.0) +[2023-10-09 04:40:11,052][59242] Avg episode reward: [(0, '18.970'), (1, '17.900')] +[2023-10-09 04:40:11,351][60144] Updated weights for policy 1, policy_version 13082 (0.0007) +[2023-10-09 04:40:15,165][60143] Updated weights for policy 0, policy_version 12962 (0.0009) +[2023-10-09 04:40:15,412][60144] Updated weights for policy 1, policy_version 13092 (0.0008) +[2023-10-09 04:40:15,532][60143] Updated weights for policy 0, policy_version 12972 (0.0009) +[2023-10-09 04:40:15,780][60144] Updated weights for policy 1, policy_version 13102 (0.0009) +[2023-10-09 04:40:15,905][60143] Updated weights for policy 0, policy_version 12982 (0.0008) +[2023-10-09 04:40:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 26673152. Throughput: 0: 1696.2, 1: 1721.8. Samples: 6687672. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 33.0) +[2023-10-09 04:40:16,053][59242] Avg episode reward: [(0, '19.280'), (1, '17.940')] +[2023-10-09 04:40:16,151][60144] Updated weights for policy 1, policy_version 13112 (0.0008) +[2023-10-09 04:40:16,266][60143] Updated weights for policy 0, policy_version 12992 (0.0007) +[2023-10-09 04:40:20,145][60144] Updated weights for policy 1, policy_version 13122 (0.0008) +[2023-10-09 04:40:20,205][60143] Updated weights for policy 0, policy_version 13002 (0.0009) +[2023-10-09 04:40:20,524][60144] Updated weights for policy 1, policy_version 13132 (0.0008) +[2023-10-09 04:40:20,577][60143] Updated weights for policy 0, policy_version 13012 (0.0009) +[2023-10-09 04:40:20,894][60144] Updated weights for policy 1, policy_version 13142 (0.0009) +[2023-10-09 04:40:20,943][60143] Updated weights for policy 0, policy_version 13022 (0.0008) +[2023-10-09 04:40:21,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 26771456. Throughput: 0: 1709.7, 1: 1733.8. Samples: 6697850. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 04:40:21,053][59242] Avg episode reward: [(0, '19.880'), (1, '18.230')] +[2023-10-09 04:40:21,262][60144] Updated weights for policy 1, policy_version 13152 (0.0010) +[2023-10-09 04:40:24,811][60143] Updated weights for policy 0, policy_version 13032 (0.0008) +[2023-10-09 04:40:25,104][60144] Updated weights for policy 1, policy_version 13162 (0.0008) +[2023-10-09 04:40:25,174][60143] Updated weights for policy 0, policy_version 13042 (0.0008) +[2023-10-09 04:40:25,466][60144] Updated weights for policy 1, policy_version 13172 (0.0008) +[2023-10-09 04:40:25,547][60143] Updated weights for policy 0, policy_version 13052 (0.0009) +[2023-10-09 04:40:25,838][60144] Updated weights for policy 1, policy_version 13182 (0.0009) +[2023-10-09 04:40:26,052][59242] Fps is (10 sec: 19660.5, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 26869760. 
Throughput: 0: 1718.0, 1: 1742.3. Samples: 6719494. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 04:40:26,053][59242] Avg episode reward: [(0, '20.330'), (1, '18.180')] +[2023-10-09 04:40:29,379][60143] Updated weights for policy 0, policy_version 13062 (0.0009) +[2023-10-09 04:40:29,759][60143] Updated weights for policy 0, policy_version 13072 (0.0009) +[2023-10-09 04:40:29,857][60144] Updated weights for policy 1, policy_version 13192 (0.0007) +[2023-10-09 04:40:30,140][60143] Updated weights for policy 0, policy_version 13082 (0.0007) +[2023-10-09 04:40:30,216][60144] Updated weights for policy 1, policy_version 13202 (0.0009) +[2023-10-09 04:40:30,577][60144] Updated weights for policy 1, policy_version 13212 (0.0011) +[2023-10-09 04:40:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 26935296. Throughput: 0: 1686.1, 1: 1711.2. Samples: 6738174. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 04:40:31,053][59242] Avg episode reward: [(0, '19.760'), (1, '18.840')] +[2023-10-09 04:40:34,222][60143] Updated weights for policy 0, policy_version 13092 (0.0007) +[2023-10-09 04:40:34,586][60144] Updated weights for policy 1, policy_version 13222 (0.0009) +[2023-10-09 04:40:34,613][60143] Updated weights for policy 0, policy_version 13102 (0.0007) +[2023-10-09 04:40:34,952][60144] Updated weights for policy 1, policy_version 13232 (0.0008) +[2023-10-09 04:40:34,990][60143] Updated weights for policy 0, policy_version 13112 (0.0007) +[2023-10-09 04:40:35,327][60144] Updated weights for policy 1, policy_version 13242 (0.0008) +[2023-10-09 04:40:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 27000832. Throughput: 0: 1716.5, 1: 1733.7. Samples: 6749868. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:40:36,053][59242] Avg episode reward: [(0, '18.340'), (1, '19.500')] +[2023-10-09 04:40:38,922][60143] Updated weights for policy 0, policy_version 13122 (0.0009) +[2023-10-09 04:40:39,223][60144] Updated weights for policy 1, policy_version 13252 (0.0009) +[2023-10-09 04:40:39,287][60143] Updated weights for policy 0, policy_version 13132 (0.0007) +[2023-10-09 04:40:39,597][60144] Updated weights for policy 1, policy_version 13262 (0.0009) +[2023-10-09 04:40:39,655][60143] Updated weights for policy 0, policy_version 13142 (0.0008) +[2023-10-09 04:40:39,964][60144] Updated weights for policy 1, policy_version 13272 (0.0009) +[2023-10-09 04:40:40,024][60143] Updated weights for policy 0, policy_version 13152 (0.0008) +[2023-10-09 04:40:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 27066368. Throughput: 0: 1701.3, 1: 1721.4. Samples: 6769796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:40:41,053][59242] Avg episode reward: [(0, '18.840'), (1, '20.180')] +[2023-10-09 04:40:43,949][60144] Updated weights for policy 1, policy_version 13282 (0.0008) +[2023-10-09 04:40:44,081][60143] Updated weights for policy 0, policy_version 13162 (0.0008) +[2023-10-09 04:40:44,325][60144] Updated weights for policy 1, policy_version 13292 (0.0007) +[2023-10-09 04:40:44,450][60143] Updated weights for policy 0, policy_version 13172 (0.0010) +[2023-10-09 04:40:44,693][60144] Updated weights for policy 1, policy_version 13302 (0.0007) +[2023-10-09 04:40:44,824][60143] Updated weights for policy 0, policy_version 13182 (0.0008) +[2023-10-09 04:40:45,061][60144] Updated weights for policy 1, policy_version 13312 (0.0009) +[2023-10-09 04:40:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 27131904. Throughput: 0: 1688.1, 1: 1697.6. Samples: 6789418. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:40:46,053][59242] Avg episode reward: [(0, '18.610'), (1, '21.580')] +[2023-10-09 04:40:46,065][60003] Saving new best policy, reward=21.580! +[2023-10-09 04:40:48,779][60143] Updated weights for policy 0, policy_version 13192 (0.0008) +[2023-10-09 04:40:48,989][60144] Updated weights for policy 1, policy_version 13322 (0.0007) +[2023-10-09 04:40:49,155][60143] Updated weights for policy 0, policy_version 13202 (0.0008) +[2023-10-09 04:40:49,361][60144] Updated weights for policy 1, policy_version 13332 (0.0007) +[2023-10-09 04:40:49,528][60143] Updated weights for policy 0, policy_version 13212 (0.0007) +[2023-10-09 04:40:49,727][60144] Updated weights for policy 1, policy_version 13342 (0.0008) +[2023-10-09 04:40:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 27197440. Throughput: 0: 1714.6, 1: 1730.5. Samples: 6801236. Policy #0 lag: (min: 11.0, avg: 13.1, max: 40.0) +[2023-10-09 04:40:51,053][59242] Avg episode reward: [(0, '18.100'), (1, '21.600')] +[2023-10-09 04:40:51,054][60003] Saving new best policy, reward=21.600! +[2023-10-09 04:40:53,364][60143] Updated weights for policy 0, policy_version 13222 (0.0010) +[2023-10-09 04:40:53,745][60143] Updated weights for policy 0, policy_version 13232 (0.0009) +[2023-10-09 04:40:53,785][60144] Updated weights for policy 1, policy_version 13352 (0.0007) +[2023-10-09 04:40:54,106][60143] Updated weights for policy 0, policy_version 13242 (0.0009) +[2023-10-09 04:40:54,146][60144] Updated weights for policy 1, policy_version 13362 (0.0008) +[2023-10-09 04:40:54,508][60144] Updated weights for policy 1, policy_version 13372 (0.0008) +[2023-10-09 04:40:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 27262976. Throughput: 0: 1688.6, 1: 1698.8. Samples: 6819850. 
Policy #0 lag: (min: 11.0, avg: 13.1, max: 40.0) +[2023-10-09 04:40:56,053][59242] Avg episode reward: [(0, '17.910'), (1, '20.830')] +[2023-10-09 04:40:58,312][60143] Updated weights for policy 0, policy_version 13252 (0.0008) +[2023-10-09 04:40:58,347][60144] Updated weights for policy 1, policy_version 13382 (0.0007) +[2023-10-09 04:40:58,679][60143] Updated weights for policy 0, policy_version 13262 (0.0008) +[2023-10-09 04:40:58,707][60144] Updated weights for policy 1, policy_version 13392 (0.0008) +[2023-10-09 04:40:59,053][60143] Updated weights for policy 0, policy_version 13272 (0.0008) +[2023-10-09 04:40:59,071][60144] Updated weights for policy 1, policy_version 13402 (0.0008) +[2023-10-09 04:41:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 27328512. Throughput: 0: 1696.4, 1: 1704.8. Samples: 6840726. Policy #0 lag: (min: 11.0, avg: 13.1, max: 40.0) +[2023-10-09 04:41:01,053][59242] Avg episode reward: [(0, '18.190'), (1, '20.220')] +[2023-10-09 04:41:02,960][60144] Updated weights for policy 1, policy_version 13412 (0.0007) +[2023-10-09 04:41:03,161][60143] Updated weights for policy 0, policy_version 13282 (0.0008) +[2023-10-09 04:41:03,326][60144] Updated weights for policy 1, policy_version 13422 (0.0008) +[2023-10-09 04:41:03,530][60143] Updated weights for policy 0, policy_version 13292 (0.0007) +[2023-10-09 04:41:03,699][60144] Updated weights for policy 1, policy_version 13432 (0.0007) +[2023-10-09 04:41:03,900][60143] Updated weights for policy 0, policy_version 13302 (0.0009) +[2023-10-09 04:41:04,278][60143] Updated weights for policy 0, policy_version 13312 (0.0008) +[2023-10-09 04:41:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 27394048. Throughput: 0: 1700.4, 1: 1708.6. Samples: 6851258. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:41:06,053][59242] Avg episode reward: [(0, '18.880'), (1, '20.610')] +[2023-10-09 04:41:07,653][60144] Updated weights for policy 1, policy_version 13442 (0.0010) +[2023-10-09 04:41:08,017][60144] Updated weights for policy 1, policy_version 13452 (0.0008) +[2023-10-09 04:41:08,382][60144] Updated weights for policy 1, policy_version 13462 (0.0009) +[2023-10-09 04:41:08,566][60143] Updated weights for policy 0, policy_version 13322 (0.0010) +[2023-10-09 04:41:08,749][60144] Updated weights for policy 1, policy_version 13472 (0.0007) +[2023-10-09 04:41:08,934][60143] Updated weights for policy 0, policy_version 13332 (0.0009) +[2023-10-09 04:41:09,298][60143] Updated weights for policy 0, policy_version 13342 (0.0010) +[2023-10-09 04:41:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 27459584. Throughput: 0: 1671.0, 1: 1695.5. Samples: 6870984. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:41:11,053][59242] Avg episode reward: [(0, '18.830'), (1, '20.000')] +[2023-10-09 04:41:12,713][60144] Updated weights for policy 1, policy_version 13482 (0.0009) +[2023-10-09 04:41:13,087][60144] Updated weights for policy 1, policy_version 13492 (0.0007) +[2023-10-09 04:41:13,203][60143] Updated weights for policy 0, policy_version 13352 (0.0010) +[2023-10-09 04:41:13,453][60144] Updated weights for policy 1, policy_version 13502 (0.0010) +[2023-10-09 04:41:13,572][60143] Updated weights for policy 0, policy_version 13362 (0.0011) +[2023-10-09 04:41:13,942][60143] Updated weights for policy 0, policy_version 13372 (0.0007) +[2023-10-09 04:41:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 27525120. Throughput: 0: 1701.2, 1: 1723.2. Samples: 6892272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:41:16,053][59242] Avg episode reward: [(0, '18.400'), (1, '20.640')] +[2023-10-09 04:41:17,364][60144] Updated weights for policy 1, policy_version 13512 (0.0010) +[2023-10-09 04:41:17,735][60144] Updated weights for policy 1, policy_version 13522 (0.0008) +[2023-10-09 04:41:17,797][60143] Updated weights for policy 0, policy_version 13382 (0.0007) +[2023-10-09 04:41:18,114][60144] Updated weights for policy 1, policy_version 13532 (0.0008) +[2023-10-09 04:41:18,169][60143] Updated weights for policy 0, policy_version 13392 (0.0007) +[2023-10-09 04:41:18,548][60143] Updated weights for policy 0, policy_version 13402 (0.0007) +[2023-10-09 04:41:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 27590656. Throughput: 0: 1679.4, 1: 1699.6. Samples: 6901920. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 04:41:21,053][59242] Avg episode reward: [(0, '18.330'), (1, '20.970')] +[2023-10-09 04:41:22,010][60144] Updated weights for policy 1, policy_version 13542 (0.0008) +[2023-10-09 04:41:22,372][60144] Updated weights for policy 1, policy_version 13552 (0.0009) +[2023-10-09 04:41:22,583][60143] Updated weights for policy 0, policy_version 13412 (0.0010) +[2023-10-09 04:41:22,744][60144] Updated weights for policy 1, policy_version 13562 (0.0008) +[2023-10-09 04:41:22,977][60143] Updated weights for policy 0, policy_version 13422 (0.0007) +[2023-10-09 04:41:23,339][60143] Updated weights for policy 0, policy_version 13432 (0.0007) +[2023-10-09 04:41:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 27656192. Throughput: 0: 1680.2, 1: 1712.3. Samples: 6922458. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 04:41:26,053][59242] Avg episode reward: [(0, '18.130'), (1, '20.880')] +[2023-10-09 04:41:26,917][60144] Updated weights for policy 1, policy_version 13572 (0.0008) +[2023-10-09 04:41:27,284][60144] Updated weights for policy 1, policy_version 13582 (0.0008) +[2023-10-09 04:41:27,507][60143] Updated weights for policy 0, policy_version 13442 (0.0008) +[2023-10-09 04:41:27,646][60144] Updated weights for policy 1, policy_version 13592 (0.0008) +[2023-10-09 04:41:27,885][60143] Updated weights for policy 0, policy_version 13452 (0.0009) +[2023-10-09 04:41:28,249][60143] Updated weights for policy 0, policy_version 13462 (0.0010) +[2023-10-09 04:41:28,618][60143] Updated weights for policy 0, policy_version 13472 (0.0008) +[2023-10-09 04:41:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 27721728. Throughput: 0: 1694.8, 1: 1732.2. Samples: 6943632. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 04:41:31,053][59242] Avg episode reward: [(0, '18.490'), (1, '20.350')] +[2023-10-09 04:41:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000013472_13795328.pth... +[2023-10-09 04:41:31,067][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000013600_13926400.pth... 
+[2023-10-09 04:41:31,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000011872_12156928.pth
+[2023-10-09 04:41:31,110][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000012000_12288000.pth
+[2023-10-09 04:41:31,552][60144] Updated weights for policy 1, policy_version 13602 (0.0008)
+[2023-10-09 04:41:31,925][60144] Updated weights for policy 1, policy_version 13612 (0.0010)
+[2023-10-09 04:41:32,294][60144] Updated weights for policy 1, policy_version 13622 (0.0007)
+[2023-10-09 04:41:32,556][60143] Updated weights for policy 0, policy_version 13482 (0.0007)
+[2023-10-09 04:41:32,657][60144] Updated weights for policy 1, policy_version 13632 (0.0008)
+[2023-10-09 04:41:32,928][60143] Updated weights for policy 0, policy_version 13492 (0.0008)
+[2023-10-09 04:41:33,303][60143] Updated weights for policy 0, policy_version 13502 (0.0007)
+[2023-10-09 04:41:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 27787264. Throughput: 0: 1670.9, 1: 1701.2. Samples: 6952984. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0)
+[2023-10-09 04:41:36,053][59242] Avg episode reward: [(0, '18.140'), (1, '19.520')]
+[2023-10-09 04:41:36,683][60144] Updated weights for policy 1, policy_version 13642 (0.0007)
+[2023-10-09 04:41:37,050][60144] Updated weights for policy 1, policy_version 13652 (0.0007)
+[2023-10-09 04:41:37,337][60143] Updated weights for policy 0, policy_version 13512 (0.0008)
+[2023-10-09 04:41:37,420][60144] Updated weights for policy 1, policy_version 13662 (0.0009)
+[2023-10-09 04:41:37,706][60143] Updated weights for policy 0, policy_version 13522 (0.0008)
+[2023-10-09 04:41:38,082][60143] Updated weights for policy 0, policy_version 13532 (0.0007)
+[2023-10-09 04:41:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 27852800. Throughput: 0: 1700.6, 1: 1731.8. Samples: 6974306.
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0)
+[2023-10-09 04:41:41,053][59242] Avg episode reward: [(0, '17.750'), (1, '18.720')]
+[2023-10-09 04:41:41,382][60144] Updated weights for policy 1, policy_version 13672 (0.0007)
+[2023-10-09 04:41:41,751][60144] Updated weights for policy 1, policy_version 13682 (0.0009)
+[2023-10-09 04:41:41,981][60143] Updated weights for policy 0, policy_version 13542 (0.0008)
+[2023-10-09 04:41:42,113][60144] Updated weights for policy 1, policy_version 13692 (0.0007)
+[2023-10-09 04:41:42,353][60143] Updated weights for policy 0, policy_version 13552 (0.0010)
+[2023-10-09 04:41:42,720][60143] Updated weights for policy 0, policy_version 13562 (0.0011)
+[2023-10-09 04:41:46,031][60144] Updated weights for policy 1, policy_version 13702 (0.0008)
+[2023-10-09 04:41:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 27918336. Throughput: 0: 1701.2, 1: 1732.6. Samples: 6995246. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0)
+[2023-10-09 04:41:46,052][59242] Avg episode reward: [(0, '18.590'), (1, '18.540')]
+[2023-10-09 04:41:46,400][60144] Updated weights for policy 1, policy_version 13712 (0.0007)
+[2023-10-09 04:41:46,755][60144] Updated weights for policy 1, policy_version 13722 (0.0008)
+[2023-10-09 04:41:46,825][60143] Updated weights for policy 0, policy_version 13572 (0.0011)
+[2023-10-09 04:41:47,207][60143] Updated weights for policy 0, policy_version 13582 (0.0010)
+[2023-10-09 04:41:47,573][60143] Updated weights for policy 0, policy_version 13592 (0.0010)
+[2023-10-09 04:41:50,702][60144] Updated weights for policy 1, policy_version 13732 (0.0008)
+[2023-10-09 04:41:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 27983872. Throughput: 0: 1684.0, 1: 1726.3. Samples: 7004720.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:41:51,053][59242] Avg episode reward: [(0, '19.230'), (1, '18.810')]
+[2023-10-09 04:41:51,067][60144] Updated weights for policy 1, policy_version 13742 (0.0009)
+[2023-10-09 04:41:51,436][60144] Updated weights for policy 1, policy_version 13752 (0.0010)
+[2023-10-09 04:41:51,556][60143] Updated weights for policy 0, policy_version 13602 (0.0009)
+[2023-10-09 04:41:51,921][60143] Updated weights for policy 0, policy_version 13612 (0.0009)
+[2023-10-09 04:41:52,307][60143] Updated weights for policy 0, policy_version 13622 (0.0010)
+[2023-10-09 04:41:52,679][60143] Updated weights for policy 0, policy_version 13632 (0.0008)
+[2023-10-09 04:41:55,080][60144] Updated weights for policy 1, policy_version 13762 (0.0007)
+[2023-10-09 04:41:55,437][60144] Updated weights for policy 1, policy_version 13772 (0.0009)
+[2023-10-09 04:41:55,808][60144] Updated weights for policy 1, policy_version 13782 (0.0007)
+[2023-10-09 04:41:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 28049408. Throughput: 0: 1708.9, 1: 1739.9. Samples: 7026180.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:41:56,052][59242] Avg episode reward: [(0, '18.500'), (1, '19.250')]
+[2023-10-09 04:41:56,176][60144] Updated weights for policy 1, policy_version 13792 (0.0007)
+[2023-10-09 04:41:56,509][60143] Updated weights for policy 0, policy_version 13642 (0.0010)
+[2023-10-09 04:41:56,876][60143] Updated weights for policy 0, policy_version 13652 (0.0008)
+[2023-10-09 04:41:57,237][60143] Updated weights for policy 0, policy_version 13662 (0.0009)
+[2023-10-09 04:42:00,292][60144] Updated weights for policy 1, policy_version 13802 (0.0009)
+[2023-10-09 04:42:00,674][60144] Updated weights for policy 1, policy_version 13812 (0.0007)
+[2023-10-09 04:42:01,032][60144] Updated weights for policy 1, policy_version 13822 (0.0009)
+[2023-10-09 04:42:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 28114944. Throughput: 0: 1706.8, 1: 1722.1. Samples: 7046570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:42:01,053][59242] Avg episode reward: [(0, '17.590'), (1, '20.010')]
+[2023-10-09 04:42:01,430][60143] Updated weights for policy 0, policy_version 13672 (0.0009)
+[2023-10-09 04:42:01,799][60143] Updated weights for policy 0, policy_version 13682 (0.0007)
+[2023-10-09 04:42:02,167][60143] Updated weights for policy 0, policy_version 13692 (0.0007)
+[2023-10-09 04:42:04,848][60144] Updated weights for policy 1, policy_version 13832 (0.0008)
+[2023-10-09 04:42:05,227][60144] Updated weights for policy 1, policy_version 13842 (0.0009)
+[2023-10-09 04:42:05,589][60144] Updated weights for policy 1, policy_version 13852 (0.0008)
+[2023-10-09 04:42:06,034][60143] Updated weights for policy 0, policy_version 13702 (0.0009)
+[2023-10-09 04:42:06,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 28213248. Throughput: 0: 1699.0, 1: 1741.9. Samples: 7056758.
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0)
+[2023-10-09 04:42:06,053][59242] Avg episode reward: [(0, '17.010'), (1, '19.330')]
+[2023-10-09 04:42:06,407][60143] Updated weights for policy 0, policy_version 13712 (0.0010)
+[2023-10-09 04:42:06,776][60143] Updated weights for policy 0, policy_version 13722 (0.0007)
+[2023-10-09 04:42:09,462][60144] Updated weights for policy 1, policy_version 13862 (0.0008)
+[2023-10-09 04:42:09,821][60144] Updated weights for policy 1, policy_version 13872 (0.0010)
+[2023-10-09 04:42:10,187][60144] Updated weights for policy 1, policy_version 13882 (0.0010)
+[2023-10-09 04:42:10,923][60143] Updated weights for policy 0, policy_version 13732 (0.0008)
+[2023-10-09 04:42:11,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 28278784. Throughput: 0: 1717.4, 1: 1735.7. Samples: 7077850. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0)
+[2023-10-09 04:42:11,053][59242] Avg episode reward: [(0, '18.040'), (1, '18.080')]
+[2023-10-09 04:42:11,310][60143] Updated weights for policy 0, policy_version 13742 (0.0008)
+[2023-10-09 04:42:11,690][60143] Updated weights for policy 0, policy_version 13752 (0.0009)
+[2023-10-09 04:42:13,827][60144] Updated weights for policy 1, policy_version 13892 (0.0011)
+[2023-10-09 04:42:14,192][60144] Updated weights for policy 1, policy_version 13902 (0.0010)
+[2023-10-09 04:42:14,563][60144] Updated weights for policy 1, policy_version 13912 (0.0007)
+[2023-10-09 04:42:15,613][60143] Updated weights for policy 0, policy_version 13762 (0.0010)
+[2023-10-09 04:42:15,980][60143] Updated weights for policy 0, policy_version 13772 (0.0007)
+[2023-10-09 04:42:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 28344320. Throughput: 0: 1714.3, 1: 1725.6. Samples: 7098428.
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0)
+[2023-10-09 04:42:16,053][59242] Avg episode reward: [(0, '17.980'), (1, '17.240')]
+[2023-10-09 04:42:16,357][60143] Updated weights for policy 0, policy_version 13782 (0.0009)
+[2023-10-09 04:42:16,729][60143] Updated weights for policy 0, policy_version 13792 (0.0007)
+[2023-10-09 04:42:18,623][60144] Updated weights for policy 1, policy_version 13922 (0.0008)
+[2023-10-09 04:42:18,989][60144] Updated weights for policy 1, policy_version 13932 (0.0008)
+[2023-10-09 04:42:19,353][60144] Updated weights for policy 1, policy_version 13942 (0.0007)
+[2023-10-09 04:42:19,724][60144] Updated weights for policy 1, policy_version 13952 (0.0008)
+[2023-10-09 04:42:20,708][60143] Updated weights for policy 0, policy_version 13802 (0.0010)
+[2023-10-09 04:42:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 28409856. Throughput: 0: 1712.9, 1: 1755.6. Samples: 7109066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:42:21,052][59242] Avg episode reward: [(0, '18.040'), (1, '16.480')]
+[2023-10-09 04:42:21,078][60143] Updated weights for policy 0, policy_version 13812 (0.0011)
+[2023-10-09 04:42:21,455][60143] Updated weights for policy 0, policy_version 13822 (0.0009)
+[2023-10-09 04:42:23,569][60144] Updated weights for policy 1, policy_version 13962 (0.0009)
+[2023-10-09 04:42:23,934][60144] Updated weights for policy 1, policy_version 13972 (0.0007)
+[2023-10-09 04:42:24,295][60144] Updated weights for policy 1, policy_version 13982 (0.0009)
+[2023-10-09 04:42:25,449][60143] Updated weights for policy 0, policy_version 13832 (0.0007)
+[2023-10-09 04:42:25,818][60143] Updated weights for policy 0, policy_version 13842 (0.0007)
+[2023-10-09 04:42:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 28475392. Throughput: 0: 1712.5, 1: 1728.8. Samples: 7129168.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:42:26,053][59242] Avg episode reward: [(0, '17.710'), (1, '17.250')]
+[2023-10-09 04:42:26,185][60143] Updated weights for policy 0, policy_version 13852 (0.0007)
+[2023-10-09 04:42:28,221][60144] Updated weights for policy 1, policy_version 13992 (0.0007)
+[2023-10-09 04:42:28,585][60144] Updated weights for policy 1, policy_version 14002 (0.0007)
+[2023-10-09 04:42:28,954][60144] Updated weights for policy 1, policy_version 14012 (0.0009)
+[2023-10-09 04:42:30,113][60143] Updated weights for policy 0, policy_version 13862 (0.0007)
+[2023-10-09 04:42:30,482][60143] Updated weights for policy 0, policy_version 13872 (0.0010)
+[2023-10-09 04:42:30,858][60143] Updated weights for policy 0, policy_version 13882 (0.0010)
+[2023-10-09 04:42:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 28540928. Throughput: 0: 1705.1, 1: 1736.8. Samples: 7150130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:42:31,052][59242] Avg episode reward: [(0, '17.630'), (1, '17.360')]
+[2023-10-09 04:42:32,845][60144] Updated weights for policy 1, policy_version 14022 (0.0008)
+[2023-10-09 04:42:33,222][60144] Updated weights for policy 1, policy_version 14032 (0.0010)
+[2023-10-09 04:42:33,589][60144] Updated weights for policy 1, policy_version 14042 (0.0009)
+[2023-10-09 04:42:34,798][60143] Updated weights for policy 0, policy_version 13892 (0.0009)
+[2023-10-09 04:42:35,168][60143] Updated weights for policy 0, policy_version 13902 (0.0008)
+[2023-10-09 04:42:35,531][60143] Updated weights for policy 0, policy_version 13912 (0.0010)
+[2023-10-09 04:42:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 28639232. Throughput: 0: 1721.0, 1: 1741.4. Samples: 7160526.
Policy #0 lag: (min: 18.0, avg: 23.9, max: 50.0)
+[2023-10-09 04:42:36,053][59242] Avg episode reward: [(0, '18.550'), (1, '17.900')]
+[2023-10-09 04:42:37,561][60144] Updated weights for policy 1, policy_version 14052 (0.0008)
+[2023-10-09 04:42:37,923][60144] Updated weights for policy 1, policy_version 14062 (0.0008)
+[2023-10-09 04:42:38,291][60144] Updated weights for policy 1, policy_version 14072 (0.0007)
+[2023-10-09 04:42:39,611][60143] Updated weights for policy 0, policy_version 13922 (0.0009)
+[2023-10-09 04:42:39,980][60143] Updated weights for policy 0, policy_version 13932 (0.0007)
+[2023-10-09 04:42:40,348][60143] Updated weights for policy 0, policy_version 13942 (0.0008)
+[2023-10-09 04:42:40,708][60143] Updated weights for policy 0, policy_version 13952 (0.0008)
+[2023-10-09 04:42:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 28704768. Throughput: 0: 1721.0, 1: 1733.3. Samples: 7181622. Policy #0 lag: (min: 18.0, avg: 23.9, max: 50.0)
+[2023-10-09 04:42:41,053][59242] Avg episode reward: [(0, '19.460'), (1, '18.590')]
+[2023-10-09 04:42:42,069][60144] Updated weights for policy 1, policy_version 14082 (0.0008)
+[2023-10-09 04:42:42,440][60144] Updated weights for policy 1, policy_version 14092 (0.0008)
+[2023-10-09 04:42:42,819][60144] Updated weights for policy 1, policy_version 14102 (0.0008)
+[2023-10-09 04:42:43,192][60144] Updated weights for policy 1, policy_version 14112 (0.0009)
+[2023-10-09 04:42:44,743][60143] Updated weights for policy 0, policy_version 13962 (0.0010)
+[2023-10-09 04:42:45,105][60143] Updated weights for policy 0, policy_version 13972 (0.0009)
+[2023-10-09 04:42:45,488][60143] Updated weights for policy 0, policy_version 13982 (0.0009)
+[2023-10-09 04:42:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 28770304. Throughput: 0: 1689.9, 1: 1756.5. Samples: 7201656.
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0)
+[2023-10-09 04:42:46,052][59242] Avg episode reward: [(0, '18.300'), (1, '17.640')]
+[2023-10-09 04:42:47,219][60144] Updated weights for policy 1, policy_version 14122 (0.0007)
+[2023-10-09 04:42:47,580][60144] Updated weights for policy 1, policy_version 14132 (0.0008)
+[2023-10-09 04:42:47,952][60144] Updated weights for policy 1, policy_version 14142 (0.0007)
+[2023-10-09 04:42:49,472][60143] Updated weights for policy 0, policy_version 13992 (0.0008)
+[2023-10-09 04:42:49,844][60143] Updated weights for policy 0, policy_version 14002 (0.0007)
+[2023-10-09 04:42:50,219][60143] Updated weights for policy 0, policy_version 14012 (0.0007)
+[2023-10-09 04:42:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 28835840. Throughput: 0: 1716.6, 1: 1732.6. Samples: 7211972. Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0)
+[2023-10-09 04:42:51,053][59242] Avg episode reward: [(0, '18.420'), (1, '18.170')]
+[2023-10-09 04:42:51,874][60144] Updated weights for policy 1, policy_version 14152 (0.0009)
+[2023-10-09 04:42:52,243][60144] Updated weights for policy 1, policy_version 14162 (0.0007)
+[2023-10-09 04:42:52,617][60144] Updated weights for policy 1, policy_version 14172 (0.0007)
+[2023-10-09 04:42:54,310][60143] Updated weights for policy 0, policy_version 14022 (0.0008)
+[2023-10-09 04:42:54,681][60143] Updated weights for policy 0, policy_version 14032 (0.0009)
+[2023-10-09 04:42:55,054][60143] Updated weights for policy 0, policy_version 14042 (0.0009)
+[2023-10-09 04:42:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 28901376. Throughput: 0: 1700.9, 1: 1746.1. Samples: 7232968.
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0)
+[2023-10-09 04:42:56,053][59242] Avg episode reward: [(0, '18.640'), (1, '17.770')]
+[2023-10-09 04:42:56,418][60144] Updated weights for policy 1, policy_version 14182 (0.0009)
+[2023-10-09 04:42:56,784][60144] Updated weights for policy 1, policy_version 14192 (0.0011)
+[2023-10-09 04:42:57,154][60144] Updated weights for policy 1, policy_version 14202 (0.0007)
+[2023-10-09 04:42:58,992][60143] Updated weights for policy 0, policy_version 14052 (0.0009)
+[2023-10-09 04:42:59,387][60143] Updated weights for policy 0, policy_version 14062 (0.0007)
+[2023-10-09 04:42:59,756][60143] Updated weights for policy 0, policy_version 14072 (0.0008)
+[2023-10-09 04:43:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 28966912. Throughput: 0: 1684.0, 1: 1760.8. Samples: 7253440. Policy #0 lag: (min: 3.0, avg: 13.8, max: 35.0)
+[2023-10-09 04:43:01,052][59242] Avg episode reward: [(0, '18.930'), (1, '17.660')]
+[2023-10-09 04:43:01,124][60144] Updated weights for policy 1, policy_version 14212 (0.0009)
+[2023-10-09 04:43:01,497][60144] Updated weights for policy 1, policy_version 14222 (0.0008)
+[2023-10-09 04:43:01,862][60144] Updated weights for policy 1, policy_version 14232 (0.0007)
+[2023-10-09 04:43:03,786][60143] Updated weights for policy 0, policy_version 14082 (0.0009)
+[2023-10-09 04:43:04,149][60143] Updated weights for policy 0, policy_version 14092 (0.0011)
+[2023-10-09 04:43:04,529][60143] Updated weights for policy 0, policy_version 14102 (0.0009)
+[2023-10-09 04:43:04,899][60143] Updated weights for policy 0, policy_version 14112 (0.0007)
+[2023-10-09 04:43:05,718][60144] Updated weights for policy 1, policy_version 14242 (0.0007)
+[2023-10-09 04:43:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 29032448. Throughput: 0: 1709.2, 1: 1730.3. Samples: 7263844.
Policy #0 lag: (min: 3.0, avg: 13.8, max: 35.0)
+[2023-10-09 04:43:06,052][59242] Avg episode reward: [(0, '20.190'), (1, '19.390')]
+[2023-10-09 04:43:06,075][60144] Updated weights for policy 1, policy_version 14252 (0.0007)
+[2023-10-09 04:43:06,446][60144] Updated weights for policy 1, policy_version 14262 (0.0009)
+[2023-10-09 04:43:06,813][60144] Updated weights for policy 1, policy_version 14272 (0.0007)
+[2023-10-09 04:43:08,795][60143] Updated weights for policy 0, policy_version 14122 (0.0008)
+[2023-10-09 04:43:09,160][60143] Updated weights for policy 0, policy_version 14132 (0.0009)
+[2023-10-09 04:43:09,532][60143] Updated weights for policy 0, policy_version 14142 (0.0008)
+[2023-10-09 04:43:10,485][60144] Updated weights for policy 1, policy_version 14282 (0.0008)
+[2023-10-09 04:43:10,861][60144] Updated weights for policy 1, policy_version 14292 (0.0008)
+[2023-10-09 04:43:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 29097984. Throughput: 0: 1681.8, 1: 1764.0. Samples: 7284232. Policy #0 lag: (min: 3.0, avg: 13.8, max: 35.0)
+[2023-10-09 04:43:11,053][59242] Avg episode reward: [(0, '20.360'), (1, '19.930')]
+[2023-10-09 04:43:11,232][60144] Updated weights for policy 1, policy_version 14302 (0.0010)
+[2023-10-09 04:43:13,372][60143] Updated weights for policy 0, policy_version 14152 (0.0008)
+[2023-10-09 04:43:13,736][60143] Updated weights for policy 0, policy_version 14162 (0.0008)
+[2023-10-09 04:43:14,105][60143] Updated weights for policy 0, policy_version 14172 (0.0007)
+[2023-10-09 04:43:15,063][60144] Updated weights for policy 1, policy_version 14312 (0.0007)
+[2023-10-09 04:43:15,427][60144] Updated weights for policy 1, policy_version 14322 (0.0007)
+[2023-10-09 04:43:15,799][60144] Updated weights for policy 1, policy_version 14332 (0.0007)
+[2023-10-09 04:43:16,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 29196288.
Throughput: 0: 1693.7, 1: 1748.9. Samples: 7305046. Policy #0 lag: (min: 30.0, avg: 40.5, max: 62.0)
+[2023-10-09 04:43:16,053][59242] Avg episode reward: [(0, '20.520'), (1, '20.290')]
+[2023-10-09 04:43:18,225][60143] Updated weights for policy 0, policy_version 14182 (0.0007)
+[2023-10-09 04:43:18,596][60143] Updated weights for policy 0, policy_version 14192 (0.0011)
+[2023-10-09 04:43:18,948][60143] Updated weights for policy 0, policy_version 14202 (0.0008)
+[2023-10-09 04:43:19,842][60144] Updated weights for policy 1, policy_version 14342 (0.0009)
+[2023-10-09 04:43:20,204][60144] Updated weights for policy 1, policy_version 14352 (0.0009)
+[2023-10-09 04:43:20,580][60144] Updated weights for policy 1, policy_version 14362 (0.0009)
+[2023-10-09 04:43:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 29261824. Throughput: 0: 1694.5, 1: 1757.5. Samples: 7315868. Policy #0 lag: (min: 30.0, avg: 40.5, max: 62.0)
+[2023-10-09 04:43:21,052][59242] Avg episode reward: [(0, '20.670'), (1, '20.240')]
+[2023-10-09 04:43:21,053][59934] Saving new best policy, reward=20.670!
+[2023-10-09 04:43:23,007][60143] Updated weights for policy 0, policy_version 14212 (0.0009)
+[2023-10-09 04:43:23,383][60143] Updated weights for policy 0, policy_version 14222 (0.0010)
+[2023-10-09 04:43:23,743][60143] Updated weights for policy 0, policy_version 14232 (0.0009)
+[2023-10-09 04:43:24,395][60144] Updated weights for policy 1, policy_version 14372 (0.0010)
+[2023-10-09 04:43:24,768][60144] Updated weights for policy 1, policy_version 14382 (0.0011)
+[2023-10-09 04:43:25,135][60144] Updated weights for policy 1, policy_version 14392 (0.0010)
+[2023-10-09 04:43:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 29327360. Throughput: 0: 1675.7, 1: 1752.1. Samples: 7335874.
Policy #0 lag: (min: 30.0, avg: 40.5, max: 62.0)
+[2023-10-09 04:43:26,053][59242] Avg episode reward: [(0, '20.180'), (1, '20.180')]
+[2023-10-09 04:43:27,692][60143] Updated weights for policy 0, policy_version 14242 (0.0007)
+[2023-10-09 04:43:28,052][60143] Updated weights for policy 0, policy_version 14252 (0.0008)
+[2023-10-09 04:43:28,418][60143] Updated weights for policy 0, policy_version 14262 (0.0011)
+[2023-10-09 04:43:28,791][60143] Updated weights for policy 0, policy_version 14272 (0.0008)
+[2023-10-09 04:43:28,996][60144] Updated weights for policy 1, policy_version 14402 (0.0011)
+[2023-10-09 04:43:29,373][60144] Updated weights for policy 1, policy_version 14412 (0.0009)
+[2023-10-09 04:43:29,740][60144] Updated weights for policy 1, policy_version 14422 (0.0007)
+[2023-10-09 04:43:30,111][60144] Updated weights for policy 1, policy_version 14432 (0.0007)
+[2023-10-09 04:43:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 29392896. Throughput: 0: 1710.8, 1: 1730.1. Samples: 7356498. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0)
+[2023-10-09 04:43:31,053][59242] Avg episode reward: [(0, '20.520'), (1, '21.290')]
+[2023-10-09 04:43:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000014272_14614528.pth...
+[2023-10-09 04:43:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000014432_14778368.pth...
+[2023-10-09 04:43:31,094][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000012672_12976128.pth
+[2023-10-09 04:43:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000012800_13107200.pth
+[2023-10-09 04:43:32,767][60143] Updated weights for policy 0, policy_version 14282 (0.0007)
+[2023-10-09 04:43:33,135][60143] Updated weights for policy 0, policy_version 14292 (0.0009)
+[2023-10-09 04:43:33,514][60143] Updated weights for policy 0, policy_version 14302 (0.0011)
+[2023-10-09 04:43:34,142][60144] Updated weights for policy 1, policy_version 14442 (0.0007)
+[2023-10-09 04:43:34,506][60144] Updated weights for policy 1, policy_version 14452 (0.0008)
+[2023-10-09 04:43:34,875][60144] Updated weights for policy 1, policy_version 14462 (0.0008)
+[2023-10-09 04:43:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 29458432. Throughput: 0: 1686.9, 1: 1766.0. Samples: 7367352. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0)
+[2023-10-09 04:43:36,052][59242] Avg episode reward: [(0, '19.460'), (1, '21.550')]
+[2023-10-09 04:43:37,642][60143] Updated weights for policy 0, policy_version 14312 (0.0010)
+[2023-10-09 04:43:38,011][60143] Updated weights for policy 0, policy_version 14322 (0.0010)
+[2023-10-09 04:43:38,379][60143] Updated weights for policy 0, policy_version 14332 (0.0008)
+[2023-10-09 04:43:38,669][60144] Updated weights for policy 1, policy_version 14472 (0.0008)
+[2023-10-09 04:43:39,040][60144] Updated weights for policy 1, policy_version 14482 (0.0008)
+[2023-10-09 04:43:39,397][60144] Updated weights for policy 1, policy_version 14492 (0.0009)
+[2023-10-09 04:43:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29523968. Throughput: 0: 1694.0, 1: 1735.8. Samples: 7387310.
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0)
+[2023-10-09 04:43:41,053][59242] Avg episode reward: [(0, '19.550'), (1, '21.550')]
+[2023-10-09 04:43:42,252][60143] Updated weights for policy 0, policy_version 14342 (0.0009)
+[2023-10-09 04:43:42,633][60143] Updated weights for policy 0, policy_version 14352 (0.0010)
+[2023-10-09 04:43:42,996][60143] Updated weights for policy 0, policy_version 14362 (0.0008)
+[2023-10-09 04:43:43,398][60144] Updated weights for policy 1, policy_version 14502 (0.0008)
+[2023-10-09 04:43:43,766][60144] Updated weights for policy 1, policy_version 14512 (0.0008)
+[2023-10-09 04:43:44,145][60144] Updated weights for policy 1, policy_version 14522 (0.0008)
+[2023-10-09 04:43:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29589504. Throughput: 0: 1715.1, 1: 1728.0. Samples: 7408380. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0)
+[2023-10-09 04:43:46,053][59242] Avg episode reward: [(0, '19.260'), (1, '22.190')]
+[2023-10-09 04:43:46,065][60003] Saving new best policy, reward=22.190!
+[2023-10-09 04:43:47,037][60143] Updated weights for policy 0, policy_version 14372 (0.0009)
+[2023-10-09 04:43:47,428][60143] Updated weights for policy 0, policy_version 14382 (0.0009)
+[2023-10-09 04:43:47,797][60143] Updated weights for policy 0, policy_version 14392 (0.0008)
+[2023-10-09 04:43:48,198][60144] Updated weights for policy 1, policy_version 14532 (0.0009)
+[2023-10-09 04:43:48,572][60144] Updated weights for policy 1, policy_version 14542 (0.0009)
+[2023-10-09 04:43:48,940][60144] Updated weights for policy 1, policy_version 14552 (0.0009)
+[2023-10-09 04:43:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 29655040. Throughput: 0: 1686.6, 1: 1746.1. Samples: 7418314.
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0)
+[2023-10-09 04:43:51,053][59242] Avg episode reward: [(0, '19.950'), (1, '21.740')]
+[2023-10-09 04:43:51,890][60143] Updated weights for policy 0, policy_version 14402 (0.0010)
+[2023-10-09 04:43:52,256][60143] Updated weights for policy 0, policy_version 14412 (0.0007)
+[2023-10-09 04:43:52,630][60143] Updated weights for policy 0, policy_version 14422 (0.0009)
+[2023-10-09 04:43:52,967][60144] Updated weights for policy 1, policy_version 14562 (0.0008)
+[2023-10-09 04:43:53,002][60143] Updated weights for policy 0, policy_version 14432 (0.0007)
+[2023-10-09 04:43:53,331][60144] Updated weights for policy 1, policy_version 14572 (0.0008)
+[2023-10-09 04:43:53,699][60144] Updated weights for policy 1, policy_version 14582 (0.0008)
+[2023-10-09 04:43:54,072][60144] Updated weights for policy 1, policy_version 14592 (0.0007)
+[2023-10-09 04:43:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29720576. Throughput: 0: 1716.5, 1: 1718.7. Samples: 7438816. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0)
+[2023-10-09 04:43:56,053][59242] Avg episode reward: [(0, '19.710'), (1, '21.540')]
+[2023-10-09 04:43:56,925][60143] Updated weights for policy 0, policy_version 14442 (0.0009)
+[2023-10-09 04:43:57,289][60143] Updated weights for policy 0, policy_version 14452 (0.0009)
+[2023-10-09 04:43:57,660][60143] Updated weights for policy 0, policy_version 14462 (0.0008)
+[2023-10-09 04:43:57,970][60144] Updated weights for policy 1, policy_version 14602 (0.0008)
+[2023-10-09 04:43:58,345][60144] Updated weights for policy 1, policy_version 14612 (0.0009)
+[2023-10-09 04:43:58,710][60144] Updated weights for policy 1, policy_version 14622 (0.0009)
+[2023-10-09 04:44:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29786112. Throughput: 0: 1716.9, 1: 1729.8. Samples: 7460148.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:44:01,053][59242] Avg episode reward: [(0, '19.330'), (1, '21.450')]
+[2023-10-09 04:44:01,471][60143] Updated weights for policy 0, policy_version 14472 (0.0008)
+[2023-10-09 04:44:01,838][60143] Updated weights for policy 0, policy_version 14482 (0.0009)
+[2023-10-09 04:44:02,208][60143] Updated weights for policy 0, policy_version 14492 (0.0010)
+[2023-10-09 04:44:02,630][60144] Updated weights for policy 1, policy_version 14632 (0.0010)
+[2023-10-09 04:44:03,001][60144] Updated weights for policy 1, policy_version 14642 (0.0010)
+[2023-10-09 04:44:03,363][60144] Updated weights for policy 1, policy_version 14652 (0.0009)
+[2023-10-09 04:44:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29851648. Throughput: 0: 1700.3, 1: 1712.2. Samples: 7469432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:44:06,053][59242] Avg episode reward: [(0, '20.070'), (1, '20.970')]
+[2023-10-09 04:44:06,063][60143] Updated weights for policy 0, policy_version 14502 (0.0007)
+[2023-10-09 04:44:06,439][60143] Updated weights for policy 0, policy_version 14512 (0.0008)
+[2023-10-09 04:44:06,812][60143] Updated weights for policy 0, policy_version 14522 (0.0008)
+[2023-10-09 04:44:07,420][60144] Updated weights for policy 1, policy_version 14662 (0.0008)
+[2023-10-09 04:44:07,778][60144] Updated weights for policy 1, policy_version 14672 (0.0008)
+[2023-10-09 04:44:08,147][60144] Updated weights for policy 1, policy_version 14682 (0.0009)
+[2023-10-09 04:44:10,840][60143] Updated weights for policy 0, policy_version 14532 (0.0008)
+[2023-10-09 04:44:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 29917184. Throughput: 0: 1720.4, 1: 1717.8. Samples: 7490594.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:44:11,053][59242] Avg episode reward: [(0, '20.360'), (1, '21.070')]
+[2023-10-09 04:44:11,215][60143] Updated weights for policy 0, policy_version 14542 (0.0007)
+[2023-10-09 04:44:11,576][60143] Updated weights for policy 0, policy_version 14552 (0.0007)
+[2023-10-09 04:44:12,096][60144] Updated weights for policy 1, policy_version 14692 (0.0008)
+[2023-10-09 04:44:12,459][60144] Updated weights for policy 1, policy_version 14702 (0.0008)
+[2023-10-09 04:44:12,834][60144] Updated weights for policy 1, policy_version 14712 (0.0008)
+[2023-10-09 04:44:15,639][60143] Updated weights for policy 0, policy_version 14562 (0.0009)
+[2023-10-09 04:44:16,011][60143] Updated weights for policy 0, policy_version 14572 (0.0009)
+[2023-10-09 04:44:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 29982720. Throughput: 0: 1719.3, 1: 1735.7. Samples: 7511974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0)
+[2023-10-09 04:44:16,053][59242] Avg episode reward: [(0, '20.220'), (1, '22.080')]
+[2023-10-09 04:44:16,391][60143] Updated weights for policy 0, policy_version 14582 (0.0008)
+[2023-10-09 04:44:16,754][60143] Updated weights for policy 0, policy_version 14592 (0.0010)
+[2023-10-09 04:44:16,796][60144] Updated weights for policy 1, policy_version 14722 (0.0008)
+[2023-10-09 04:44:17,157][60144] Updated weights for policy 1, policy_version 14732 (0.0010)
+[2023-10-09 04:44:17,534][60144] Updated weights for policy 1, policy_version 14742 (0.0010)
+[2023-10-09 04:44:17,905][60144] Updated weights for policy 1, policy_version 14752 (0.0011)
+[2023-10-09 04:44:20,768][60143] Updated weights for policy 0, policy_version 14602 (0.0008)
+[2023-10-09 04:44:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 30048256. Throughput: 0: 1717.5, 1: 1703.3. Samples: 7521290.
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:44:21,053][59242] Avg episode reward: [(0, '19.670'), (1, '21.400')] +[2023-10-09 04:44:21,143][60143] Updated weights for policy 0, policy_version 14612 (0.0007) +[2023-10-09 04:44:21,511][60143] Updated weights for policy 0, policy_version 14622 (0.0007) +[2023-10-09 04:44:21,810][60144] Updated weights for policy 1, policy_version 14762 (0.0008) +[2023-10-09 04:44:22,181][60144] Updated weights for policy 1, policy_version 14772 (0.0009) +[2023-10-09 04:44:22,547][60144] Updated weights for policy 1, policy_version 14782 (0.0009) +[2023-10-09 04:44:25,334][60143] Updated weights for policy 0, policy_version 14632 (0.0007) +[2023-10-09 04:44:25,700][60143] Updated weights for policy 0, policy_version 14642 (0.0009) +[2023-10-09 04:44:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 30113792. Throughput: 0: 1724.7, 1: 1730.7. Samples: 7542802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:44:26,052][59242] Avg episode reward: [(0, '18.890'), (1, '21.590')] +[2023-10-09 04:44:26,071][60143] Updated weights for policy 0, policy_version 14652 (0.0007) +[2023-10-09 04:44:26,350][60144] Updated weights for policy 1, policy_version 14792 (0.0009) +[2023-10-09 04:44:26,716][60144] Updated weights for policy 1, policy_version 14802 (0.0009) +[2023-10-09 04:44:27,078][60144] Updated weights for policy 1, policy_version 14812 (0.0010) +[2023-10-09 04:44:30,021][60143] Updated weights for policy 0, policy_version 14662 (0.0009) +[2023-10-09 04:44:30,393][60143] Updated weights for policy 0, policy_version 14672 (0.0008) +[2023-10-09 04:44:30,767][60143] Updated weights for policy 0, policy_version 14682 (0.0009) +[2023-10-09 04:44:31,052][59242] Fps is (10 sec: 16383.4, 60 sec: 13653.2, 300 sec: 13773.7). Total num frames: 30212096. Throughput: 0: 1712.3, 1: 1728.8. Samples: 7563230. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:44:31,053][59242] Avg episode reward: [(0, '19.340'), (1, '21.620')] +[2023-10-09 04:44:31,163][60144] Updated weights for policy 1, policy_version 14822 (0.0010) +[2023-10-09 04:44:31,537][60144] Updated weights for policy 1, policy_version 14832 (0.0007) +[2023-10-09 04:44:31,900][60144] Updated weights for policy 1, policy_version 14842 (0.0008) +[2023-10-09 04:44:34,729][60143] Updated weights for policy 0, policy_version 14692 (0.0009) +[2023-10-09 04:44:35,126][60143] Updated weights for policy 0, policy_version 14702 (0.0010) +[2023-10-09 04:44:35,499][60143] Updated weights for policy 0, policy_version 14712 (0.0009) +[2023-10-09 04:44:35,897][60144] Updated weights for policy 1, policy_version 14852 (0.0008) +[2023-10-09 04:44:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 30277632. Throughput: 0: 1736.4, 1: 1713.3. Samples: 7573552. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:44:36,052][59242] Avg episode reward: [(0, '19.780'), (1, '23.420')] +[2023-10-09 04:44:36,268][60144] Updated weights for policy 1, policy_version 14862 (0.0007) +[2023-10-09 04:44:36,631][60144] Updated weights for policy 1, policy_version 14872 (0.0008) +[2023-10-09 04:44:36,915][60003] Saving new best policy, reward=23.420! 
+[2023-10-09 04:44:39,441][60143] Updated weights for policy 0, policy_version 14722 (0.0008) +[2023-10-09 04:44:39,806][60143] Updated weights for policy 0, policy_version 14732 (0.0007) +[2023-10-09 04:44:40,182][60143] Updated weights for policy 0, policy_version 14742 (0.0008) +[2023-10-09 04:44:40,372][60144] Updated weights for policy 1, policy_version 14882 (0.0010) +[2023-10-09 04:44:40,547][60143] Updated weights for policy 0, policy_version 14752 (0.0008) +[2023-10-09 04:44:40,736][60144] Updated weights for policy 1, policy_version 14892 (0.0007) +[2023-10-09 04:44:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 30343168. Throughput: 0: 1722.6, 1: 1734.4. Samples: 7594378. Policy #0 lag: (min: 5.0, avg: 16.5, max: 37.0) +[2023-10-09 04:44:41,053][59242] Avg episode reward: [(0, '19.840'), (1, '21.680')] +[2023-10-09 04:44:41,110][60144] Updated weights for policy 1, policy_version 14902 (0.0008) +[2023-10-09 04:44:41,479][60144] Updated weights for policy 1, policy_version 14912 (0.0008) +[2023-10-09 04:44:44,650][60143] Updated weights for policy 0, policy_version 14762 (0.0007) +[2023-10-09 04:44:45,025][60143] Updated weights for policy 0, policy_version 14772 (0.0007) +[2023-10-09 04:44:45,392][60143] Updated weights for policy 0, policy_version 14782 (0.0008) +[2023-10-09 04:44:45,393][60144] Updated weights for policy 1, policy_version 14922 (0.0008) +[2023-10-09 04:44:45,764][60144] Updated weights for policy 1, policy_version 14932 (0.0009) +[2023-10-09 04:44:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 30408704. Throughput: 0: 1693.8, 1: 1722.2. Samples: 7613870. 
Policy #0 lag: (min: 5.0, avg: 16.5, max: 37.0) +[2023-10-09 04:44:46,053][59242] Avg episode reward: [(0, '20.240'), (1, '21.700')] +[2023-10-09 04:44:46,130][60144] Updated weights for policy 1, policy_version 14942 (0.0008) +[2023-10-09 04:44:49,396][60143] Updated weights for policy 0, policy_version 14792 (0.0011) +[2023-10-09 04:44:49,762][60143] Updated weights for policy 0, policy_version 14802 (0.0008) +[2023-10-09 04:44:50,138][60143] Updated weights for policy 0, policy_version 14812 (0.0008) +[2023-10-09 04:44:50,234][60144] Updated weights for policy 1, policy_version 14952 (0.0008) +[2023-10-09 04:44:50,597][60144] Updated weights for policy 1, policy_version 14962 (0.0008) +[2023-10-09 04:44:50,968][60144] Updated weights for policy 1, policy_version 14972 (0.0010) +[2023-10-09 04:44:51,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 30474240. Throughput: 0: 1719.6, 1: 1732.7. Samples: 7624786. Policy #0 lag: (min: 5.0, avg: 16.5, max: 37.0) +[2023-10-09 04:44:51,054][59242] Avg episode reward: [(0, '21.080'), (1, '20.460')] +[2023-10-09 04:44:51,055][59934] Saving new best policy, reward=21.080! +[2023-10-09 04:44:54,024][60143] Updated weights for policy 0, policy_version 14822 (0.0008) +[2023-10-09 04:44:54,405][60143] Updated weights for policy 0, policy_version 14832 (0.0008) +[2023-10-09 04:44:54,770][60143] Updated weights for policy 0, policy_version 14842 (0.0007) +[2023-10-09 04:44:55,000][60144] Updated weights for policy 1, policy_version 14982 (0.0010) +[2023-10-09 04:44:55,376][60144] Updated weights for policy 1, policy_version 14992 (0.0008) +[2023-10-09 04:44:55,743][60144] Updated weights for policy 1, policy_version 15002 (0.0008) +[2023-10-09 04:44:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 30572544. Throughput: 0: 1703.6, 1: 1734.0. Samples: 7645286. 
Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-09 04:44:56,053][59242] Avg episode reward: [(0, '20.550'), (1, '20.900')] +[2023-10-09 04:44:58,788][60143] Updated weights for policy 0, policy_version 14852 (0.0009) +[2023-10-09 04:44:59,152][60143] Updated weights for policy 0, policy_version 14862 (0.0011) +[2023-10-09 04:44:59,523][60143] Updated weights for policy 0, policy_version 14872 (0.0010) +[2023-10-09 04:44:59,731][60144] Updated weights for policy 1, policy_version 15012 (0.0007) +[2023-10-09 04:45:00,092][60144] Updated weights for policy 1, policy_version 15022 (0.0010) +[2023-10-09 04:45:00,465][60144] Updated weights for policy 1, policy_version 15032 (0.0009) +[2023-10-09 04:45:01,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 30638080. Throughput: 0: 1688.4, 1: 1708.8. Samples: 7664848. Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-09 04:45:01,053][59242] Avg episode reward: [(0, '20.820'), (1, '22.350')] +[2023-10-09 04:45:03,588][60143] Updated weights for policy 0, policy_version 14882 (0.0008) +[2023-10-09 04:45:03,958][60143] Updated weights for policy 0, policy_version 14892 (0.0007) +[2023-10-09 04:45:04,241][60144] Updated weights for policy 1, policy_version 15042 (0.0008) +[2023-10-09 04:45:04,334][60143] Updated weights for policy 0, policy_version 14902 (0.0007) +[2023-10-09 04:45:04,598][60144] Updated weights for policy 1, policy_version 15052 (0.0007) +[2023-10-09 04:45:04,713][60143] Updated weights for policy 0, policy_version 14912 (0.0008) +[2023-10-09 04:45:04,970][60144] Updated weights for policy 1, policy_version 15062 (0.0009) +[2023-10-09 04:45:05,334][60144] Updated weights for policy 1, policy_version 15072 (0.0009) +[2023-10-09 04:45:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 30703616. Throughput: 0: 1717.1, 1: 1735.9. Samples: 7676680. 
Policy #0 lag: (min: 31.0, avg: 41.0, max: 63.0) +[2023-10-09 04:45:06,053][59242] Avg episode reward: [(0, '20.720'), (1, '21.660')] +[2023-10-09 04:45:08,743][60143] Updated weights for policy 0, policy_version 14922 (0.0009) +[2023-10-09 04:45:09,118][60143] Updated weights for policy 0, policy_version 14932 (0.0008) +[2023-10-09 04:45:09,358][60144] Updated weights for policy 1, policy_version 15082 (0.0007) +[2023-10-09 04:45:09,476][60143] Updated weights for policy 0, policy_version 14942 (0.0009) +[2023-10-09 04:45:09,721][60144] Updated weights for policy 1, policy_version 15092 (0.0008) +[2023-10-09 04:45:10,090][60144] Updated weights for policy 1, policy_version 15102 (0.0008) +[2023-10-09 04:45:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 30769152. Throughput: 0: 1682.3, 1: 1716.4. Samples: 7695742. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:45:11,053][59242] Avg episode reward: [(0, '19.310'), (1, '22.010')] +[2023-10-09 04:45:13,261][60143] Updated weights for policy 0, policy_version 14952 (0.0010) +[2023-10-09 04:45:13,629][60143] Updated weights for policy 0, policy_version 14962 (0.0007) +[2023-10-09 04:45:14,005][60143] Updated weights for policy 0, policy_version 14972 (0.0009) +[2023-10-09 04:45:14,056][60144] Updated weights for policy 1, policy_version 15112 (0.0008) +[2023-10-09 04:45:14,430][60144] Updated weights for policy 1, policy_version 15122 (0.0011) +[2023-10-09 04:45:14,798][60144] Updated weights for policy 1, policy_version 15132 (0.0011) +[2023-10-09 04:45:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 30834688. Throughput: 0: 1697.6, 1: 1701.2. Samples: 7716174. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:45:16,053][59242] Avg episode reward: [(0, '18.910'), (1, '21.020')] +[2023-10-09 04:45:18,214][60143] Updated weights for policy 0, policy_version 14982 (0.0009) +[2023-10-09 04:45:18,589][60143] Updated weights for policy 0, policy_version 14992 (0.0007) +[2023-10-09 04:45:18,743][60144] Updated weights for policy 1, policy_version 15142 (0.0010) +[2023-10-09 04:45:18,969][60143] Updated weights for policy 0, policy_version 15002 (0.0008) +[2023-10-09 04:45:19,113][60144] Updated weights for policy 1, policy_version 15152 (0.0008) +[2023-10-09 04:45:19,471][60144] Updated weights for policy 1, policy_version 15162 (0.0008) +[2023-10-09 04:45:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 30900224. Throughput: 0: 1691.5, 1: 1727.7. Samples: 7727418. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 04:45:21,053][59242] Avg episode reward: [(0, '19.560'), (1, '20.760')] +[2023-10-09 04:45:23,302][60143] Updated weights for policy 0, policy_version 15012 (0.0009) +[2023-10-09 04:45:23,585][60144] Updated weights for policy 1, policy_version 15172 (0.0008) +[2023-10-09 04:45:23,706][60143] Updated weights for policy 0, policy_version 15022 (0.0009) +[2023-10-09 04:45:23,955][60144] Updated weights for policy 1, policy_version 15182 (0.0008) +[2023-10-09 04:45:24,069][60143] Updated weights for policy 0, policy_version 15032 (0.0008) +[2023-10-09 04:45:24,328][60144] Updated weights for policy 1, policy_version 15192 (0.0009) +[2023-10-09 04:45:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 30965760. Throughput: 0: 1675.7, 1: 1701.0. Samples: 7746330. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) +[2023-10-09 04:45:26,053][59242] Avg episode reward: [(0, '19.330'), (1, '21.410')] +[2023-10-09 04:45:27,926][60143] Updated weights for policy 0, policy_version 15042 (0.0008) +[2023-10-09 04:45:28,235][60144] Updated weights for policy 1, policy_version 15202 (0.0009) +[2023-10-09 04:45:28,294][60143] Updated weights for policy 0, policy_version 15052 (0.0007) +[2023-10-09 04:45:28,605][60144] Updated weights for policy 1, policy_version 15212 (0.0009) +[2023-10-09 04:45:28,659][60143] Updated weights for policy 0, policy_version 15062 (0.0008) +[2023-10-09 04:45:28,975][60144] Updated weights for policy 1, policy_version 15222 (0.0007) +[2023-10-09 04:45:29,028][60143] Updated weights for policy 0, policy_version 15072 (0.0008) +[2023-10-09 04:45:29,335][60144] Updated weights for policy 1, policy_version 15232 (0.0007) +[2023-10-09 04:45:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 31031296. Throughput: 0: 1705.4, 1: 1711.8. Samples: 7767644. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) +[2023-10-09 04:45:31,053][59242] Avg episode reward: [(0, '19.560'), (1, '21.470')] +[2023-10-09 04:45:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000015072_15433728.pth... +[2023-10-09 04:45:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000015232_15597568.pth... 
+[2023-10-09 04:45:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000013600_13926400.pth +[2023-10-09 04:45:31,103][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000013472_13795328.pth +[2023-10-09 04:45:33,091][60143] Updated weights for policy 0, policy_version 15082 (0.0008) +[2023-10-09 04:45:33,212][60144] Updated weights for policy 1, policy_version 15242 (0.0007) +[2023-10-09 04:45:33,456][60143] Updated weights for policy 0, policy_version 15092 (0.0007) +[2023-10-09 04:45:33,583][60144] Updated weights for policy 1, policy_version 15252 (0.0007) +[2023-10-09 04:45:33,831][60143] Updated weights for policy 0, policy_version 15102 (0.0009) +[2023-10-09 04:45:33,940][60144] Updated weights for policy 1, policy_version 15262 (0.0009) +[2023-10-09 04:45:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 31096832. Throughput: 0: 1688.5, 1: 1714.0. Samples: 7777898. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) +[2023-10-09 04:45:36,053][59242] Avg episode reward: [(0, '19.290'), (1, '19.410')] +[2023-10-09 04:45:37,736][60143] Updated weights for policy 0, policy_version 15112 (0.0009) +[2023-10-09 04:45:37,943][60144] Updated weights for policy 1, policy_version 15272 (0.0008) +[2023-10-09 04:45:38,104][60143] Updated weights for policy 0, policy_version 15122 (0.0007) +[2023-10-09 04:45:38,311][60144] Updated weights for policy 1, policy_version 15282 (0.0010) +[2023-10-09 04:45:38,471][60143] Updated weights for policy 0, policy_version 15132 (0.0011) +[2023-10-09 04:45:38,678][60144] Updated weights for policy 1, policy_version 15292 (0.0008) +[2023-10-09 04:45:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 31162368. Throughput: 0: 1691.9, 1: 1698.8. Samples: 7797864. 
Policy #0 lag: (min: 45.0, avg: 55.7, max: 56.0) +[2023-10-09 04:45:41,053][59242] Avg episode reward: [(0, '20.300'), (1, '19.340')] +[2023-10-09 04:45:42,467][60143] Updated weights for policy 0, policy_version 15142 (0.0007) +[2023-10-09 04:45:42,572][60144] Updated weights for policy 1, policy_version 15302 (0.0009) +[2023-10-09 04:45:42,839][60143] Updated weights for policy 0, policy_version 15152 (0.0010) +[2023-10-09 04:45:42,941][60144] Updated weights for policy 1, policy_version 15312 (0.0008) +[2023-10-09 04:45:43,215][60143] Updated weights for policy 0, policy_version 15162 (0.0009) +[2023-10-09 04:45:43,311][60144] Updated weights for policy 1, policy_version 15322 (0.0008) +[2023-10-09 04:45:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 31227904. Throughput: 0: 1699.8, 1: 1724.3. Samples: 7818932. Policy #0 lag: (min: 45.0, avg: 55.7, max: 56.0) +[2023-10-09 04:45:46,053][59242] Avg episode reward: [(0, '21.360'), (1, '19.580')] +[2023-10-09 04:45:46,065][59934] Saving new best policy, reward=21.360! +[2023-10-09 04:45:47,264][60143] Updated weights for policy 0, policy_version 15172 (0.0007) +[2023-10-09 04:45:47,359][60144] Updated weights for policy 1, policy_version 15332 (0.0008) +[2023-10-09 04:45:47,636][60143] Updated weights for policy 0, policy_version 15182 (0.0008) +[2023-10-09 04:45:47,728][60144] Updated weights for policy 1, policy_version 15342 (0.0008) +[2023-10-09 04:45:48,018][60143] Updated weights for policy 0, policy_version 15192 (0.0009) +[2023-10-09 04:45:48,089][60144] Updated weights for policy 1, policy_version 15352 (0.0011) +[2023-10-09 04:45:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 31293440. Throughput: 0: 1669.9, 1: 1696.0. Samples: 7828148. 
Policy #0 lag: (min: 45.0, avg: 55.7, max: 56.0) +[2023-10-09 04:45:51,053][59242] Avg episode reward: [(0, '20.080'), (1, '19.670')] +[2023-10-09 04:45:52,094][60143] Updated weights for policy 0, policy_version 15202 (0.0008) +[2023-10-09 04:45:52,124][60144] Updated weights for policy 1, policy_version 15362 (0.0007) +[2023-10-09 04:45:52,455][60143] Updated weights for policy 0, policy_version 15212 (0.0008) +[2023-10-09 04:45:52,487][60144] Updated weights for policy 1, policy_version 15372 (0.0007) +[2023-10-09 04:45:52,826][60143] Updated weights for policy 0, policy_version 15222 (0.0009) +[2023-10-09 04:45:52,849][60144] Updated weights for policy 1, policy_version 15382 (0.0008) +[2023-10-09 04:45:53,195][60143] Updated weights for policy 0, policy_version 15232 (0.0010) +[2023-10-09 04:45:53,212][60144] Updated weights for policy 1, policy_version 15392 (0.0008) +[2023-10-09 04:45:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31358976. Throughput: 0: 1697.5, 1: 1711.3. Samples: 7849136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:45:56,053][59242] Avg episode reward: [(0, '19.840'), (1, '19.430')] +[2023-10-09 04:45:57,095][60143] Updated weights for policy 0, policy_version 15242 (0.0008) +[2023-10-09 04:45:57,401][60144] Updated weights for policy 1, policy_version 15402 (0.0008) +[2023-10-09 04:45:57,465][60143] Updated weights for policy 0, policy_version 15252 (0.0007) +[2023-10-09 04:45:57,755][60144] Updated weights for policy 1, policy_version 15412 (0.0007) +[2023-10-09 04:45:57,826][60143] Updated weights for policy 0, policy_version 15262 (0.0007) +[2023-10-09 04:45:58,117][60144] Updated weights for policy 1, policy_version 15422 (0.0008) +[2023-10-09 04:46:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31424512. Throughput: 0: 1697.2, 1: 1725.4. Samples: 7870192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:46:01,052][59242] Avg episode reward: [(0, '20.710'), (1, '19.300')] +[2023-10-09 04:46:01,870][60143] Updated weights for policy 0, policy_version 15272 (0.0009) +[2023-10-09 04:46:02,076][60144] Updated weights for policy 1, policy_version 15432 (0.0008) +[2023-10-09 04:46:02,237][60143] Updated weights for policy 0, policy_version 15282 (0.0008) +[2023-10-09 04:46:02,456][60144] Updated weights for policy 1, policy_version 15442 (0.0007) +[2023-10-09 04:46:02,601][60143] Updated weights for policy 0, policy_version 15292 (0.0010) +[2023-10-09 04:46:02,826][60144] Updated weights for policy 1, policy_version 15452 (0.0009) +[2023-10-09 04:46:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31490048. Throughput: 0: 1680.7, 1: 1694.4. Samples: 7879294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:46:06,053][59242] Avg episode reward: [(0, '19.840'), (1, '19.840')] +[2023-10-09 04:46:06,718][60143] Updated weights for policy 0, policy_version 15302 (0.0007) +[2023-10-09 04:46:06,742][60144] Updated weights for policy 1, policy_version 15462 (0.0008) +[2023-10-09 04:46:07,082][60143] Updated weights for policy 0, policy_version 15312 (0.0007) +[2023-10-09 04:46:07,109][60144] Updated weights for policy 1, policy_version 15472 (0.0007) +[2023-10-09 04:46:07,448][60143] Updated weights for policy 0, policy_version 15322 (0.0007) +[2023-10-09 04:46:07,478][60144] Updated weights for policy 1, policy_version 15482 (0.0008) +[2023-10-09 04:46:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31555584. Throughput: 0: 1703.3, 1: 1722.4. Samples: 7900486. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 04:46:11,053][59242] Avg episode reward: [(0, '20.180'), (1, '19.140')] +[2023-10-09 04:46:11,348][60143] Updated weights for policy 0, policy_version 15332 (0.0009) +[2023-10-09 04:46:11,488][60144] Updated weights for policy 1, policy_version 15492 (0.0007) +[2023-10-09 04:46:11,742][60143] Updated weights for policy 0, policy_version 15342 (0.0009) +[2023-10-09 04:46:11,855][60144] Updated weights for policy 1, policy_version 15502 (0.0008) +[2023-10-09 04:46:12,119][60143] Updated weights for policy 0, policy_version 15352 (0.0008) +[2023-10-09 04:46:12,226][60144] Updated weights for policy 1, policy_version 15512 (0.0009) +[2023-10-09 04:46:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31621120. Throughput: 0: 1700.7, 1: 1722.4. Samples: 7921686. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 04:46:16,053][59242] Avg episode reward: [(0, '20.660'), (1, '20.250')] +[2023-10-09 04:46:16,069][60143] Updated weights for policy 0, policy_version 15362 (0.0008) +[2023-10-09 04:46:16,219][60144] Updated weights for policy 1, policy_version 15522 (0.0007) +[2023-10-09 04:46:16,423][60143] Updated weights for policy 0, policy_version 15372 (0.0009) +[2023-10-09 04:46:16,584][60144] Updated weights for policy 1, policy_version 15532 (0.0007) +[2023-10-09 04:46:16,788][60143] Updated weights for policy 0, policy_version 15382 (0.0009) +[2023-10-09 04:46:16,946][60144] Updated weights for policy 1, policy_version 15542 (0.0008) +[2023-10-09 04:46:17,159][60143] Updated weights for policy 0, policy_version 15392 (0.0008) +[2023-10-09 04:46:17,315][60144] Updated weights for policy 1, policy_version 15552 (0.0008) +[2023-10-09 04:46:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31686656. Throughput: 0: 1690.2, 1: 1711.6. Samples: 7930978. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 04:46:21,052][59242] Avg episode reward: [(0, '20.750'), (1, '21.590')] +[2023-10-09 04:46:21,062][60144] Updated weights for policy 1, policy_version 15562 (0.0008) +[2023-10-09 04:46:21,281][60143] Updated weights for policy 0, policy_version 15402 (0.0008) +[2023-10-09 04:46:21,424][60144] Updated weights for policy 1, policy_version 15572 (0.0008) +[2023-10-09 04:46:21,653][60143] Updated weights for policy 0, policy_version 15412 (0.0007) +[2023-10-09 04:46:21,801][60144] Updated weights for policy 1, policy_version 15582 (0.0009) +[2023-10-09 04:46:22,018][60143] Updated weights for policy 0, policy_version 15422 (0.0010) +[2023-10-09 04:46:25,528][60144] Updated weights for policy 1, policy_version 15592 (0.0007) +[2023-10-09 04:46:25,901][60144] Updated weights for policy 1, policy_version 15602 (0.0007) +[2023-10-09 04:46:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 31752192. Throughput: 0: 1700.2, 1: 1733.3. Samples: 7952374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:46:26,052][59242] Avg episode reward: [(0, '21.330'), (1, '21.770')] +[2023-10-09 04:46:26,236][60143] Updated weights for policy 0, policy_version 15432 (0.0008) +[2023-10-09 04:46:26,258][60144] Updated weights for policy 1, policy_version 15612 (0.0007) +[2023-10-09 04:46:26,604][60143] Updated weights for policy 0, policy_version 15442 (0.0009) +[2023-10-09 04:46:26,981][60143] Updated weights for policy 0, policy_version 15452 (0.0010) +[2023-10-09 04:46:30,165][60144] Updated weights for policy 1, policy_version 15622 (0.0008) +[2023-10-09 04:46:30,536][60144] Updated weights for policy 1, policy_version 15632 (0.0007) +[2023-10-09 04:46:30,894][60144] Updated weights for policy 1, policy_version 15642 (0.0009) +[2023-10-09 04:46:30,936][60143] Updated weights for policy 0, policy_version 15462 (0.0009) +[2023-10-09 04:46:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 31817728. Throughput: 0: 1701.0, 1: 1719.6. Samples: 7972862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:46:31,053][59242] Avg episode reward: [(0, '21.100'), (1, '21.750')] +[2023-10-09 04:46:31,298][60143] Updated weights for policy 0, policy_version 15472 (0.0007) +[2023-10-09 04:46:31,679][60143] Updated weights for policy 0, policy_version 15482 (0.0011) +[2023-10-09 04:46:34,833][60144] Updated weights for policy 1, policy_version 15652 (0.0008) +[2023-10-09 04:46:35,195][60144] Updated weights for policy 1, policy_version 15662 (0.0008) +[2023-10-09 04:46:35,556][60144] Updated weights for policy 1, policy_version 15672 (0.0008) +[2023-10-09 04:46:35,653][60143] Updated weights for policy 0, policy_version 15492 (0.0009) +[2023-10-09 04:46:36,026][60143] Updated weights for policy 0, policy_version 15502 (0.0007) +[2023-10-09 04:46:36,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 31916032. 
Throughput: 0: 1704.1, 1: 1735.5. Samples: 7982930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:46:36,053][59242] Avg episode reward: [(0, '21.610'), (1, '21.680')] +[2023-10-09 04:46:36,393][60143] Updated weights for policy 0, policy_version 15512 (0.0008) +[2023-10-09 04:46:36,699][59934] Saving new best policy, reward=21.610! +[2023-10-09 04:46:39,475][60144] Updated weights for policy 1, policy_version 15682 (0.0007) +[2023-10-09 04:46:39,847][60144] Updated weights for policy 1, policy_version 15692 (0.0011) +[2023-10-09 04:46:40,221][60144] Updated weights for policy 1, policy_version 15702 (0.0010) +[2023-10-09 04:46:40,322][60143] Updated weights for policy 0, policy_version 15522 (0.0007) +[2023-10-09 04:46:40,584][60144] Updated weights for policy 1, policy_version 15712 (0.0008) +[2023-10-09 04:46:40,681][60143] Updated weights for policy 0, policy_version 15532 (0.0007) +[2023-10-09 04:46:41,051][60143] Updated weights for policy 0, policy_version 15542 (0.0008) +[2023-10-09 04:46:41,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 31981568. Throughput: 0: 1706.6, 1: 1735.2. Samples: 8004014. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 04:46:41,053][59242] Avg episode reward: [(0, '20.230'), (1, '22.300')] +[2023-10-09 04:46:41,420][60143] Updated weights for policy 0, policy_version 15552 (0.0008) +[2023-10-09 04:46:44,539][60144] Updated weights for policy 1, policy_version 15722 (0.0008) +[2023-10-09 04:46:44,918][60144] Updated weights for policy 1, policy_version 15732 (0.0007) +[2023-10-09 04:46:45,287][60144] Updated weights for policy 1, policy_version 15742 (0.0007) +[2023-10-09 04:46:45,338][60143] Updated weights for policy 0, policy_version 15562 (0.0007) +[2023-10-09 04:46:45,705][60143] Updated weights for policy 0, policy_version 15572 (0.0008) +[2023-10-09 04:46:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). 
Total num frames: 32047104. Throughput: 0: 1699.2, 1: 1713.2. Samples: 8023752. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 04:46:46,052][59242] Avg episode reward: [(0, '19.520'), (1, '22.250')] +[2023-10-09 04:46:46,075][60143] Updated weights for policy 0, policy_version 15582 (0.0010) +[2023-10-09 04:46:49,126][60144] Updated weights for policy 1, policy_version 15752 (0.0010) +[2023-10-09 04:46:49,493][60144] Updated weights for policy 1, policy_version 15762 (0.0011) +[2023-10-09 04:46:49,869][60144] Updated weights for policy 1, policy_version 15772 (0.0009) +[2023-10-09 04:46:50,050][60143] Updated weights for policy 0, policy_version 15592 (0.0010) +[2023-10-09 04:46:50,415][60143] Updated weights for policy 0, policy_version 15602 (0.0007) +[2023-10-09 04:46:50,789][60143] Updated weights for policy 0, policy_version 15612 (0.0009) +[2023-10-09 04:46:51,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 32145408. Throughput: 0: 1713.5, 1: 1748.4. Samples: 8035078. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:46:51,053][59242] Avg episode reward: [(0, '18.980'), (1, '22.490')] +[2023-10-09 04:46:54,031][60144] Updated weights for policy 1, policy_version 15782 (0.0010) +[2023-10-09 04:46:54,409][60144] Updated weights for policy 1, policy_version 15792 (0.0011) +[2023-10-09 04:46:54,762][60144] Updated weights for policy 1, policy_version 15802 (0.0009) +[2023-10-09 04:46:54,765][60143] Updated weights for policy 0, policy_version 15622 (0.0007) +[2023-10-09 04:46:55,140][60143] Updated weights for policy 0, policy_version 15632 (0.0007) +[2023-10-09 04:46:55,511][60143] Updated weights for policy 0, policy_version 15642 (0.0008) +[2023-10-09 04:46:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 32210944. Throughput: 0: 1721.3, 1: 1724.5. Samples: 8055546. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:46:56,053][59242] Avg episode reward: [(0, '19.210'), (1, '22.950')] +[2023-10-09 04:46:58,781][60144] Updated weights for policy 1, policy_version 15812 (0.0008) +[2023-10-09 04:46:59,151][60144] Updated weights for policy 1, policy_version 15822 (0.0010) +[2023-10-09 04:46:59,518][60144] Updated weights for policy 1, policy_version 15832 (0.0009) +[2023-10-09 04:46:59,554][60143] Updated weights for policy 0, policy_version 15652 (0.0008) +[2023-10-09 04:46:59,945][60143] Updated weights for policy 0, policy_version 15662 (0.0007) +[2023-10-09 04:47:00,319][60143] Updated weights for policy 0, policy_version 15672 (0.0008) +[2023-10-09 04:47:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 32276480. Throughput: 0: 1694.2, 1: 1713.3. Samples: 8075022. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 04:47:01,053][59242] Avg episode reward: [(0, '20.020'), (1, '23.540')] +[2023-10-09 04:47:01,061][60003] Saving new best policy, reward=23.540! +[2023-10-09 04:47:03,508][60144] Updated weights for policy 1, policy_version 15842 (0.0008) +[2023-10-09 04:47:03,879][60144] Updated weights for policy 1, policy_version 15852 (0.0008) +[2023-10-09 04:47:04,240][60144] Updated weights for policy 1, policy_version 15862 (0.0009) +[2023-10-09 04:47:04,300][60143] Updated weights for policy 0, policy_version 15682 (0.0008) +[2023-10-09 04:47:04,601][60144] Updated weights for policy 1, policy_version 15872 (0.0008) +[2023-10-09 04:47:04,667][60143] Updated weights for policy 0, policy_version 15692 (0.0009) +[2023-10-09 04:47:05,040][60143] Updated weights for policy 0, policy_version 15702 (0.0010) +[2023-10-09 04:47:05,412][60143] Updated weights for policy 0, policy_version 15712 (0.0010) +[2023-10-09 04:47:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 32342016. Throughput: 0: 1722.0, 1: 1738.4. 
Samples: 8086698. Policy #0 lag: (min: 1.0, avg: 11.0, max: 33.0) +[2023-10-09 04:47:06,053][59242] Avg episode reward: [(0, '20.260'), (1, '24.700')] +[2023-10-09 04:47:06,054][60003] Saving new best policy, reward=24.700! +[2023-10-09 04:47:08,535][60144] Updated weights for policy 1, policy_version 15882 (0.0008) +[2023-10-09 04:47:08,900][60144] Updated weights for policy 1, policy_version 15892 (0.0011) +[2023-10-09 04:47:09,264][60144] Updated weights for policy 1, policy_version 15902 (0.0008) +[2023-10-09 04:47:09,295][60143] Updated weights for policy 0, policy_version 15722 (0.0008) +[2023-10-09 04:47:09,654][60143] Updated weights for policy 0, policy_version 15732 (0.0011) +[2023-10-09 04:47:10,032][60143] Updated weights for policy 0, policy_version 15742 (0.0011) +[2023-10-09 04:47:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 32407552. Throughput: 0: 1711.7, 1: 1705.1. Samples: 8106130. Policy #0 lag: (min: 1.0, avg: 11.0, max: 33.0) +[2023-10-09 04:47:11,053][59242] Avg episode reward: [(0, '20.870'), (1, '24.470')] +[2023-10-09 04:47:13,130][60144] Updated weights for policy 1, policy_version 15912 (0.0008) +[2023-10-09 04:47:13,497][60144] Updated weights for policy 1, policy_version 15922 (0.0008) +[2023-10-09 04:47:13,856][60144] Updated weights for policy 1, policy_version 15932 (0.0009) +[2023-10-09 04:47:13,956][60143] Updated weights for policy 0, policy_version 15752 (0.0008) +[2023-10-09 04:47:14,324][60143] Updated weights for policy 0, policy_version 15762 (0.0008) +[2023-10-09 04:47:14,696][60143] Updated weights for policy 0, policy_version 15772 (0.0010) +[2023-10-09 04:47:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 32473088. Throughput: 0: 1701.6, 1: 1724.8. Samples: 8127054. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 33.0) +[2023-10-09 04:47:16,053][59242] Avg episode reward: [(0, '23.530'), (1, '25.100')] +[2023-10-09 04:47:16,065][59934] Saving new best policy, reward=23.530! +[2023-10-09 04:47:16,065][60003] Saving new best policy, reward=25.100! +[2023-10-09 04:47:17,769][60144] Updated weights for policy 1, policy_version 15942 (0.0008) +[2023-10-09 04:47:18,137][60144] Updated weights for policy 1, policy_version 15952 (0.0008) +[2023-10-09 04:47:18,507][60144] Updated weights for policy 1, policy_version 15962 (0.0007) +[2023-10-09 04:47:18,658][60143] Updated weights for policy 0, policy_version 15782 (0.0009) +[2023-10-09 04:47:19,038][60143] Updated weights for policy 0, policy_version 15792 (0.0009) +[2023-10-09 04:47:19,398][60143] Updated weights for policy 0, policy_version 15802 (0.0009) +[2023-10-09 04:47:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 32538624. Throughput: 0: 1724.4, 1: 1713.2. Samples: 8137618. Policy #0 lag: (min: 2.0, avg: 2.0, max: 4.0) +[2023-10-09 04:47:21,053][59242] Avg episode reward: [(0, '22.930'), (1, '24.490')] +[2023-10-09 04:47:22,471][60144] Updated weights for policy 1, policy_version 15972 (0.0007) +[2023-10-09 04:47:22,840][60144] Updated weights for policy 1, policy_version 15982 (0.0007) +[2023-10-09 04:47:23,208][60144] Updated weights for policy 1, policy_version 15992 (0.0008) +[2023-10-09 04:47:23,356][60143] Updated weights for policy 0, policy_version 15812 (0.0008) +[2023-10-09 04:47:23,723][60143] Updated weights for policy 0, policy_version 15822 (0.0009) +[2023-10-09 04:47:24,095][60143] Updated weights for policy 0, policy_version 15832 (0.0008) +[2023-10-09 04:47:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 32604160. Throughput: 0: 1695.5, 1: 1713.1. Samples: 8157400. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 4.0) +[2023-10-09 04:47:26,053][59242] Avg episode reward: [(0, '21.950'), (1, '23.730')] +[2023-10-09 04:47:27,138][60144] Updated weights for policy 1, policy_version 16002 (0.0008) +[2023-10-09 04:47:27,495][60144] Updated weights for policy 1, policy_version 16012 (0.0008) +[2023-10-09 04:47:27,867][60144] Updated weights for policy 1, policy_version 16022 (0.0010) +[2023-10-09 04:47:28,054][60143] Updated weights for policy 0, policy_version 15842 (0.0009) +[2023-10-09 04:47:28,238][60144] Updated weights for policy 1, policy_version 16032 (0.0009) +[2023-10-09 04:47:28,424][60143] Updated weights for policy 0, policy_version 15852 (0.0007) +[2023-10-09 04:47:28,800][60143] Updated weights for policy 0, policy_version 15862 (0.0007) +[2023-10-09 04:47:29,169][60143] Updated weights for policy 0, policy_version 15872 (0.0008) +[2023-10-09 04:47:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 32669696. Throughput: 0: 1704.3, 1: 1740.4. Samples: 8178760. Policy #0 lag: (min: 2.0, avg: 2.0, max: 4.0) +[2023-10-09 04:47:31,053][59242] Avg episode reward: [(0, '22.060'), (1, '22.780')] +[2023-10-09 04:47:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000016032_16416768.pth... +[2023-10-09 04:47:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000015872_16252928.pth... 
+[2023-10-09 04:47:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000014272_14614528.pth +[2023-10-09 04:47:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000014432_14778368.pth +[2023-10-09 04:47:31,103][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000016032_16416768.pth +[2023-10-09 04:47:31,103][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000015872_16252928.pth +[2023-10-09 04:47:32,277][60144] Updated weights for policy 1, policy_version 16042 (0.0010) +[2023-10-09 04:47:32,657][60144] Updated weights for policy 1, policy_version 16052 (0.0008) +[2023-10-09 04:47:33,022][60144] Updated weights for policy 1, policy_version 16062 (0.0008) +[2023-10-09 04:47:33,132][60143] Updated weights for policy 0, policy_version 15882 (0.0009) +[2023-10-09 04:47:33,511][60143] Updated weights for policy 0, policy_version 15892 (0.0010) +[2023-10-09 04:47:33,882][60143] Updated weights for policy 0, policy_version 15902 (0.0008) +[2023-10-09 04:47:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 32735232. Throughput: 0: 1706.1, 1: 1705.9. Samples: 8188622. 
Policy #0 lag: (min: 0.0, avg: 18.4, max: 32.0) +[2023-10-09 04:47:36,053][59242] Avg episode reward: [(0, '22.420'), (1, '22.490')] +[2023-10-09 04:47:36,865][60144] Updated weights for policy 1, policy_version 16072 (0.0008) +[2023-10-09 04:47:37,237][60144] Updated weights for policy 1, policy_version 16082 (0.0007) +[2023-10-09 04:47:37,609][60144] Updated weights for policy 1, policy_version 16092 (0.0007) +[2023-10-09 04:47:37,873][60143] Updated weights for policy 0, policy_version 15912 (0.0008) +[2023-10-09 04:47:38,241][60143] Updated weights for policy 0, policy_version 15922 (0.0007) +[2023-10-09 04:47:38,612][60143] Updated weights for policy 0, policy_version 15932 (0.0007) +[2023-10-09 04:47:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 32800768. Throughput: 0: 1689.1, 1: 1728.0. Samples: 8209316. Policy #0 lag: (min: 0.0, avg: 18.4, max: 32.0) +[2023-10-09 04:47:41,052][59242] Avg episode reward: [(0, '22.590'), (1, '22.550')] +[2023-10-09 04:47:41,627][60144] Updated weights for policy 1, policy_version 16102 (0.0010) +[2023-10-09 04:47:41,999][60144] Updated weights for policy 1, policy_version 16112 (0.0010) +[2023-10-09 04:47:42,364][60144] Updated weights for policy 1, policy_version 16122 (0.0010) +[2023-10-09 04:47:42,579][60143] Updated weights for policy 0, policy_version 15942 (0.0008) +[2023-10-09 04:47:42,944][60143] Updated weights for policy 0, policy_version 15952 (0.0009) +[2023-10-09 04:47:43,314][60143] Updated weights for policy 0, policy_version 15962 (0.0010) +[2023-10-09 04:47:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 32866304. Throughput: 0: 1718.3, 1: 1745.3. Samples: 8230886. Policy #0 lag: (min: 0.0, avg: 18.4, max: 32.0) +[2023-10-09 04:47:46,053][59242] Avg episode reward: [(0, '23.940'), (1, '21.660')] +[2023-10-09 04:47:46,063][59934] Saving new best policy, reward=23.940! 
+[2023-10-09 04:47:46,245][60144] Updated weights for policy 1, policy_version 16132 (0.0008) +[2023-10-09 04:47:46,621][60144] Updated weights for policy 1, policy_version 16142 (0.0007) +[2023-10-09 04:47:46,985][60144] Updated weights for policy 1, policy_version 16152 (0.0008) +[2023-10-09 04:47:47,329][60143] Updated weights for policy 0, policy_version 15972 (0.0008) +[2023-10-09 04:47:47,717][60143] Updated weights for policy 0, policy_version 15982 (0.0008) +[2023-10-09 04:47:48,093][60143] Updated weights for policy 0, policy_version 15992 (0.0010) +[2023-10-09 04:47:50,918][60144] Updated weights for policy 1, policy_version 16162 (0.0009) +[2023-10-09 04:47:51,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 32931840. Throughput: 0: 1689.7, 1: 1719.9. Samples: 8240130. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:47:51,053][59242] Avg episode reward: [(0, '24.360'), (1, '21.250')] +[2023-10-09 04:47:51,055][59934] Saving new best policy, reward=24.360! +[2023-10-09 04:47:51,293][60144] Updated weights for policy 1, policy_version 16172 (0.0010) +[2023-10-09 04:47:51,654][60144] Updated weights for policy 1, policy_version 16182 (0.0007) +[2023-10-09 04:47:52,025][60144] Updated weights for policy 1, policy_version 16192 (0.0009) +[2023-10-09 04:47:52,039][60143] Updated weights for policy 0, policy_version 16002 (0.0009) +[2023-10-09 04:47:52,411][60143] Updated weights for policy 0, policy_version 16012 (0.0011) +[2023-10-09 04:47:52,788][60143] Updated weights for policy 0, policy_version 16022 (0.0011) +[2023-10-09 04:47:53,144][60143] Updated weights for policy 0, policy_version 16032 (0.0011) +[2023-10-09 04:47:55,902][60144] Updated weights for policy 1, policy_version 16202 (0.0010) +[2023-10-09 04:47:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 32997376. Throughput: 0: 1701.7, 1: 1750.8. Samples: 8261494. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:47:56,053][59242] Avg episode reward: [(0, '25.000'), (1, '22.380')] +[2023-10-09 04:47:56,054][59934] Saving new best policy, reward=25.000! +[2023-10-09 04:47:56,270][60144] Updated weights for policy 1, policy_version 16212 (0.0010) +[2023-10-09 04:47:56,640][60144] Updated weights for policy 1, policy_version 16222 (0.0010) +[2023-10-09 04:47:57,142][60143] Updated weights for policy 0, policy_version 16042 (0.0009) +[2023-10-09 04:47:57,511][60143] Updated weights for policy 0, policy_version 16052 (0.0009) +[2023-10-09 04:47:57,876][60143] Updated weights for policy 0, policy_version 16062 (0.0008) +[2023-10-09 04:48:00,556][60144] Updated weights for policy 1, policy_version 16232 (0.0010) +[2023-10-09 04:48:00,926][60144] Updated weights for policy 1, policy_version 16242 (0.0012) +[2023-10-09 04:48:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 33062912. Throughput: 0: 1716.4, 1: 1739.2. Samples: 8282560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:48:01,053][59242] Avg episode reward: [(0, '26.190'), (1, '21.630')] +[2023-10-09 04:48:01,061][59934] Saving new best policy, reward=26.190! +[2023-10-09 04:48:01,300][60144] Updated weights for policy 1, policy_version 16252 (0.0010) +[2023-10-09 04:48:01,788][60143] Updated weights for policy 0, policy_version 16072 (0.0010) +[2023-10-09 04:48:02,152][60143] Updated weights for policy 0, policy_version 16082 (0.0009) +[2023-10-09 04:48:02,523][60143] Updated weights for policy 0, policy_version 16092 (0.0008) +[2023-10-09 04:48:05,079][60144] Updated weights for policy 1, policy_version 16262 (0.0009) +[2023-10-09 04:48:05,458][60144] Updated weights for policy 1, policy_version 16272 (0.0011) +[2023-10-09 04:48:05,815][60144] Updated weights for policy 1, policy_version 16282 (0.0010) +[2023-10-09 04:48:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). 
Total num frames: 33161216. Throughput: 0: 1688.3, 1: 1744.1. Samples: 8292076. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:48:06,053][59242] Avg episode reward: [(0, '26.640'), (1, '20.370')] +[2023-10-09 04:48:06,053][59934] Saving new best policy, reward=26.640! +[2023-10-09 04:48:06,673][60143] Updated weights for policy 0, policy_version 16102 (0.0008) +[2023-10-09 04:48:07,045][60143] Updated weights for policy 0, policy_version 16112 (0.0008) +[2023-10-09 04:48:07,421][60143] Updated weights for policy 0, policy_version 16122 (0.0008) +[2023-10-09 04:48:09,689][60144] Updated weights for policy 1, policy_version 16292 (0.0009) +[2023-10-09 04:48:10,049][60144] Updated weights for policy 1, policy_version 16302 (0.0007) +[2023-10-09 04:48:10,425][60144] Updated weights for policy 1, policy_version 16312 (0.0008) +[2023-10-09 04:48:11,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 33226752. Throughput: 0: 1714.3, 1: 1750.1. Samples: 8313298. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:48:11,053][59242] Avg episode reward: [(0, '27.050'), (1, '21.220')] +[2023-10-09 04:48:11,055][59934] Saving new best policy, reward=27.050! +[2023-10-09 04:48:11,584][60143] Updated weights for policy 0, policy_version 16132 (0.0008) +[2023-10-09 04:48:11,961][60143] Updated weights for policy 0, policy_version 16142 (0.0009) +[2023-10-09 04:48:12,327][60143] Updated weights for policy 0, policy_version 16152 (0.0010) +[2023-10-09 04:48:14,446][60144] Updated weights for policy 1, policy_version 16322 (0.0008) +[2023-10-09 04:48:14,808][60144] Updated weights for policy 1, policy_version 16332 (0.0009) +[2023-10-09 04:48:15,177][60144] Updated weights for policy 1, policy_version 16342 (0.0008) +[2023-10-09 04:48:15,529][60144] Updated weights for policy 1, policy_version 16352 (0.0008) +[2023-10-09 04:48:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). 
Total num frames: 33292288. Throughput: 0: 1714.4, 1: 1721.2. Samples: 8333362. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-09 04:48:16,053][59242] Avg episode reward: [(0, '27.040'), (1, '21.300')] +[2023-10-09 04:48:16,305][60143] Updated weights for policy 0, policy_version 16162 (0.0010) +[2023-10-09 04:48:16,667][60143] Updated weights for policy 0, policy_version 16172 (0.0008) +[2023-10-09 04:48:17,046][60143] Updated weights for policy 0, policy_version 16182 (0.0010) +[2023-10-09 04:48:17,426][60143] Updated weights for policy 0, policy_version 16192 (0.0008) +[2023-10-09 04:48:19,404][60144] Updated weights for policy 1, policy_version 16362 (0.0008) +[2023-10-09 04:48:19,783][60144] Updated weights for policy 1, policy_version 16372 (0.0007) +[2023-10-09 04:48:20,152][60144] Updated weights for policy 1, policy_version 16382 (0.0007) +[2023-10-09 04:48:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 33357824. Throughput: 0: 1698.9, 1: 1756.8. Samples: 8344128. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-09 04:48:21,053][59242] Avg episode reward: [(0, '26.200'), (1, '21.210')] +[2023-10-09 04:48:21,258][60143] Updated weights for policy 0, policy_version 16202 (0.0010) +[2023-10-09 04:48:21,612][60143] Updated weights for policy 0, policy_version 16212 (0.0009) +[2023-10-09 04:48:21,986][60143] Updated weights for policy 0, policy_version 16222 (0.0007) +[2023-10-09 04:48:24,020][60144] Updated weights for policy 1, policy_version 16392 (0.0007) +[2023-10-09 04:48:24,396][60144] Updated weights for policy 1, policy_version 16402 (0.0010) +[2023-10-09 04:48:24,773][60144] Updated weights for policy 1, policy_version 16412 (0.0010) +[2023-10-09 04:48:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 33423360. Throughput: 0: 1710.0, 1: 1739.5. Samples: 8364542. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-09 04:48:26,053][59242] Avg episode reward: [(0, '25.410'), (1, '21.280')] +[2023-10-09 04:48:26,081][60143] Updated weights for policy 0, policy_version 16232 (0.0008) +[2023-10-09 04:48:26,445][60143] Updated weights for policy 0, policy_version 16242 (0.0007) +[2023-10-09 04:48:26,812][60143] Updated weights for policy 0, policy_version 16252 (0.0008) +[2023-10-09 04:48:28,812][60144] Updated weights for policy 1, policy_version 16422 (0.0008) +[2023-10-09 04:48:29,178][60144] Updated weights for policy 1, policy_version 16432 (0.0008) +[2023-10-09 04:48:29,549][60144] Updated weights for policy 1, policy_version 16442 (0.0007) +[2023-10-09 04:48:30,853][60143] Updated weights for policy 0, policy_version 16262 (0.0011) +[2023-10-09 04:48:31,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 33488896. Throughput: 0: 1709.3, 1: 1720.7. Samples: 8385236. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-09 04:48:31,053][59242] Avg episode reward: [(0, '23.700'), (1, '20.350')] +[2023-10-09 04:48:31,226][60143] Updated weights for policy 0, policy_version 16272 (0.0008) +[2023-10-09 04:48:31,600][60143] Updated weights for policy 0, policy_version 16282 (0.0007) +[2023-10-09 04:48:33,268][60144] Updated weights for policy 1, policy_version 16452 (0.0008) +[2023-10-09 04:48:33,642][60144] Updated weights for policy 1, policy_version 16462 (0.0007) +[2023-10-09 04:48:34,016][60144] Updated weights for policy 1, policy_version 16472 (0.0010) +[2023-10-09 04:48:35,532][60143] Updated weights for policy 0, policy_version 16292 (0.0008) +[2023-10-09 04:48:35,918][60143] Updated weights for policy 0, policy_version 16302 (0.0010) +[2023-10-09 04:48:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 33554432. Throughput: 0: 1710.7, 1: 1742.9. Samples: 8395542. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 33.0) +[2023-10-09 04:48:36,052][59242] Avg episode reward: [(0, '23.990'), (1, '20.980')] +[2023-10-09 04:48:36,292][60143] Updated weights for policy 0, policy_version 16312 (0.0011) +[2023-10-09 04:48:37,951][60144] Updated weights for policy 1, policy_version 16482 (0.0008) +[2023-10-09 04:48:38,309][60144] Updated weights for policy 1, policy_version 16492 (0.0009) +[2023-10-09 04:48:38,673][60144] Updated weights for policy 1, policy_version 16502 (0.0008) +[2023-10-09 04:48:39,047][60144] Updated weights for policy 1, policy_version 16512 (0.0007) +[2023-10-09 04:48:40,052][60143] Updated weights for policy 0, policy_version 16322 (0.0009) +[2023-10-09 04:48:40,429][60143] Updated weights for policy 0, policy_version 16332 (0.0009) +[2023-10-09 04:48:40,802][60143] Updated weights for policy 0, policy_version 16342 (0.0007) +[2023-10-09 04:48:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 33619968. Throughput: 0: 1711.2, 1: 1718.8. Samples: 8415840. Policy #0 lag: (min: 27.0, avg: 27.1, max: 33.0) +[2023-10-09 04:48:41,053][59242] Avg episode reward: [(0, '24.610'), (1, '21.300')] +[2023-10-09 04:48:41,166][60143] Updated weights for policy 0, policy_version 16352 (0.0008) +[2023-10-09 04:48:43,001][60144] Updated weights for policy 1, policy_version 16522 (0.0010) +[2023-10-09 04:48:43,368][60144] Updated weights for policy 1, policy_version 16532 (0.0011) +[2023-10-09 04:48:43,732][60144] Updated weights for policy 1, policy_version 16542 (0.0009) +[2023-10-09 04:48:45,257][60143] Updated weights for policy 0, policy_version 16362 (0.0008) +[2023-10-09 04:48:45,627][60143] Updated weights for policy 0, policy_version 16372 (0.0008) +[2023-10-09 04:48:45,997][60143] Updated weights for policy 0, policy_version 16382 (0.0009) +[2023-10-09 04:48:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 33685504. 
Throughput: 0: 1695.5, 1: 1722.0. Samples: 8436348. Policy #0 lag: (min: 27.0, avg: 27.1, max: 33.0) +[2023-10-09 04:48:46,053][59242] Avg episode reward: [(0, '25.050'), (1, '21.090')] +[2023-10-09 04:48:47,778][60144] Updated weights for policy 1, policy_version 16552 (0.0009) +[2023-10-09 04:48:48,150][60144] Updated weights for policy 1, policy_version 16562 (0.0007) +[2023-10-09 04:48:48,517][60144] Updated weights for policy 1, policy_version 16572 (0.0007) +[2023-10-09 04:48:50,132][60143] Updated weights for policy 0, policy_version 16392 (0.0011) +[2023-10-09 04:48:50,499][60143] Updated weights for policy 0, policy_version 16402 (0.0009) +[2023-10-09 04:48:50,872][60143] Updated weights for policy 0, policy_version 16412 (0.0008) +[2023-10-09 04:48:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 33783808. Throughput: 0: 1712.1, 1: 1717.1. Samples: 8446392. Policy #0 lag: (min: 8.0, avg: 26.0, max: 40.0) +[2023-10-09 04:48:51,053][59242] Avg episode reward: [(0, '25.850'), (1, '20.860')] +[2023-10-09 04:48:52,363][60144] Updated weights for policy 1, policy_version 16582 (0.0011) +[2023-10-09 04:48:52,734][60144] Updated weights for policy 1, policy_version 16592 (0.0008) +[2023-10-09 04:48:53,108][60144] Updated weights for policy 1, policy_version 16602 (0.0008) +[2023-10-09 04:48:54,715][60143] Updated weights for policy 0, policy_version 16422 (0.0007) +[2023-10-09 04:48:55,082][60143] Updated weights for policy 0, policy_version 16432 (0.0009) +[2023-10-09 04:48:55,456][60143] Updated weights for policy 0, policy_version 16442 (0.0009) +[2023-10-09 04:48:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 33849344. Throughput: 0: 1719.3, 1: 1711.5. Samples: 8467682. 
Policy #0 lag: (min: 8.0, avg: 26.0, max: 40.0) +[2023-10-09 04:48:56,053][59242] Avg episode reward: [(0, '25.430'), (1, '21.600')] +[2023-10-09 04:48:57,024][60144] Updated weights for policy 1, policy_version 16612 (0.0008) +[2023-10-09 04:48:57,385][60144] Updated weights for policy 1, policy_version 16622 (0.0007) +[2023-10-09 04:48:57,747][60144] Updated weights for policy 1, policy_version 16632 (0.0008) +[2023-10-09 04:48:59,321][60143] Updated weights for policy 0, policy_version 16452 (0.0008) +[2023-10-09 04:48:59,694][60143] Updated weights for policy 0, policy_version 16462 (0.0009) +[2023-10-09 04:49:00,054][60143] Updated weights for policy 0, policy_version 16472 (0.0008) +[2023-10-09 04:49:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 33914880. Throughput: 0: 1688.1, 1: 1743.4. Samples: 8487780. Policy #0 lag: (min: 6.0, avg: 12.2, max: 38.0) +[2023-10-09 04:49:01,053][59242] Avg episode reward: [(0, '24.310'), (1, '21.620')] +[2023-10-09 04:49:01,697][60144] Updated weights for policy 1, policy_version 16642 (0.0007) +[2023-10-09 04:49:02,071][60144] Updated weights for policy 1, policy_version 16652 (0.0008) +[2023-10-09 04:49:02,429][60144] Updated weights for policy 1, policy_version 16662 (0.0008) +[2023-10-09 04:49:02,790][60144] Updated weights for policy 1, policy_version 16672 (0.0010) +[2023-10-09 04:49:04,283][60143] Updated weights for policy 0, policy_version 16482 (0.0011) +[2023-10-09 04:49:04,658][60143] Updated weights for policy 0, policy_version 16492 (0.0009) +[2023-10-09 04:49:05,035][60143] Updated weights for policy 0, policy_version 16502 (0.0007) +[2023-10-09 04:49:05,400][60143] Updated weights for policy 0, policy_version 16512 (0.0007) +[2023-10-09 04:49:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 33980416. Throughput: 0: 1716.6, 1: 1708.7. Samples: 8498266. 
Policy #0 lag: (min: 6.0, avg: 12.2, max: 38.0) +[2023-10-09 04:49:06,053][59242] Avg episode reward: [(0, '23.650'), (1, '22.120')] +[2023-10-09 04:49:06,889][60144] Updated weights for policy 1, policy_version 16682 (0.0008) +[2023-10-09 04:49:07,252][60144] Updated weights for policy 1, policy_version 16692 (0.0009) +[2023-10-09 04:49:07,624][60144] Updated weights for policy 1, policy_version 16702 (0.0008) +[2023-10-09 04:49:09,614][60143] Updated weights for policy 0, policy_version 16522 (0.0009) +[2023-10-09 04:49:09,987][60143] Updated weights for policy 0, policy_version 16532 (0.0009) +[2023-10-09 04:49:10,360][60143] Updated weights for policy 0, policy_version 16542 (0.0007) +[2023-10-09 04:49:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 34045952. Throughput: 0: 1703.5, 1: 1721.2. Samples: 8518650. Policy #0 lag: (min: 6.0, avg: 12.2, max: 38.0) +[2023-10-09 04:49:11,052][59242] Avg episode reward: [(0, '22.760'), (1, '22.220')] +[2023-10-09 04:49:11,633][60144] Updated weights for policy 1, policy_version 16712 (0.0009) +[2023-10-09 04:49:11,996][60144] Updated weights for policy 1, policy_version 16722 (0.0010) +[2023-10-09 04:49:12,371][60144] Updated weights for policy 1, policy_version 16732 (0.0009) +[2023-10-09 04:49:14,321][60143] Updated weights for policy 0, policy_version 16552 (0.0007) +[2023-10-09 04:49:14,693][60143] Updated weights for policy 0, policy_version 16562 (0.0008) +[2023-10-09 04:49:15,064][60143] Updated weights for policy 0, policy_version 16572 (0.0007) +[2023-10-09 04:49:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 34111488. Throughput: 0: 1681.8, 1: 1737.6. Samples: 8539110. 
Policy #0 lag: (min: 9.0, avg: 29.7, max: 41.0) +[2023-10-09 04:49:16,053][59242] Avg episode reward: [(0, '22.950'), (1, '22.010')] +[2023-10-09 04:49:16,240][60144] Updated weights for policy 1, policy_version 16742 (0.0009) +[2023-10-09 04:49:16,608][60144] Updated weights for policy 1, policy_version 16752 (0.0007) +[2023-10-09 04:49:16,973][60144] Updated weights for policy 1, policy_version 16762 (0.0009) +[2023-10-09 04:49:18,910][60143] Updated weights for policy 0, policy_version 16582 (0.0007) +[2023-10-09 04:49:19,283][60143] Updated weights for policy 0, policy_version 16592 (0.0009) +[2023-10-09 04:49:19,654][60143] Updated weights for policy 0, policy_version 16602 (0.0011) +[2023-10-09 04:49:20,788][60144] Updated weights for policy 1, policy_version 16772 (0.0008) +[2023-10-09 04:49:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 34177024. Throughput: 0: 1713.2, 1: 1715.9. Samples: 8549854. Policy #0 lag: (min: 9.0, avg: 29.7, max: 41.0) +[2023-10-09 04:49:21,053][59242] Avg episode reward: [(0, '22.980'), (1, '22.170')] +[2023-10-09 04:49:21,147][60144] Updated weights for policy 1, policy_version 16782 (0.0008) +[2023-10-09 04:49:21,523][60144] Updated weights for policy 1, policy_version 16792 (0.0010) +[2023-10-09 04:49:23,794][60143] Updated weights for policy 0, policy_version 16612 (0.0008) +[2023-10-09 04:49:24,185][60143] Updated weights for policy 0, policy_version 16622 (0.0008) +[2023-10-09 04:49:24,560][60143] Updated weights for policy 0, policy_version 16632 (0.0010) +[2023-10-09 04:49:25,406][60144] Updated weights for policy 1, policy_version 16802 (0.0010) +[2023-10-09 04:49:25,769][60144] Updated weights for policy 1, policy_version 16812 (0.0008) +[2023-10-09 04:49:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34242560. Throughput: 0: 1690.0, 1: 1739.3. Samples: 8570160. 
Policy #0 lag: (min: 9.0, avg: 29.7, max: 41.0) +[2023-10-09 04:49:26,053][59242] Avg episode reward: [(0, '23.850'), (1, '21.510')] +[2023-10-09 04:49:26,143][60144] Updated weights for policy 1, policy_version 16822 (0.0008) +[2023-10-09 04:49:26,515][60144] Updated weights for policy 1, policy_version 16832 (0.0009) +[2023-10-09 04:49:28,505][60143] Updated weights for policy 0, policy_version 16642 (0.0008) +[2023-10-09 04:49:28,877][60143] Updated weights for policy 0, policy_version 16652 (0.0009) +[2023-10-09 04:49:29,248][60143] Updated weights for policy 0, policy_version 16662 (0.0009) +[2023-10-09 04:49:29,631][60143] Updated weights for policy 0, policy_version 16672 (0.0007) +[2023-10-09 04:49:30,530][60144] Updated weights for policy 1, policy_version 16842 (0.0011) +[2023-10-09 04:49:30,892][60144] Updated weights for policy 1, policy_version 16852 (0.0011) +[2023-10-09 04:49:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34308096. Throughput: 0: 1693.4, 1: 1731.6. Samples: 8590470. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:49:31,053][59242] Avg episode reward: [(0, '23.030'), (1, '21.100')] +[2023-10-09 04:49:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000016672_17072128.pth... +[2023-10-09 04:49:31,091][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000015072_15433728.pth +[2023-10-09 04:49:31,263][60144] Updated weights for policy 1, policy_version 16862 (0.0010) +[2023-10-09 04:49:31,329][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000016864_17268736.pth... 
+[2023-10-09 04:49:31,358][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000015232_15597568.pth +[2023-10-09 04:49:33,474][60143] Updated weights for policy 0, policy_version 16682 (0.0007) +[2023-10-09 04:49:33,853][60143] Updated weights for policy 0, policy_version 16692 (0.0008) +[2023-10-09 04:49:34,220][60143] Updated weights for policy 0, policy_version 16702 (0.0010) +[2023-10-09 04:49:35,172][60144] Updated weights for policy 1, policy_version 16872 (0.0009) +[2023-10-09 04:49:35,544][60144] Updated weights for policy 1, policy_version 16882 (0.0010) +[2023-10-09 04:49:35,918][60144] Updated weights for policy 1, policy_version 16892 (0.0010) +[2023-10-09 04:49:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34373632. Throughput: 0: 1701.1, 1: 1739.1. Samples: 8601198. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:49:36,053][59242] Avg episode reward: [(0, '23.480'), (1, '21.810')] +[2023-10-09 04:49:38,171][60143] Updated weights for policy 0, policy_version 16712 (0.0007) +[2023-10-09 04:49:38,547][60143] Updated weights for policy 0, policy_version 16722 (0.0007) +[2023-10-09 04:49:38,923][60143] Updated weights for policy 0, policy_version 16732 (0.0008) +[2023-10-09 04:49:39,825][60144] Updated weights for policy 1, policy_version 16902 (0.0008) +[2023-10-09 04:49:40,195][60144] Updated weights for policy 1, policy_version 16912 (0.0008) +[2023-10-09 04:49:40,558][60144] Updated weights for policy 1, policy_version 16922 (0.0007) +[2023-10-09 04:49:41,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 34471936. Throughput: 0: 1675.2, 1: 1744.8. Samples: 8621582. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:49:41,053][59242] Avg episode reward: [(0, '23.890'), (1, '22.150')] +[2023-10-09 04:49:42,832][60143] Updated weights for policy 0, policy_version 16742 (0.0007) +[2023-10-09 04:49:43,202][60143] Updated weights for policy 0, policy_version 16752 (0.0007) +[2023-10-09 04:49:43,573][60143] Updated weights for policy 0, policy_version 16762 (0.0007) +[2023-10-09 04:49:44,550][60144] Updated weights for policy 1, policy_version 16932 (0.0009) +[2023-10-09 04:49:44,928][60144] Updated weights for policy 1, policy_version 16942 (0.0008) +[2023-10-09 04:49:45,286][60144] Updated weights for policy 1, policy_version 16952 (0.0008) +[2023-10-09 04:49:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 34537472. Throughput: 0: 1711.4, 1: 1712.3. Samples: 8641846. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:49:46,053][59242] Avg episode reward: [(0, '23.700'), (1, '22.910')] +[2023-10-09 04:49:47,738][60143] Updated weights for policy 0, policy_version 16772 (0.0010) +[2023-10-09 04:49:48,111][60143] Updated weights for policy 0, policy_version 16782 (0.0009) +[2023-10-09 04:49:48,481][60143] Updated weights for policy 0, policy_version 16792 (0.0008) +[2023-10-09 04:49:49,235][60144] Updated weights for policy 1, policy_version 16962 (0.0009) +[2023-10-09 04:49:49,592][60144] Updated weights for policy 1, policy_version 16972 (0.0007) +[2023-10-09 04:49:49,964][60144] Updated weights for policy 1, policy_version 16982 (0.0007) +[2023-10-09 04:49:50,339][60144] Updated weights for policy 1, policy_version 16992 (0.0008) +[2023-10-09 04:49:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34603008. Throughput: 0: 1691.4, 1: 1742.4. Samples: 8652788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:49:51,052][59242] Avg episode reward: [(0, '23.360'), (1, '22.180')] +[2023-10-09 04:49:52,503][60143] Updated weights for policy 0, policy_version 16802 (0.0007) +[2023-10-09 04:49:52,884][60143] Updated weights for policy 0, policy_version 16812 (0.0007) +[2023-10-09 04:49:53,253][60143] Updated weights for policy 0, policy_version 16822 (0.0007) +[2023-10-09 04:49:53,630][60143] Updated weights for policy 0, policy_version 16832 (0.0008) +[2023-10-09 04:49:54,415][60144] Updated weights for policy 1, policy_version 17002 (0.0008) +[2023-10-09 04:49:54,780][60144] Updated weights for policy 1, policy_version 17012 (0.0008) +[2023-10-09 04:49:55,154][60144] Updated weights for policy 1, policy_version 17022 (0.0007) +[2023-10-09 04:49:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34668544. Throughput: 0: 1699.8, 1: 1734.8. Samples: 8673206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:49:56,053][59242] Avg episode reward: [(0, '23.570'), (1, '22.330')] +[2023-10-09 04:49:57,637][60143] Updated weights for policy 0, policy_version 16842 (0.0008) +[2023-10-09 04:49:57,999][60143] Updated weights for policy 0, policy_version 16852 (0.0008) +[2023-10-09 04:49:58,383][60143] Updated weights for policy 0, policy_version 16862 (0.0010) +[2023-10-09 04:49:58,871][60144] Updated weights for policy 1, policy_version 17032 (0.0010) +[2023-10-09 04:49:59,245][60144] Updated weights for policy 1, policy_version 17042 (0.0009) +[2023-10-09 04:49:59,621][60144] Updated weights for policy 1, policy_version 17052 (0.0008) +[2023-10-09 04:50:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 34734080. Throughput: 0: 1717.5, 1: 1716.4. Samples: 8693636. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) +[2023-10-09 04:50:01,052][59242] Avg episode reward: [(0, '23.850'), (1, '23.250')] +[2023-10-09 04:50:02,423][60143] Updated weights for policy 0, policy_version 16872 (0.0008) +[2023-10-09 04:50:02,795][60143] Updated weights for policy 0, policy_version 16882 (0.0007) +[2023-10-09 04:50:03,160][60143] Updated weights for policy 0, policy_version 16892 (0.0008) +[2023-10-09 04:50:03,292][60144] Updated weights for policy 1, policy_version 17062 (0.0007) +[2023-10-09 04:50:03,655][60144] Updated weights for policy 1, policy_version 17072 (0.0010) +[2023-10-09 04:50:04,018][60144] Updated weights for policy 1, policy_version 17082 (0.0011) +[2023-10-09 04:50:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34799616. Throughput: 0: 1686.9, 1: 1734.6. Samples: 8703822. Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) +[2023-10-09 04:50:06,053][59242] Avg episode reward: [(0, '23.760'), (1, '22.900')] +[2023-10-09 04:50:07,114][60143] Updated weights for policy 0, policy_version 16902 (0.0008) +[2023-10-09 04:50:07,484][60143] Updated weights for policy 0, policy_version 16912 (0.0007) +[2023-10-09 04:50:07,851][60143] Updated weights for policy 0, policy_version 16922 (0.0009) +[2023-10-09 04:50:08,121][60144] Updated weights for policy 1, policy_version 17092 (0.0008) +[2023-10-09 04:50:08,488][60144] Updated weights for policy 1, policy_version 17102 (0.0007) +[2023-10-09 04:50:08,853][60144] Updated weights for policy 1, policy_version 17112 (0.0008) +[2023-10-09 04:50:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 34865152. Throughput: 0: 1709.5, 1: 1713.6. Samples: 8724202. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 54.0) +[2023-10-09 04:50:11,053][59242] Avg episode reward: [(0, '24.000'), (1, '23.560')] +[2023-10-09 04:50:11,724][60143] Updated weights for policy 0, policy_version 16932 (0.0009) +[2023-10-09 04:50:12,128][60143] Updated weights for policy 0, policy_version 16942 (0.0009) +[2023-10-09 04:50:12,496][60143] Updated weights for policy 0, policy_version 16952 (0.0008) +[2023-10-09 04:50:12,796][60144] Updated weights for policy 1, policy_version 17122 (0.0008) +[2023-10-09 04:50:13,161][60144] Updated weights for policy 1, policy_version 17132 (0.0010) +[2023-10-09 04:50:13,528][60144] Updated weights for policy 1, policy_version 17142 (0.0009) +[2023-10-09 04:50:13,900][60144] Updated weights for policy 1, policy_version 17152 (0.0008) +[2023-10-09 04:50:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 34930688. Throughput: 0: 1718.9, 1: 1729.4. Samples: 8745646. Policy #0 lag: (min: 12.0, avg: 12.9, max: 29.0) +[2023-10-09 04:50:16,053][59242] Avg episode reward: [(0, '23.970'), (1, '22.740')] +[2023-10-09 04:50:16,420][60143] Updated weights for policy 0, policy_version 16962 (0.0007) +[2023-10-09 04:50:16,795][60143] Updated weights for policy 0, policy_version 16972 (0.0008) +[2023-10-09 04:50:17,169][60143] Updated weights for policy 0, policy_version 16982 (0.0009) +[2023-10-09 04:50:17,532][60143] Updated weights for policy 0, policy_version 16992 (0.0007) +[2023-10-09 04:50:17,761][60144] Updated weights for policy 1, policy_version 17162 (0.0007) +[2023-10-09 04:50:18,126][60144] Updated weights for policy 1, policy_version 17172 (0.0008) +[2023-10-09 04:50:18,492][60144] Updated weights for policy 1, policy_version 17182 (0.0010) +[2023-10-09 04:50:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 34996224. Throughput: 0: 1694.6, 1: 1720.0. Samples: 8754852. 
Policy #0 lag: (min: 12.0, avg: 12.9, max: 29.0) +[2023-10-09 04:50:21,053][59242] Avg episode reward: [(0, '24.180'), (1, '23.470')] +[2023-10-09 04:50:21,426][60143] Updated weights for policy 0, policy_version 17002 (0.0010) +[2023-10-09 04:50:21,802][60143] Updated weights for policy 0, policy_version 17012 (0.0009) +[2023-10-09 04:50:22,169][60143] Updated weights for policy 0, policy_version 17022 (0.0008) +[2023-10-09 04:50:22,404][60144] Updated weights for policy 1, policy_version 17192 (0.0009) +[2023-10-09 04:50:22,767][60144] Updated weights for policy 1, policy_version 17202 (0.0011) +[2023-10-09 04:50:23,134][60144] Updated weights for policy 1, policy_version 17212 (0.0009) +[2023-10-09 04:50:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 35061760. Throughput: 0: 1721.9, 1: 1715.1. Samples: 8776246. Policy #0 lag: (min: 12.0, avg: 12.9, max: 29.0) +[2023-10-09 04:50:26,053][59242] Avg episode reward: [(0, '24.550'), (1, '23.640')] +[2023-10-09 04:50:26,161][60143] Updated weights for policy 0, policy_version 17032 (0.0011) +[2023-10-09 04:50:26,526][60143] Updated weights for policy 0, policy_version 17042 (0.0010) +[2023-10-09 04:50:26,901][60143] Updated weights for policy 0, policy_version 17052 (0.0009) +[2023-10-09 04:50:27,103][60144] Updated weights for policy 1, policy_version 17222 (0.0008) +[2023-10-09 04:50:27,466][60144] Updated weights for policy 1, policy_version 17232 (0.0008) +[2023-10-09 04:50:27,826][60144] Updated weights for policy 1, policy_version 17242 (0.0008) +[2023-10-09 04:50:30,907][60143] Updated weights for policy 0, policy_version 17062 (0.0008) +[2023-10-09 04:50:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 35127296. Throughput: 0: 1712.6, 1: 1747.9. Samples: 8797566. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:50:31,053][59242] Avg episode reward: [(0, '24.650'), (1, '23.170')] +[2023-10-09 04:50:31,277][60143] Updated weights for policy 0, policy_version 17072 (0.0007) +[2023-10-09 04:50:31,654][60143] Updated weights for policy 0, policy_version 17082 (0.0008) +[2023-10-09 04:50:31,751][60144] Updated weights for policy 1, policy_version 17252 (0.0009) +[2023-10-09 04:50:32,124][60144] Updated weights for policy 1, policy_version 17262 (0.0009) +[2023-10-09 04:50:32,500][60144] Updated weights for policy 1, policy_version 17272 (0.0009) +[2023-10-09 04:50:35,687][60143] Updated weights for policy 0, policy_version 17092 (0.0010) +[2023-10-09 04:50:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 35192832. Throughput: 0: 1706.6, 1: 1720.8. Samples: 8807020. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:50:36,053][59242] Avg episode reward: [(0, '24.180'), (1, '23.190')] +[2023-10-09 04:50:36,062][60143] Updated weights for policy 0, policy_version 17102 (0.0007) +[2023-10-09 04:50:36,379][60144] Updated weights for policy 1, policy_version 17282 (0.0008) +[2023-10-09 04:50:36,435][60143] Updated weights for policy 0, policy_version 17112 (0.0008) +[2023-10-09 04:50:36,752][60144] Updated weights for policy 1, policy_version 17292 (0.0008) +[2023-10-09 04:50:37,110][60144] Updated weights for policy 1, policy_version 17302 (0.0009) +[2023-10-09 04:50:37,482][60144] Updated weights for policy 1, policy_version 17312 (0.0007) +[2023-10-09 04:50:40,384][60143] Updated weights for policy 0, policy_version 17122 (0.0008) +[2023-10-09 04:50:40,749][60143] Updated weights for policy 0, policy_version 17132 (0.0009) +[2023-10-09 04:50:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 35258368. Throughput: 0: 1709.7, 1: 1733.7. Samples: 8828158. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 04:50:41,052][59242] Avg episode reward: [(0, '24.900'), (1, '23.590')] +[2023-10-09 04:50:41,129][60143] Updated weights for policy 0, policy_version 17142 (0.0008) +[2023-10-09 04:50:41,497][60143] Updated weights for policy 0, policy_version 17152 (0.0009) +[2023-10-09 04:50:41,589][60144] Updated weights for policy 1, policy_version 17322 (0.0008) +[2023-10-09 04:50:41,970][60144] Updated weights for policy 1, policy_version 17332 (0.0011) +[2023-10-09 04:50:42,327][60144] Updated weights for policy 1, policy_version 17342 (0.0009) +[2023-10-09 04:50:45,423][60143] Updated weights for policy 0, policy_version 17162 (0.0008) +[2023-10-09 04:50:45,801][60143] Updated weights for policy 0, policy_version 17172 (0.0008) +[2023-10-09 04:50:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 35323904. Throughput: 0: 1709.2, 1: 1740.1. Samples: 8848852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:50:46,052][59242] Avg episode reward: [(0, '23.600'), (1, '23.620')] +[2023-10-09 04:50:46,167][60143] Updated weights for policy 0, policy_version 17182 (0.0008) +[2023-10-09 04:50:46,251][60144] Updated weights for policy 1, policy_version 17352 (0.0007) +[2023-10-09 04:50:46,615][60144] Updated weights for policy 1, policy_version 17362 (0.0007) +[2023-10-09 04:50:46,991][60144] Updated weights for policy 1, policy_version 17372 (0.0009) +[2023-10-09 04:50:50,248][60143] Updated weights for policy 0, policy_version 17192 (0.0008) +[2023-10-09 04:50:50,628][60143] Updated weights for policy 0, policy_version 17202 (0.0008) +[2023-10-09 04:50:50,997][60143] Updated weights for policy 0, policy_version 17212 (0.0007) +[2023-10-09 04:50:51,005][60144] Updated weights for policy 1, policy_version 17382 (0.0009) +[2023-10-09 04:50:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 35389440. 
Throughput: 0: 1721.2, 1: 1720.2. Samples: 8858682. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:50:51,053][59242] Avg episode reward: [(0, '23.740'), (1, '24.230')] +[2023-10-09 04:50:51,379][60144] Updated weights for policy 1, policy_version 17392 (0.0009) +[2023-10-09 04:50:51,741][60144] Updated weights for policy 1, policy_version 17402 (0.0008) +[2023-10-09 04:50:54,975][60143] Updated weights for policy 0, policy_version 17222 (0.0009) +[2023-10-09 04:50:55,357][60143] Updated weights for policy 0, policy_version 17232 (0.0010) +[2023-10-09 04:50:55,663][60144] Updated weights for policy 1, policy_version 17412 (0.0008) +[2023-10-09 04:50:55,730][60143] Updated weights for policy 0, policy_version 17242 (0.0007) +[2023-10-09 04:50:56,025][60144] Updated weights for policy 1, policy_version 17422 (0.0009) +[2023-10-09 04:50:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 35487744. Throughput: 0: 1719.1, 1: 1740.4. Samples: 8879876. Policy #0 lag: (min: 26.0, avg: 28.1, max: 57.0) +[2023-10-09 04:50:56,053][59242] Avg episode reward: [(0, '23.500'), (1, '24.200')] +[2023-10-09 04:50:56,399][60144] Updated weights for policy 1, policy_version 17432 (0.0008) +[2023-10-09 04:50:59,850][60143] Updated weights for policy 0, policy_version 17252 (0.0008) +[2023-10-09 04:51:00,247][60143] Updated weights for policy 0, policy_version 17262 (0.0010) +[2023-10-09 04:51:00,500][60144] Updated weights for policy 1, policy_version 17442 (0.0010) +[2023-10-09 04:51:00,608][60143] Updated weights for policy 0, policy_version 17272 (0.0008) +[2023-10-09 04:51:00,863][60144] Updated weights for policy 1, policy_version 17452 (0.0007) +[2023-10-09 04:51:01,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 35553280. Throughput: 0: 1695.1, 1: 1725.2. Samples: 8899558. 
Policy #0 lag: (min: 26.0, avg: 28.1, max: 57.0) +[2023-10-09 04:51:01,052][59242] Avg episode reward: [(0, '23.250'), (1, '24.050')] +[2023-10-09 04:51:01,235][60144] Updated weights for policy 1, policy_version 17462 (0.0007) +[2023-10-09 04:51:01,596][60144] Updated weights for policy 1, policy_version 17472 (0.0009) +[2023-10-09 04:51:04,407][60143] Updated weights for policy 0, policy_version 17282 (0.0007) +[2023-10-09 04:51:04,778][60143] Updated weights for policy 0, policy_version 17292 (0.0008) +[2023-10-09 04:51:05,155][60143] Updated weights for policy 0, policy_version 17302 (0.0008) +[2023-10-09 04:51:05,519][60143] Updated weights for policy 0, policy_version 17312 (0.0010) +[2023-10-09 04:51:05,613][60144] Updated weights for policy 1, policy_version 17482 (0.0007) +[2023-10-09 04:51:05,981][60144] Updated weights for policy 1, policy_version 17492 (0.0008) +[2023-10-09 04:51:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 35618816. Throughput: 0: 1720.6, 1: 1727.1. Samples: 8910000. Policy #0 lag: (min: 26.0, avg: 28.1, max: 57.0) +[2023-10-09 04:51:06,053][59242] Avg episode reward: [(0, '23.030'), (1, '24.090')] +[2023-10-09 04:51:06,359][60144] Updated weights for policy 1, policy_version 17502 (0.0007) +[2023-10-09 04:51:09,622][60143] Updated weights for policy 0, policy_version 17322 (0.0011) +[2023-10-09 04:51:09,997][60143] Updated weights for policy 0, policy_version 17332 (0.0009) +[2023-10-09 04:51:10,303][60144] Updated weights for policy 1, policy_version 17512 (0.0009) +[2023-10-09 04:51:10,366][60143] Updated weights for policy 0, policy_version 17342 (0.0007) +[2023-10-09 04:51:10,661][60144] Updated weights for policy 1, policy_version 17522 (0.0008) +[2023-10-09 04:51:11,021][60144] Updated weights for policy 1, policy_version 17532 (0.0010) +[2023-10-09 04:51:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 35684352. 
Throughput: 0: 1703.9, 1: 1732.4. Samples: 8930880. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:51:11,053][59242] Avg episode reward: [(0, '23.570'), (1, '24.590')] +[2023-10-09 04:51:14,345][60143] Updated weights for policy 0, policy_version 17352 (0.0009) +[2023-10-09 04:51:14,721][60143] Updated weights for policy 0, policy_version 17362 (0.0008) +[2023-10-09 04:51:14,964][60144] Updated weights for policy 1, policy_version 17542 (0.0009) +[2023-10-09 04:51:15,089][60143] Updated weights for policy 0, policy_version 17372 (0.0009) +[2023-10-09 04:51:15,330][60144] Updated weights for policy 1, policy_version 17552 (0.0008) +[2023-10-09 04:51:15,698][60144] Updated weights for policy 1, policy_version 17562 (0.0010) +[2023-10-09 04:51:16,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 35782656. Throughput: 0: 1681.4, 1: 1708.9. Samples: 8950130. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:51:16,053][59242] Avg episode reward: [(0, '23.300'), (1, '23.810')] +[2023-10-09 04:51:19,010][60143] Updated weights for policy 0, policy_version 17382 (0.0010) +[2023-10-09 04:51:19,375][60143] Updated weights for policy 0, policy_version 17392 (0.0009) +[2023-10-09 04:51:19,589][60144] Updated weights for policy 1, policy_version 17572 (0.0007) +[2023-10-09 04:51:19,746][60143] Updated weights for policy 0, policy_version 17402 (0.0008) +[2023-10-09 04:51:19,951][60144] Updated weights for policy 1, policy_version 17582 (0.0009) +[2023-10-09 04:51:20,321][60144] Updated weights for policy 1, policy_version 17592 (0.0009) +[2023-10-09 04:51:21,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 35848192. Throughput: 0: 1713.2, 1: 1728.7. Samples: 8961904. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 04:51:21,053][59242] Avg episode reward: [(0, '22.870'), (1, '23.380')] +[2023-10-09 04:51:23,739][60143] Updated weights for policy 0, policy_version 17412 (0.0008) +[2023-10-09 04:51:24,107][60143] Updated weights for policy 0, policy_version 17422 (0.0008) +[2023-10-09 04:51:24,131][60144] Updated weights for policy 1, policy_version 17602 (0.0007) +[2023-10-09 04:51:24,468][60143] Updated weights for policy 0, policy_version 17432 (0.0008) +[2023-10-09 04:51:24,498][60144] Updated weights for policy 1, policy_version 17612 (0.0007) +[2023-10-09 04:51:24,863][60144] Updated weights for policy 1, policy_version 17622 (0.0007) +[2023-10-09 04:51:25,227][60144] Updated weights for policy 1, policy_version 17632 (0.0008) +[2023-10-09 04:51:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 35913728. Throughput: 0: 1694.8, 1: 1720.4. Samples: 8981844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:26,053][59242] Avg episode reward: [(0, '23.180'), (1, '24.330')] +[2023-10-09 04:51:28,427][60143] Updated weights for policy 0, policy_version 17442 (0.0008) +[2023-10-09 04:51:28,794][60143] Updated weights for policy 0, policy_version 17452 (0.0010) +[2023-10-09 04:51:29,162][60143] Updated weights for policy 0, policy_version 17462 (0.0007) +[2023-10-09 04:51:29,167][60144] Updated weights for policy 1, policy_version 17642 (0.0008) +[2023-10-09 04:51:29,528][60143] Updated weights for policy 0, policy_version 17472 (0.0007) +[2023-10-09 04:51:29,538][60144] Updated weights for policy 1, policy_version 17652 (0.0009) +[2023-10-09 04:51:29,914][60144] Updated weights for policy 1, policy_version 17662 (0.0007) +[2023-10-09 04:51:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 35979264. Throughput: 0: 1690.3, 1: 1707.9. Samples: 9001772. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:31,053][59242] Avg episode reward: [(0, '23.730'), (1, '24.420')] +[2023-10-09 04:51:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000017664_18087936.pth... +[2023-10-09 04:51:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000017472_17891328.pth... +[2023-10-09 04:51:31,100][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000015872_16252928.pth +[2023-10-09 04:51:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000016032_16416768.pth +[2023-10-09 04:51:33,638][60143] Updated weights for policy 0, policy_version 17482 (0.0007) +[2023-10-09 04:51:33,711][60144] Updated weights for policy 1, policy_version 17672 (0.0007) +[2023-10-09 04:51:34,005][60143] Updated weights for policy 0, policy_version 17492 (0.0007) +[2023-10-09 04:51:34,067][60144] Updated weights for policy 1, policy_version 17682 (0.0007) +[2023-10-09 04:51:34,384][60143] Updated weights for policy 0, policy_version 17502 (0.0007) +[2023-10-09 04:51:34,433][60144] Updated weights for policy 1, policy_version 17692 (0.0007) +[2023-10-09 04:51:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 36044800. Throughput: 0: 1699.7, 1: 1733.5. Samples: 9013176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:36,052][59242] Avg episode reward: [(0, '24.410'), (1, '24.660')] +[2023-10-09 04:51:38,330][60143] Updated weights for policy 0, policy_version 17512 (0.0008) +[2023-10-09 04:51:38,387][60144] Updated weights for policy 1, policy_version 17702 (0.0007) +[2023-10-09 04:51:38,697][60143] Updated weights for policy 0, policy_version 17522 (0.0007) +[2023-10-09 04:51:38,758][60144] Updated weights for policy 1, policy_version 17712 (0.0007) +[2023-10-09 04:51:39,067][60143] Updated weights for policy 0, policy_version 17532 (0.0007) +[2023-10-09 04:51:39,126][60144] Updated weights for policy 1, policy_version 17722 (0.0007) +[2023-10-09 04:51:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 36110336. Throughput: 0: 1681.0, 1: 1704.3. Samples: 9032212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:41,053][59242] Avg episode reward: [(0, '23.830'), (1, '24.330')] +[2023-10-09 04:51:43,142][60144] Updated weights for policy 1, policy_version 17732 (0.0007) +[2023-10-09 04:51:43,310][60143] Updated weights for policy 0, policy_version 17542 (0.0009) +[2023-10-09 04:51:43,496][60144] Updated weights for policy 1, policy_version 17742 (0.0008) +[2023-10-09 04:51:43,680][60143] Updated weights for policy 0, policy_version 17552 (0.0007) +[2023-10-09 04:51:43,869][60144] Updated weights for policy 1, policy_version 17752 (0.0008) +[2023-10-09 04:51:44,058][60143] Updated weights for policy 0, policy_version 17562 (0.0008) +[2023-10-09 04:51:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 36175872. Throughput: 0: 1705.9, 1: 1710.3. Samples: 9053292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:46,053][59242] Avg episode reward: [(0, '24.280'), (1, '23.410')] +[2023-10-09 04:51:48,045][60143] Updated weights for policy 0, policy_version 17572 (0.0009) +[2023-10-09 04:51:48,169][60144] Updated weights for policy 1, policy_version 17762 (0.0008) +[2023-10-09 04:51:48,434][60143] Updated weights for policy 0, policy_version 17582 (0.0009) +[2023-10-09 04:51:48,534][60144] Updated weights for policy 1, policy_version 17772 (0.0007) +[2023-10-09 04:51:48,801][60143] Updated weights for policy 0, policy_version 17592 (0.0008) +[2023-10-09 04:51:48,896][60144] Updated weights for policy 1, policy_version 17782 (0.0008) +[2023-10-09 04:51:49,265][60144] Updated weights for policy 1, policy_version 17792 (0.0007) +[2023-10-09 04:51:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 36241408. Throughput: 0: 1697.0, 1: 1723.5. Samples: 9063922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:51:51,053][59242] Avg episode reward: [(0, '24.170'), (1, '23.500')] +[2023-10-09 04:51:52,975][60143] Updated weights for policy 0, policy_version 17602 (0.0008) +[2023-10-09 04:51:53,344][60143] Updated weights for policy 0, policy_version 17612 (0.0009) +[2023-10-09 04:51:53,367][60144] Updated weights for policy 1, policy_version 17802 (0.0008) +[2023-10-09 04:51:53,715][60143] Updated weights for policy 0, policy_version 17622 (0.0009) +[2023-10-09 04:51:53,730][60144] Updated weights for policy 1, policy_version 17812 (0.0009) +[2023-10-09 04:51:54,085][60143] Updated weights for policy 0, policy_version 17632 (0.0007) +[2023-10-09 04:51:54,104][60144] Updated weights for policy 1, policy_version 17822 (0.0010) +[2023-10-09 04:51:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 36306944. Throughput: 0: 1688.7, 1: 1698.7. Samples: 9083310. 
Policy #0 lag: (min: 33.0, avg: 54.3, max: 56.0) +[2023-10-09 04:51:56,052][59242] Avg episode reward: [(0, '23.010'), (1, '23.920')] +[2023-10-09 04:51:58,013][60143] Updated weights for policy 0, policy_version 17642 (0.0009) +[2023-10-09 04:51:58,071][60144] Updated weights for policy 1, policy_version 17832 (0.0008) +[2023-10-09 04:51:58,384][60143] Updated weights for policy 0, policy_version 17652 (0.0009) +[2023-10-09 04:51:58,433][60144] Updated weights for policy 1, policy_version 17842 (0.0008) +[2023-10-09 04:51:58,754][60143] Updated weights for policy 0, policy_version 17662 (0.0007) +[2023-10-09 04:51:58,796][60144] Updated weights for policy 1, policy_version 17852 (0.0009) +[2023-10-09 04:52:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 36372480. Throughput: 0: 1707.4, 1: 1717.6. Samples: 9104258. Policy #0 lag: (min: 33.0, avg: 54.3, max: 56.0) +[2023-10-09 04:52:01,053][59242] Avg episode reward: [(0, '23.000'), (1, '24.200')] +[2023-10-09 04:52:02,677][60143] Updated weights for policy 0, policy_version 17672 (0.0008) +[2023-10-09 04:52:02,716][60144] Updated weights for policy 1, policy_version 17862 (0.0008) +[2023-10-09 04:52:03,045][60143] Updated weights for policy 0, policy_version 17682 (0.0009) +[2023-10-09 04:52:03,084][60144] Updated weights for policy 1, policy_version 17872 (0.0007) +[2023-10-09 04:52:03,416][60143] Updated weights for policy 0, policy_version 17692 (0.0008) +[2023-10-09 04:52:03,440][60144] Updated weights for policy 1, policy_version 17882 (0.0007) +[2023-10-09 04:52:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 36438016. Throughput: 0: 1678.2, 1: 1700.9. Samples: 9113964. 
Policy #0 lag: (min: 33.0, avg: 54.3, max: 56.0) +[2023-10-09 04:52:06,053][59242] Avg episode reward: [(0, '23.660'), (1, '23.290')] +[2023-10-09 04:52:07,294][60143] Updated weights for policy 0, policy_version 17702 (0.0008) +[2023-10-09 04:52:07,447][60144] Updated weights for policy 1, policy_version 17892 (0.0007) +[2023-10-09 04:52:07,666][60143] Updated weights for policy 0, policy_version 17712 (0.0007) +[2023-10-09 04:52:07,811][60144] Updated weights for policy 1, policy_version 17902 (0.0008) +[2023-10-09 04:52:08,036][60143] Updated weights for policy 0, policy_version 17722 (0.0008) +[2023-10-09 04:52:08,182][60144] Updated weights for policy 1, policy_version 17912 (0.0009) +[2023-10-09 04:52:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 36503552. Throughput: 0: 1696.4, 1: 1703.3. Samples: 9134830. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-09 04:52:11,053][59242] Avg episode reward: [(0, '23.650'), (1, '23.870')] +[2023-10-09 04:52:12,117][60144] Updated weights for policy 1, policy_version 17922 (0.0007) +[2023-10-09 04:52:12,251][60143] Updated weights for policy 0, policy_version 17732 (0.0008) +[2023-10-09 04:52:12,483][60144] Updated weights for policy 1, policy_version 17932 (0.0007) +[2023-10-09 04:52:12,618][60143] Updated weights for policy 0, policy_version 17742 (0.0009) +[2023-10-09 04:52:12,845][60144] Updated weights for policy 1, policy_version 17942 (0.0008) +[2023-10-09 04:52:12,982][60143] Updated weights for policy 0, policy_version 17752 (0.0008) +[2023-10-09 04:52:13,214][60144] Updated weights for policy 1, policy_version 17952 (0.0009) +[2023-10-09 04:52:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 36569088. Throughput: 0: 1701.3, 1: 1724.4. Samples: 9155926. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-09 04:52:16,052][59242] Avg episode reward: [(0, '22.090'), (1, '23.370')] +[2023-10-09 04:52:16,995][60143] Updated weights for policy 0, policy_version 17762 (0.0007) +[2023-10-09 04:52:17,082][60144] Updated weights for policy 1, policy_version 17962 (0.0008) +[2023-10-09 04:52:17,367][60143] Updated weights for policy 0, policy_version 17772 (0.0007) +[2023-10-09 04:52:17,458][60144] Updated weights for policy 1, policy_version 17972 (0.0009) +[2023-10-09 04:52:17,742][60143] Updated weights for policy 0, policy_version 17782 (0.0007) +[2023-10-09 04:52:17,822][60144] Updated weights for policy 1, policy_version 17982 (0.0009) +[2023-10-09 04:52:18,103][60143] Updated weights for policy 0, policy_version 17792 (0.0008) +[2023-10-09 04:52:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 36634624. Throughput: 0: 1683.1, 1: 1696.3. Samples: 9165248. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-09 04:52:21,053][59242] Avg episode reward: [(0, '22.510'), (1, '24.250')] +[2023-10-09 04:52:21,826][60144] Updated weights for policy 1, policy_version 17992 (0.0009) +[2023-10-09 04:52:22,113][60143] Updated weights for policy 0, policy_version 17802 (0.0007) +[2023-10-09 04:52:22,195][60144] Updated weights for policy 1, policy_version 18002 (0.0010) +[2023-10-09 04:52:22,490][60143] Updated weights for policy 0, policy_version 17812 (0.0008) +[2023-10-09 04:52:22,573][60144] Updated weights for policy 1, policy_version 18012 (0.0007) +[2023-10-09 04:52:22,866][60143] Updated weights for policy 0, policy_version 17822 (0.0009) +[2023-10-09 04:52:26,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 36700160. Throughput: 0: 1700.3, 1: 1722.9. Samples: 9186254. 
Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) +[2023-10-09 04:52:26,054][59242] Avg episode reward: [(0, '22.070'), (1, '24.490')] +[2023-10-09 04:52:26,584][60144] Updated weights for policy 1, policy_version 18022 (0.0008) +[2023-10-09 04:52:26,661][60143] Updated weights for policy 0, policy_version 17832 (0.0008) +[2023-10-09 04:52:26,958][60144] Updated weights for policy 1, policy_version 18032 (0.0010) +[2023-10-09 04:52:27,029][60143] Updated weights for policy 0, policy_version 17842 (0.0007) +[2023-10-09 04:52:27,333][60144] Updated weights for policy 1, policy_version 18042 (0.0007) +[2023-10-09 04:52:27,406][60143] Updated weights for policy 0, policy_version 17852 (0.0009) +[2023-10-09 04:52:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 36765696. Throughput: 0: 1705.3, 1: 1725.1. Samples: 9207660. Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) +[2023-10-09 04:52:31,053][59242] Avg episode reward: [(0, '22.020'), (1, '24.300')] +[2023-10-09 04:52:31,236][60144] Updated weights for policy 1, policy_version 18052 (0.0008) +[2023-10-09 04:52:31,424][60143] Updated weights for policy 0, policy_version 17862 (0.0007) +[2023-10-09 04:52:31,610][60144] Updated weights for policy 1, policy_version 18062 (0.0009) +[2023-10-09 04:52:31,796][60143] Updated weights for policy 0, policy_version 17872 (0.0007) +[2023-10-09 04:52:31,975][60144] Updated weights for policy 1, policy_version 18072 (0.0009) +[2023-10-09 04:52:32,179][60143] Updated weights for policy 0, policy_version 17882 (0.0007) +[2023-10-09 04:52:35,952][60144] Updated weights for policy 1, policy_version 18082 (0.0008) +[2023-10-09 04:52:36,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 36831232. Throughput: 0: 1691.4, 1: 1709.0. Samples: 9216940. 
Policy #0 lag: (min: 18.0, avg: 20.8, max: 50.0) +[2023-10-09 04:52:36,053][59242] Avg episode reward: [(0, '22.380'), (1, '23.220')] +[2023-10-09 04:52:36,194][60143] Updated weights for policy 0, policy_version 17892 (0.0007) +[2023-10-09 04:52:36,324][60144] Updated weights for policy 1, policy_version 18092 (0.0007) +[2023-10-09 04:52:36,564][60143] Updated weights for policy 0, policy_version 17902 (0.0007) +[2023-10-09 04:52:36,691][60144] Updated weights for policy 1, policy_version 18102 (0.0008) +[2023-10-09 04:52:36,939][60143] Updated weights for policy 0, policy_version 17912 (0.0007) +[2023-10-09 04:52:37,056][60144] Updated weights for policy 1, policy_version 18112 (0.0009) +[2023-10-09 04:52:40,825][60143] Updated weights for policy 0, policy_version 17922 (0.0009) +[2023-10-09 04:52:40,967][60144] Updated weights for policy 1, policy_version 18122 (0.0008) +[2023-10-09 04:52:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 36896768. Throughput: 0: 1704.5, 1: 1730.8. Samples: 9237900. Policy #0 lag: (min: 22.0, avg: 26.0, max: 54.0) +[2023-10-09 04:52:41,053][59242] Avg episode reward: [(0, '21.000'), (1, '24.040')] +[2023-10-09 04:52:41,200][60143] Updated weights for policy 0, policy_version 17932 (0.0009) +[2023-10-09 04:52:41,327][60144] Updated weights for policy 1, policy_version 18132 (0.0008) +[2023-10-09 04:52:41,564][60143] Updated weights for policy 0, policy_version 17942 (0.0010) +[2023-10-09 04:52:41,688][60144] Updated weights for policy 1, policy_version 18142 (0.0007) +[2023-10-09 04:52:41,931][60143] Updated weights for policy 0, policy_version 17952 (0.0008) +[2023-10-09 04:52:45,689][60144] Updated weights for policy 1, policy_version 18152 (0.0007) +[2023-10-09 04:52:45,878][60143] Updated weights for policy 0, policy_version 17962 (0.0010) +[2023-10-09 04:52:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 36962304. 
Throughput: 0: 1713.5, 1: 1724.7. Samples: 9258978. Policy #0 lag: (min: 22.0, avg: 26.0, max: 54.0) +[2023-10-09 04:52:46,052][59242] Avg episode reward: [(0, '20.820'), (1, '23.410')] +[2023-10-09 04:52:46,060][60144] Updated weights for policy 1, policy_version 18162 (0.0008) +[2023-10-09 04:52:46,249][60143] Updated weights for policy 0, policy_version 17972 (0.0007) +[2023-10-09 04:52:46,422][60144] Updated weights for policy 1, policy_version 18172 (0.0009) +[2023-10-09 04:52:46,619][60143] Updated weights for policy 0, policy_version 17982 (0.0008) +[2023-10-09 04:52:50,446][60144] Updated weights for policy 1, policy_version 18182 (0.0009) +[2023-10-09 04:52:50,665][60143] Updated weights for policy 0, policy_version 17992 (0.0008) +[2023-10-09 04:52:50,817][60144] Updated weights for policy 1, policy_version 18192 (0.0008) +[2023-10-09 04:52:51,022][60143] Updated weights for policy 0, policy_version 18002 (0.0007) +[2023-10-09 04:52:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 37027840. Throughput: 0: 1707.9, 1: 1723.9. Samples: 9268392. Policy #0 lag: (min: 22.0, avg: 26.0, max: 54.0) +[2023-10-09 04:52:51,053][59242] Avg episode reward: [(0, '21.830'), (1, '23.680')] +[2023-10-09 04:52:51,184][60144] Updated weights for policy 1, policy_version 18202 (0.0009) +[2023-10-09 04:52:51,390][60143] Updated weights for policy 0, policy_version 18012 (0.0007) +[2023-10-09 04:52:55,091][60144] Updated weights for policy 1, policy_version 18212 (0.0009) +[2023-10-09 04:52:55,469][60144] Updated weights for policy 1, policy_version 18222 (0.0007) +[2023-10-09 04:52:55,533][60143] Updated weights for policy 0, policy_version 18022 (0.0010) +[2023-10-09 04:52:55,833][60144] Updated weights for policy 1, policy_version 18232 (0.0009) +[2023-10-09 04:52:55,909][60143] Updated weights for policy 0, policy_version 18032 (0.0008) +[2023-10-09 04:52:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13662.6). 
Total num frames: 37093376. Throughput: 0: 1708.9, 1: 1731.7. Samples: 9289660. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) +[2023-10-09 04:52:56,053][59242] Avg episode reward: [(0, '22.020'), (1, '23.790')] +[2023-10-09 04:52:56,282][60143] Updated weights for policy 0, policy_version 18042 (0.0009) +[2023-10-09 04:52:59,694][60144] Updated weights for policy 1, policy_version 18242 (0.0008) +[2023-10-09 04:53:00,075][60144] Updated weights for policy 1, policy_version 18252 (0.0007) +[2023-10-09 04:53:00,296][60143] Updated weights for policy 0, policy_version 18052 (0.0009) +[2023-10-09 04:53:00,446][60144] Updated weights for policy 1, policy_version 18262 (0.0009) +[2023-10-09 04:53:00,662][60143] Updated weights for policy 0, policy_version 18062 (0.0007) +[2023-10-09 04:53:00,811][60144] Updated weights for policy 1, policy_version 18272 (0.0008) +[2023-10-09 04:53:01,033][60143] Updated weights for policy 0, policy_version 18072 (0.0007) +[2023-10-09 04:53:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 37191680. Throughput: 0: 1706.1, 1: 1709.6. Samples: 9309632. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) +[2023-10-09 04:53:01,053][59242] Avg episode reward: [(0, '23.050'), (1, '23.800')] +[2023-10-09 04:53:04,789][60144] Updated weights for policy 1, policy_version 18282 (0.0009) +[2023-10-09 04:53:05,005][60143] Updated weights for policy 0, policy_version 18082 (0.0008) +[2023-10-09 04:53:05,162][60144] Updated weights for policy 1, policy_version 18292 (0.0009) +[2023-10-09 04:53:05,376][60143] Updated weights for policy 0, policy_version 18092 (0.0009) +[2023-10-09 04:53:05,533][60144] Updated weights for policy 1, policy_version 18302 (0.0008) +[2023-10-09 04:53:05,748][60143] Updated weights for policy 0, policy_version 18102 (0.0007) +[2023-10-09 04:53:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 37257216. Throughput: 0: 1716.8, 1: 1732.3. 
Samples: 9320454. Policy #0 lag: (min: 24.0, avg: 43.1, max: 56.0) +[2023-10-09 04:53:06,053][59242] Avg episode reward: [(0, '22.930'), (1, '23.000')] +[2023-10-09 04:53:06,114][60143] Updated weights for policy 0, policy_version 18112 (0.0008) +[2023-10-09 04:53:09,291][60144] Updated weights for policy 1, policy_version 18312 (0.0008) +[2023-10-09 04:53:09,653][60144] Updated weights for policy 1, policy_version 18322 (0.0009) +[2023-10-09 04:53:10,023][60144] Updated weights for policy 1, policy_version 18332 (0.0008) +[2023-10-09 04:53:10,279][60143] Updated weights for policy 0, policy_version 18122 (0.0010) +[2023-10-09 04:53:10,649][60143] Updated weights for policy 0, policy_version 18132 (0.0009) +[2023-10-09 04:53:11,021][60143] Updated weights for policy 0, policy_version 18142 (0.0008) +[2023-10-09 04:53:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 37322752. Throughput: 0: 1718.1, 1: 1717.1. Samples: 9340836. Policy #0 lag: (min: 31.0, avg: 31.3, max: 44.0) +[2023-10-09 04:53:11,053][59242] Avg episode reward: [(0, '23.760'), (1, '23.090')] +[2023-10-09 04:53:13,866][60144] Updated weights for policy 1, policy_version 18342 (0.0009) +[2023-10-09 04:53:14,238][60144] Updated weights for policy 1, policy_version 18352 (0.0008) +[2023-10-09 04:53:14,612][60144] Updated weights for policy 1, policy_version 18362 (0.0010) +[2023-10-09 04:53:14,975][60143] Updated weights for policy 0, policy_version 18152 (0.0009) +[2023-10-09 04:53:15,338][60143] Updated weights for policy 0, policy_version 18162 (0.0009) +[2023-10-09 04:53:15,705][60143] Updated weights for policy 0, policy_version 18172 (0.0008) +[2023-10-09 04:53:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 37421056. Throughput: 0: 1693.8, 1: 1707.1. Samples: 9360702. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 44.0) +[2023-10-09 04:53:16,052][59242] Avg episode reward: [(0, '23.490'), (1, '23.670')] +[2023-10-09 04:53:18,649][60144] Updated weights for policy 1, policy_version 18372 (0.0009) +[2023-10-09 04:53:19,021][60144] Updated weights for policy 1, policy_version 18382 (0.0009) +[2023-10-09 04:53:19,382][60144] Updated weights for policy 1, policy_version 18392 (0.0010) +[2023-10-09 04:53:19,740][60143] Updated weights for policy 0, policy_version 18182 (0.0010) +[2023-10-09 04:53:20,113][60143] Updated weights for policy 0, policy_version 18192 (0.0009) +[2023-10-09 04:53:20,485][60143] Updated weights for policy 0, policy_version 18202 (0.0009) +[2023-10-09 04:53:21,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 37486592. Throughput: 0: 1714.2, 1: 1731.6. Samples: 9372002. Policy #0 lag: (min: 27.0, avg: 47.1, max: 48.0) +[2023-10-09 04:53:21,053][59242] Avg episode reward: [(0, '23.100'), (1, '23.140')] +[2023-10-09 04:53:23,102][60144] Updated weights for policy 1, policy_version 18402 (0.0009) +[2023-10-09 04:53:23,478][60144] Updated weights for policy 1, policy_version 18412 (0.0009) +[2023-10-09 04:53:23,852][60144] Updated weights for policy 1, policy_version 18422 (0.0009) +[2023-10-09 04:53:24,213][60144] Updated weights for policy 1, policy_version 18432 (0.0010) +[2023-10-09 04:53:24,351][60143] Updated weights for policy 0, policy_version 18212 (0.0008) +[2023-10-09 04:53:24,742][60143] Updated weights for policy 0, policy_version 18222 (0.0007) +[2023-10-09 04:53:25,119][60143] Updated weights for policy 0, policy_version 18232 (0.0009) +[2023-10-09 04:53:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 13773.7). Total num frames: 37552128. Throughput: 0: 1710.1, 1: 1706.0. Samples: 9391624. 
Policy #0 lag: (min: 27.0, avg: 47.1, max: 48.0) +[2023-10-09 04:53:26,053][59242] Avg episode reward: [(0, '22.730'), (1, '24.140')] +[2023-10-09 04:53:28,078][60144] Updated weights for policy 1, policy_version 18442 (0.0008) +[2023-10-09 04:53:28,453][60144] Updated weights for policy 1, policy_version 18452 (0.0007) +[2023-10-09 04:53:28,813][60144] Updated weights for policy 1, policy_version 18462 (0.0007) +[2023-10-09 04:53:28,891][60143] Updated weights for policy 0, policy_version 18242 (0.0009) +[2023-10-09 04:53:29,258][60143] Updated weights for policy 0, policy_version 18252 (0.0010) +[2023-10-09 04:53:29,631][60143] Updated weights for policy 0, policy_version 18262 (0.0007) +[2023-10-09 04:53:29,994][60143] Updated weights for policy 0, policy_version 18272 (0.0009) +[2023-10-09 04:53:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 37617664. Throughput: 0: 1685.1, 1: 1713.6. Samples: 9411920. Policy #0 lag: (min: 27.0, avg: 47.1, max: 48.0) +[2023-10-09 04:53:31,053][59242] Avg episode reward: [(0, '23.180'), (1, '23.200')] +[2023-10-09 04:53:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000018272_18710528.pth... +[2023-10-09 04:53:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000018464_18907136.pth... 
+[2023-10-09 04:53:31,096][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000016672_17072128.pth +[2023-10-09 04:53:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000016864_17268736.pth +[2023-10-09 04:53:33,049][60144] Updated weights for policy 1, policy_version 18472 (0.0009) +[2023-10-09 04:53:33,426][60144] Updated weights for policy 1, policy_version 18482 (0.0007) +[2023-10-09 04:53:33,793][60144] Updated weights for policy 1, policy_version 18492 (0.0007) +[2023-10-09 04:53:34,012][60143] Updated weights for policy 0, policy_version 18282 (0.0010) +[2023-10-09 04:53:34,380][60143] Updated weights for policy 0, policy_version 18292 (0.0011) +[2023-10-09 04:53:34,757][60143] Updated weights for policy 0, policy_version 18302 (0.0010) +[2023-10-09 04:53:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 37683200. Throughput: 0: 1716.6, 1: 1717.5. Samples: 9422926. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-09 04:53:36,053][59242] Avg episode reward: [(0, '23.450'), (1, '23.050')] +[2023-10-09 04:53:37,782][60144] Updated weights for policy 1, policy_version 18502 (0.0009) +[2023-10-09 04:53:38,139][60144] Updated weights for policy 1, policy_version 18512 (0.0008) +[2023-10-09 04:53:38,511][60144] Updated weights for policy 1, policy_version 18522 (0.0008) +[2023-10-09 04:53:38,824][60143] Updated weights for policy 0, policy_version 18312 (0.0009) +[2023-10-09 04:53:39,185][60143] Updated weights for policy 0, policy_version 18322 (0.0010) +[2023-10-09 04:53:39,559][60143] Updated weights for policy 0, policy_version 18332 (0.0010) +[2023-10-09 04:53:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 37748736. Throughput: 0: 1690.5, 1: 1702.8. Samples: 9442360. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-09 04:53:41,053][59242] Avg episode reward: [(0, '23.180'), (1, '23.070')] +[2023-10-09 04:53:42,591][60144] Updated weights for policy 1, policy_version 18532 (0.0008) +[2023-10-09 04:53:42,969][60144] Updated weights for policy 1, policy_version 18542 (0.0011) +[2023-10-09 04:53:43,335][60144] Updated weights for policy 1, policy_version 18552 (0.0008) +[2023-10-09 04:53:43,500][60143] Updated weights for policy 0, policy_version 18342 (0.0009) +[2023-10-09 04:53:43,871][60143] Updated weights for policy 0, policy_version 18352 (0.0009) +[2023-10-09 04:53:44,244][60143] Updated weights for policy 0, policy_version 18362 (0.0009) +[2023-10-09 04:53:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 37814272. Throughput: 0: 1692.2, 1: 1721.2. Samples: 9463234. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-09 04:53:46,053][59242] Avg episode reward: [(0, '22.420'), (1, '21.820')] +[2023-10-09 04:53:47,511][60144] Updated weights for policy 1, policy_version 18562 (0.0007) +[2023-10-09 04:53:47,882][60144] Updated weights for policy 1, policy_version 18572 (0.0009) +[2023-10-09 04:53:48,241][60144] Updated weights for policy 1, policy_version 18582 (0.0008) +[2023-10-09 04:53:48,273][60143] Updated weights for policy 0, policy_version 18372 (0.0008) +[2023-10-09 04:53:48,603][60144] Updated weights for policy 1, policy_version 18592 (0.0007) +[2023-10-09 04:53:48,644][60143] Updated weights for policy 0, policy_version 18382 (0.0008) +[2023-10-09 04:53:49,006][60143] Updated weights for policy 0, policy_version 18392 (0.0010) +[2023-10-09 04:53:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 37879808. Throughput: 0: 1695.9, 1: 1696.8. Samples: 9473122. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:53:51,053][59242] Avg episode reward: [(0, '22.650'), (1, '22.840')] +[2023-10-09 04:53:52,628][60144] Updated weights for policy 1, policy_version 18602 (0.0010) +[2023-10-09 04:53:52,997][60144] Updated weights for policy 1, policy_version 18612 (0.0010) +[2023-10-09 04:53:53,055][60143] Updated weights for policy 0, policy_version 18402 (0.0010) +[2023-10-09 04:53:53,367][60144] Updated weights for policy 1, policy_version 18622 (0.0007) +[2023-10-09 04:53:53,434][60143] Updated weights for policy 0, policy_version 18412 (0.0007) +[2023-10-09 04:53:53,811][60143] Updated weights for policy 0, policy_version 18422 (0.0008) +[2023-10-09 04:53:54,178][60143] Updated weights for policy 0, policy_version 18432 (0.0009) +[2023-10-09 04:53:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 37945344. Throughput: 0: 1678.6, 1: 1708.6. Samples: 9493262. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:53:56,053][59242] Avg episode reward: [(0, '22.220'), (1, '22.390')] +[2023-10-09 04:53:57,320][60144] Updated weights for policy 1, policy_version 18632 (0.0009) +[2023-10-09 04:53:57,689][60144] Updated weights for policy 1, policy_version 18642 (0.0008) +[2023-10-09 04:53:58,056][60144] Updated weights for policy 1, policy_version 18652 (0.0010) +[2023-10-09 04:53:58,203][60143] Updated weights for policy 0, policy_version 18442 (0.0009) +[2023-10-09 04:53:58,585][60143] Updated weights for policy 0, policy_version 18452 (0.0010) +[2023-10-09 04:53:58,951][60143] Updated weights for policy 0, policy_version 18462 (0.0010) +[2023-10-09 04:54:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38010880. Throughput: 0: 1697.6, 1: 1713.2. Samples: 9514184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:54:01,053][59242] Avg episode reward: [(0, '22.790'), (1, '22.590')] +[2023-10-09 04:54:02,032][60144] Updated weights for policy 1, policy_version 18662 (0.0008) +[2023-10-09 04:54:02,397][60144] Updated weights for policy 1, policy_version 18672 (0.0007) +[2023-10-09 04:54:02,763][60144] Updated weights for policy 1, policy_version 18682 (0.0007) +[2023-10-09 04:54:02,953][60143] Updated weights for policy 0, policy_version 18472 (0.0008) +[2023-10-09 04:54:03,313][60143] Updated weights for policy 0, policy_version 18482 (0.0008) +[2023-10-09 04:54:03,681][60143] Updated weights for policy 0, policy_version 18492 (0.0007) +[2023-10-09 04:54:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38076416. Throughput: 0: 1686.2, 1: 1690.1. Samples: 9523932. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:54:06,053][59242] Avg episode reward: [(0, '21.160'), (1, '22.640')] +[2023-10-09 04:54:06,620][60144] Updated weights for policy 1, policy_version 18692 (0.0008) +[2023-10-09 04:54:06,997][60144] Updated weights for policy 1, policy_version 18702 (0.0008) +[2023-10-09 04:54:07,365][60144] Updated weights for policy 1, policy_version 18712 (0.0008) +[2023-10-09 04:54:07,702][60143] Updated weights for policy 0, policy_version 18502 (0.0007) +[2023-10-09 04:54:08,082][60143] Updated weights for policy 0, policy_version 18512 (0.0009) +[2023-10-09 04:54:08,450][60143] Updated weights for policy 0, policy_version 18522 (0.0010) +[2023-10-09 04:54:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38141952. Throughput: 0: 1686.8, 1: 1715.6. Samples: 9544736. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:54:11,053][59242] Avg episode reward: [(0, '21.210'), (1, '22.590')] +[2023-10-09 04:54:11,347][60144] Updated weights for policy 1, policy_version 18722 (0.0007) +[2023-10-09 04:54:11,711][60144] Updated weights for policy 1, policy_version 18732 (0.0008) +[2023-10-09 04:54:12,082][60144] Updated weights for policy 1, policy_version 18742 (0.0009) +[2023-10-09 04:54:12,452][60144] Updated weights for policy 1, policy_version 18752 (0.0009) +[2023-10-09 04:54:12,525][60143] Updated weights for policy 0, policy_version 18532 (0.0007) +[2023-10-09 04:54:12,895][60143] Updated weights for policy 0, policy_version 18542 (0.0009) +[2023-10-09 04:54:13,263][60143] Updated weights for policy 0, policy_version 18552 (0.0010) +[2023-10-09 04:54:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 38207488. Throughput: 0: 1706.8, 1: 1709.5. Samples: 9565652. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 04:54:16,053][59242] Avg episode reward: [(0, '20.740'), (1, '21.770')] +[2023-10-09 04:54:16,421][60144] Updated weights for policy 1, policy_version 18762 (0.0007) +[2023-10-09 04:54:16,780][60144] Updated weights for policy 1, policy_version 18772 (0.0008) +[2023-10-09 04:54:17,154][60144] Updated weights for policy 1, policy_version 18782 (0.0009) +[2023-10-09 04:54:17,220][60143] Updated weights for policy 0, policy_version 18562 (0.0011) +[2023-10-09 04:54:17,594][60143] Updated weights for policy 0, policy_version 18572 (0.0008) +[2023-10-09 04:54:17,955][60143] Updated weights for policy 0, policy_version 18582 (0.0008) +[2023-10-09 04:54:18,339][60143] Updated weights for policy 0, policy_version 18592 (0.0009) +[2023-10-09 04:54:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 38273024. Throughput: 0: 1676.5, 1: 1701.8. Samples: 9574946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:54:21,052][59242] Avg episode reward: [(0, '20.490'), (1, '21.160')] +[2023-10-09 04:54:21,182][60144] Updated weights for policy 1, policy_version 18792 (0.0009) +[2023-10-09 04:54:21,548][60144] Updated weights for policy 1, policy_version 18802 (0.0008) +[2023-10-09 04:54:21,914][60144] Updated weights for policy 1, policy_version 18812 (0.0008) +[2023-10-09 04:54:22,275][60143] Updated weights for policy 0, policy_version 18602 (0.0007) +[2023-10-09 04:54:22,654][60143] Updated weights for policy 0, policy_version 18612 (0.0009) +[2023-10-09 04:54:23,033][60143] Updated weights for policy 0, policy_version 18622 (0.0009) +[2023-10-09 04:54:25,858][60144] Updated weights for policy 1, policy_version 18822 (0.0009) +[2023-10-09 04:54:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 38338560. Throughput: 0: 1704.0, 1: 1713.7. Samples: 9596156. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:54:26,052][59242] Avg episode reward: [(0, '20.540'), (1, '19.880')] +[2023-10-09 04:54:26,218][60144] Updated weights for policy 1, policy_version 18832 (0.0008) +[2023-10-09 04:54:26,585][60144] Updated weights for policy 1, policy_version 18842 (0.0010) +[2023-10-09 04:54:27,124][60143] Updated weights for policy 0, policy_version 18632 (0.0010) +[2023-10-09 04:54:27,493][60143] Updated weights for policy 0, policy_version 18642 (0.0011) +[2023-10-09 04:54:27,866][60143] Updated weights for policy 0, policy_version 18652 (0.0011) +[2023-10-09 04:54:30,527][60144] Updated weights for policy 1, policy_version 18852 (0.0009) +[2023-10-09 04:54:30,895][60144] Updated weights for policy 1, policy_version 18862 (0.0008) +[2023-10-09 04:54:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 38404096. Throughput: 0: 1710.0, 1: 1712.0. Samples: 9617228. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:54:31,053][59242] Avg episode reward: [(0, '22.520'), (1, '21.630')] +[2023-10-09 04:54:31,254][60144] Updated weights for policy 1, policy_version 18872 (0.0009) +[2023-10-09 04:54:31,785][60143] Updated weights for policy 0, policy_version 18662 (0.0008) +[2023-10-09 04:54:32,167][60143] Updated weights for policy 0, policy_version 18672 (0.0007) +[2023-10-09 04:54:32,546][60143] Updated weights for policy 0, policy_version 18682 (0.0007) +[2023-10-09 04:54:35,191][60144] Updated weights for policy 1, policy_version 18882 (0.0008) +[2023-10-09 04:54:35,561][60144] Updated weights for policy 1, policy_version 18892 (0.0009) +[2023-10-09 04:54:35,931][60144] Updated weights for policy 1, policy_version 18902 (0.0009) +[2023-10-09 04:54:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 38469632. Throughput: 0: 1692.4, 1: 1724.4. Samples: 9626874. Policy #0 lag: (min: 11.0, avg: 16.2, max: 43.0) +[2023-10-09 04:54:36,053][59242] Avg episode reward: [(0, '22.620'), (1, '21.940')] +[2023-10-09 04:54:36,299][60144] Updated weights for policy 1, policy_version 18912 (0.0009) +[2023-10-09 04:54:36,715][60143] Updated weights for policy 0, policy_version 18692 (0.0008) +[2023-10-09 04:54:37,087][60143] Updated weights for policy 0, policy_version 18702 (0.0010) +[2023-10-09 04:54:37,454][60143] Updated weights for policy 0, policy_version 18712 (0.0009) +[2023-10-09 04:54:40,354][60144] Updated weights for policy 1, policy_version 18922 (0.0011) +[2023-10-09 04:54:40,717][60144] Updated weights for policy 1, policy_version 18932 (0.0011) +[2023-10-09 04:54:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 38535168. Throughput: 0: 1713.4, 1: 1729.8. Samples: 9648208. 
Policy #0 lag: (min: 11.0, avg: 16.2, max: 43.0) +[2023-10-09 04:54:41,053][59242] Avg episode reward: [(0, '22.650'), (1, '21.460')] +[2023-10-09 04:54:41,081][60144] Updated weights for policy 1, policy_version 18942 (0.0011) +[2023-10-09 04:54:41,411][60143] Updated weights for policy 0, policy_version 18722 (0.0008) +[2023-10-09 04:54:41,785][60143] Updated weights for policy 0, policy_version 18732 (0.0008) +[2023-10-09 04:54:42,159][60143] Updated weights for policy 0, policy_version 18742 (0.0008) +[2023-10-09 04:54:42,524][60143] Updated weights for policy 0, policy_version 18752 (0.0007) +[2023-10-09 04:54:45,031][60144] Updated weights for policy 1, policy_version 18952 (0.0008) +[2023-10-09 04:54:45,397][60144] Updated weights for policy 1, policy_version 18962 (0.0008) +[2023-10-09 04:54:45,761][60144] Updated weights for policy 1, policy_version 18972 (0.0009) +[2023-10-09 04:54:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38633472. Throughput: 0: 1716.5, 1: 1713.7. Samples: 9668544. Policy #0 lag: (min: 11.0, avg: 16.2, max: 43.0) +[2023-10-09 04:54:46,053][59242] Avg episode reward: [(0, '22.750'), (1, '22.390')] +[2023-10-09 04:54:46,446][60143] Updated weights for policy 0, policy_version 18762 (0.0008) +[2023-10-09 04:54:46,821][60143] Updated weights for policy 0, policy_version 18772 (0.0010) +[2023-10-09 04:54:47,198][60143] Updated weights for policy 0, policy_version 18782 (0.0010) +[2023-10-09 04:54:49,509][60144] Updated weights for policy 1, policy_version 18982 (0.0008) +[2023-10-09 04:54:49,885][60144] Updated weights for policy 1, policy_version 18992 (0.0007) +[2023-10-09 04:54:50,249][60144] Updated weights for policy 1, policy_version 19002 (0.0008) +[2023-10-09 04:54:51,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38699008. Throughput: 0: 1706.3, 1: 1737.8. Samples: 9678916. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-09 04:54:51,053][59242] Avg episode reward: [(0, '23.040'), (1, '22.810')] +[2023-10-09 04:54:51,160][60143] Updated weights for policy 0, policy_version 18792 (0.0010) +[2023-10-09 04:54:51,536][60143] Updated weights for policy 0, policy_version 18802 (0.0007) +[2023-10-09 04:54:51,911][60143] Updated weights for policy 0, policy_version 18812 (0.0009) +[2023-10-09 04:54:54,174][60144] Updated weights for policy 1, policy_version 19012 (0.0009) +[2023-10-09 04:54:54,541][60144] Updated weights for policy 1, policy_version 19022 (0.0009) +[2023-10-09 04:54:54,906][60144] Updated weights for policy 1, policy_version 19032 (0.0009) +[2023-10-09 04:54:55,816][60143] Updated weights for policy 0, policy_version 18822 (0.0009) +[2023-10-09 04:54:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 38764544. Throughput: 0: 1714.4, 1: 1720.9. Samples: 9699326. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-09 04:54:56,052][59242] Avg episode reward: [(0, '22.890'), (1, '22.910')] +[2023-10-09 04:54:56,194][60143] Updated weights for policy 0, policy_version 18832 (0.0007) +[2023-10-09 04:54:56,557][60143] Updated weights for policy 0, policy_version 18842 (0.0008) +[2023-10-09 04:54:58,974][60144] Updated weights for policy 1, policy_version 19042 (0.0011) +[2023-10-09 04:54:59,349][60144] Updated weights for policy 1, policy_version 19052 (0.0011) +[2023-10-09 04:54:59,719][60144] Updated weights for policy 1, policy_version 19062 (0.0010) +[2023-10-09 04:55:00,100][60144] Updated weights for policy 1, policy_version 19072 (0.0010) +[2023-10-09 04:55:00,544][60143] Updated weights for policy 0, policy_version 18852 (0.0009) +[2023-10-09 04:55:00,933][60143] Updated weights for policy 0, policy_version 18862 (0.0009) +[2023-10-09 04:55:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 38830080. 
Throughput: 0: 1714.2, 1: 1706.1. Samples: 9719564. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) +[2023-10-09 04:55:01,053][59242] Avg episode reward: [(0, '22.950'), (1, '23.070')] +[2023-10-09 04:55:01,311][60143] Updated weights for policy 0, policy_version 18872 (0.0009) +[2023-10-09 04:55:04,027][60144] Updated weights for policy 1, policy_version 19082 (0.0008) +[2023-10-09 04:55:04,391][60144] Updated weights for policy 1, policy_version 19092 (0.0009) +[2023-10-09 04:55:04,768][60144] Updated weights for policy 1, policy_version 19102 (0.0009) +[2023-10-09 04:55:05,255][60143] Updated weights for policy 0, policy_version 18882 (0.0010) +[2023-10-09 04:55:05,630][60143] Updated weights for policy 0, policy_version 18892 (0.0011) +[2023-10-09 04:55:05,991][60143] Updated weights for policy 0, policy_version 18902 (0.0010) +[2023-10-09 04:55:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 38895616. Throughput: 0: 1717.0, 1: 1736.7. Samples: 9730362. Policy #0 lag: (min: 23.0, avg: 27.7, max: 55.0) +[2023-10-09 04:55:06,052][59242] Avg episode reward: [(0, '21.210'), (1, '22.770')] +[2023-10-09 04:55:06,363][60143] Updated weights for policy 0, policy_version 18912 (0.0010) +[2023-10-09 04:55:08,814][60144] Updated weights for policy 1, policy_version 19112 (0.0008) +[2023-10-09 04:55:09,175][60144] Updated weights for policy 1, policy_version 19122 (0.0008) +[2023-10-09 04:55:09,544][60144] Updated weights for policy 1, policy_version 19132 (0.0008) +[2023-10-09 04:55:10,493][60143] Updated weights for policy 0, policy_version 18922 (0.0007) +[2023-10-09 04:55:10,862][60143] Updated weights for policy 0, policy_version 18932 (0.0008) +[2023-10-09 04:55:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 38961152. Throughput: 0: 1716.9, 1: 1706.8. Samples: 9750226. 
Policy #0 lag: (min: 23.0, avg: 27.7, max: 55.0) +[2023-10-09 04:55:11,053][59242] Avg episode reward: [(0, '20.890'), (1, '23.340')] +[2023-10-09 04:55:11,228][60143] Updated weights for policy 0, policy_version 18942 (0.0007) +[2023-10-09 04:55:13,581][60144] Updated weights for policy 1, policy_version 19142 (0.0008) +[2023-10-09 04:55:13,948][60144] Updated weights for policy 1, policy_version 19152 (0.0007) +[2023-10-09 04:55:14,322][60144] Updated weights for policy 1, policy_version 19162 (0.0007) +[2023-10-09 04:55:15,145][60143] Updated weights for policy 0, policy_version 18952 (0.0009) +[2023-10-09 04:55:15,518][60143] Updated weights for policy 0, policy_version 18962 (0.0011) +[2023-10-09 04:55:15,886][60143] Updated weights for policy 0, policy_version 18972 (0.0009) +[2023-10-09 04:55:16,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 39059456. Throughput: 0: 1699.6, 1: 1709.2. Samples: 9770624. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 04:55:16,053][59242] Avg episode reward: [(0, '21.880'), (1, '24.690')] +[2023-10-09 04:55:18,228][60144] Updated weights for policy 1, policy_version 19172 (0.0008) +[2023-10-09 04:55:18,595][60144] Updated weights for policy 1, policy_version 19182 (0.0007) +[2023-10-09 04:55:18,962][60144] Updated weights for policy 1, policy_version 19192 (0.0008) +[2023-10-09 04:55:19,705][60143] Updated weights for policy 0, policy_version 18982 (0.0008) +[2023-10-09 04:55:20,071][60143] Updated weights for policy 0, policy_version 18992 (0.0007) +[2023-10-09 04:55:20,441][60143] Updated weights for policy 0, policy_version 19002 (0.0007) +[2023-10-09 04:55:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 39124992. Throughput: 0: 1716.0, 1: 1718.9. Samples: 9781448. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 04:55:21,052][59242] Avg episode reward: [(0, '21.020'), (1, '25.980')] +[2023-10-09 04:55:21,053][60003] Saving new best policy, reward=25.980! +[2023-10-09 04:55:22,860][60144] Updated weights for policy 1, policy_version 19202 (0.0009) +[2023-10-09 04:55:23,224][60144] Updated weights for policy 1, policy_version 19212 (0.0007) +[2023-10-09 04:55:23,600][60144] Updated weights for policy 1, policy_version 19222 (0.0008) +[2023-10-09 04:55:23,960][60144] Updated weights for policy 1, policy_version 19232 (0.0010) +[2023-10-09 04:55:24,421][60143] Updated weights for policy 0, policy_version 19012 (0.0008) +[2023-10-09 04:55:24,794][60143] Updated weights for policy 0, policy_version 19022 (0.0007) +[2023-10-09 04:55:25,163][60143] Updated weights for policy 0, policy_version 19032 (0.0007) +[2023-10-09 04:55:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 39190528. Throughput: 0: 1712.0, 1: 1702.4. Samples: 9801856. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 04:55:26,053][59242] Avg episode reward: [(0, '20.620'), (1, '25.560')] +[2023-10-09 04:55:27,916][60144] Updated weights for policy 1, policy_version 19242 (0.0008) +[2023-10-09 04:55:28,285][60144] Updated weights for policy 1, policy_version 19252 (0.0008) +[2023-10-09 04:55:28,658][60144] Updated weights for policy 1, policy_version 19262 (0.0009) +[2023-10-09 04:55:29,077][60143] Updated weights for policy 0, policy_version 19042 (0.0010) +[2023-10-09 04:55:29,445][60143] Updated weights for policy 0, policy_version 19052 (0.0008) +[2023-10-09 04:55:29,817][60143] Updated weights for policy 0, policy_version 19062 (0.0010) +[2023-10-09 04:55:30,190][60143] Updated weights for policy 0, policy_version 19072 (0.0008) +[2023-10-09 04:55:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 39256064. Throughput: 0: 1689.4, 1: 1723.8. 
Samples: 9822136. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 04:55:31,053][59242] Avg episode reward: [(0, '20.030'), (1, '26.100')] +[2023-10-09 04:55:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000019264_19726336.pth... +[2023-10-09 04:55:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000019072_19529728.pth... +[2023-10-09 04:55:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000017472_17891328.pth +[2023-10-09 04:55:31,104][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000017664_18087936.pth +[2023-10-09 04:55:31,109][60003] Saving new best policy, reward=26.100! +[2023-10-09 04:55:32,582][60144] Updated weights for policy 1, policy_version 19272 (0.0009) +[2023-10-09 04:55:32,954][60144] Updated weights for policy 1, policy_version 19282 (0.0009) +[2023-10-09 04:55:33,322][60144] Updated weights for policy 1, policy_version 19292 (0.0009) +[2023-10-09 04:55:34,179][60143] Updated weights for policy 0, policy_version 19082 (0.0009) +[2023-10-09 04:55:34,538][60143] Updated weights for policy 0, policy_version 19092 (0.0008) +[2023-10-09 04:55:34,915][60143] Updated weights for policy 0, policy_version 19102 (0.0008) +[2023-10-09 04:55:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 39321600. Throughput: 0: 1722.5, 1: 1695.9. Samples: 9832742. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-09 04:55:36,053][59242] Avg episode reward: [(0, '19.740'), (1, '27.040')] +[2023-10-09 04:55:36,054][60003] Saving new best policy, reward=27.040! 
+[2023-10-09 04:55:37,305][60144] Updated weights for policy 1, policy_version 19302 (0.0008) +[2023-10-09 04:55:37,677][60144] Updated weights for policy 1, policy_version 19312 (0.0007) +[2023-10-09 04:55:38,041][60144] Updated weights for policy 1, policy_version 19322 (0.0008) +[2023-10-09 04:55:39,130][60143] Updated weights for policy 0, policy_version 19112 (0.0007) +[2023-10-09 04:55:39,501][60143] Updated weights for policy 0, policy_version 19122 (0.0007) +[2023-10-09 04:55:39,871][60143] Updated weights for policy 0, policy_version 19132 (0.0009) +[2023-10-09 04:55:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 39387136. Throughput: 0: 1706.0, 1: 1719.9. Samples: 9853492. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-09 04:55:41,055][59242] Avg episode reward: [(0, '20.250'), (1, '27.350')] +[2023-10-09 04:55:41,056][60003] Saving new best policy, reward=27.350! +[2023-10-09 04:55:41,905][60144] Updated weights for policy 1, policy_version 19332 (0.0007) +[2023-10-09 04:55:42,270][60144] Updated weights for policy 1, policy_version 19342 (0.0008) +[2023-10-09 04:55:42,632][60144] Updated weights for policy 1, policy_version 19352 (0.0008) +[2023-10-09 04:55:43,843][60143] Updated weights for policy 0, policy_version 19142 (0.0009) +[2023-10-09 04:55:44,212][60143] Updated weights for policy 0, policy_version 19152 (0.0008) +[2023-10-09 04:55:44,583][60143] Updated weights for policy 0, policy_version 19162 (0.0010) +[2023-10-09 04:55:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 39452672. Throughput: 0: 1696.0, 1: 1744.9. Samples: 9874406. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-09 04:55:46,053][59242] Avg episode reward: [(0, '20.280'), (1, '27.100')] +[2023-10-09 04:55:46,592][60144] Updated weights for policy 1, policy_version 19362 (0.0009) +[2023-10-09 04:55:46,960][60144] Updated weights for policy 1, policy_version 19372 (0.0008) +[2023-10-09 04:55:47,332][60144] Updated weights for policy 1, policy_version 19382 (0.0007) +[2023-10-09 04:55:47,692][60144] Updated weights for policy 1, policy_version 19392 (0.0009) +[2023-10-09 04:55:48,583][60143] Updated weights for policy 0, policy_version 19172 (0.0010) +[2023-10-09 04:55:48,981][60143] Updated weights for policy 0, policy_version 19182 (0.0008) +[2023-10-09 04:55:49,350][60143] Updated weights for policy 0, policy_version 19192 (0.0008) +[2023-10-09 04:55:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 39518208. Throughput: 0: 1717.6, 1: 1713.3. Samples: 9884756. Policy #0 lag: (min: 22.0, avg: 27.0, max: 54.0) +[2023-10-09 04:55:51,052][59242] Avg episode reward: [(0, '20.100'), (1, '26.240')] +[2023-10-09 04:55:51,465][60144] Updated weights for policy 1, policy_version 19402 (0.0007) +[2023-10-09 04:55:51,822][60144] Updated weights for policy 1, policy_version 19412 (0.0012) +[2023-10-09 04:55:52,194][60144] Updated weights for policy 1, policy_version 19422 (0.0011) +[2023-10-09 04:55:53,241][60143] Updated weights for policy 0, policy_version 19202 (0.0007) +[2023-10-09 04:55:53,614][60143] Updated weights for policy 0, policy_version 19212 (0.0007) +[2023-10-09 04:55:53,991][60143] Updated weights for policy 0, policy_version 19222 (0.0008) +[2023-10-09 04:55:54,364][60143] Updated weights for policy 0, policy_version 19232 (0.0008) +[2023-10-09 04:55:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 39583744. Throughput: 0: 1691.3, 1: 1744.3. Samples: 9904830. 
Policy #0 lag: (min: 22.0, avg: 27.0, max: 54.0) +[2023-10-09 04:55:56,053][59242] Avg episode reward: [(0, '20.140'), (1, '25.810')] +[2023-10-09 04:55:56,249][60144] Updated weights for policy 1, policy_version 19432 (0.0010) +[2023-10-09 04:55:56,613][60144] Updated weights for policy 1, policy_version 19442 (0.0009) +[2023-10-09 04:55:56,981][60144] Updated weights for policy 1, policy_version 19452 (0.0009) +[2023-10-09 04:55:58,487][60143] Updated weights for policy 0, policy_version 19242 (0.0010) +[2023-10-09 04:55:58,866][60143] Updated weights for policy 0, policy_version 19252 (0.0010) +[2023-10-09 04:55:59,245][60143] Updated weights for policy 0, policy_version 19262 (0.0009) +[2023-10-09 04:56:00,880][60144] Updated weights for policy 1, policy_version 19462 (0.0010) +[2023-10-09 04:56:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 39649280. Throughput: 0: 1704.5, 1: 1748.8. Samples: 9926026. Policy #0 lag: (min: 22.0, avg: 27.0, max: 54.0) +[2023-10-09 04:56:01,053][59242] Avg episode reward: [(0, '20.250'), (1, '25.610')] +[2023-10-09 04:56:01,254][60144] Updated weights for policy 1, policy_version 19472 (0.0008) +[2023-10-09 04:56:01,629][60144] Updated weights for policy 1, policy_version 19482 (0.0009) +[2023-10-09 04:56:03,246][60143] Updated weights for policy 0, policy_version 19272 (0.0008) +[2023-10-09 04:56:03,618][60143] Updated weights for policy 0, policy_version 19282 (0.0008) +[2023-10-09 04:56:03,987][60143] Updated weights for policy 0, policy_version 19292 (0.0008) +[2023-10-09 04:56:05,410][60144] Updated weights for policy 1, policy_version 19492 (0.0009) +[2023-10-09 04:56:05,773][60144] Updated weights for policy 1, policy_version 19502 (0.0009) +[2023-10-09 04:56:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 39714816. Throughput: 0: 1705.1, 1: 1732.1. Samples: 9936120. 
Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-09 04:56:06,053][59242] Avg episode reward: [(0, '20.490'), (1, '27.100')] +[2023-10-09 04:56:06,143][60144] Updated weights for policy 1, policy_version 19512 (0.0011) +[2023-10-09 04:56:08,034][60143] Updated weights for policy 0, policy_version 19302 (0.0008) +[2023-10-09 04:56:08,406][60143] Updated weights for policy 0, policy_version 19312 (0.0009) +[2023-10-09 04:56:08,791][60143] Updated weights for policy 0, policy_version 19322 (0.0008) +[2023-10-09 04:56:10,219][60144] Updated weights for policy 1, policy_version 19522 (0.0010) +[2023-10-09 04:56:10,591][60144] Updated weights for policy 1, policy_version 19532 (0.0008) +[2023-10-09 04:56:10,963][60144] Updated weights for policy 1, policy_version 19542 (0.0008) +[2023-10-09 04:56:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 39780352. Throughput: 0: 1692.3, 1: 1750.4. Samples: 9956774. Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-09 04:56:11,052][59242] Avg episode reward: [(0, '21.580'), (1, '26.930')] +[2023-10-09 04:56:11,325][60144] Updated weights for policy 1, policy_version 19552 (0.0008) +[2023-10-09 04:56:12,689][60143] Updated weights for policy 0, policy_version 19332 (0.0008) +[2023-10-09 04:56:13,067][60143] Updated weights for policy 0, policy_version 19342 (0.0008) +[2023-10-09 04:56:13,442][60143] Updated weights for policy 0, policy_version 19352 (0.0008) +[2023-10-09 04:56:15,280][60144] Updated weights for policy 1, policy_version 19562 (0.0007) +[2023-10-09 04:56:15,644][60144] Updated weights for policy 1, policy_version 19572 (0.0008) +[2023-10-09 04:56:16,012][60144] Updated weights for policy 1, policy_version 19582 (0.0008) +[2023-10-09 04:56:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 39845888. Throughput: 0: 1715.8, 1: 1731.0. Samples: 9977242. 
Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-09 04:56:16,053][59242] Avg episode reward: [(0, '21.650'), (1, '26.790')] +[2023-10-09 04:56:17,445][60143] Updated weights for policy 0, policy_version 19362 (0.0008) +[2023-10-09 04:56:17,819][60143] Updated weights for policy 0, policy_version 19372 (0.0008) +[2023-10-09 04:56:18,193][60143] Updated weights for policy 0, policy_version 19382 (0.0010) +[2023-10-09 04:56:18,564][60143] Updated weights for policy 0, policy_version 19392 (0.0008) +[2023-10-09 04:56:20,007][60144] Updated weights for policy 1, policy_version 19592 (0.0008) +[2023-10-09 04:56:20,370][60144] Updated weights for policy 1, policy_version 19602 (0.0007) +[2023-10-09 04:56:20,725][60144] Updated weights for policy 1, policy_version 19612 (0.0008) +[2023-10-09 04:56:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 39944192. Throughput: 0: 1686.2, 1: 1750.6. Samples: 9987400. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:56:21,052][59242] Avg episode reward: [(0, '20.900'), (1, '26.130')] +[2023-10-09 04:56:22,636][60143] Updated weights for policy 0, policy_version 19402 (0.0007) +[2023-10-09 04:56:22,996][60143] Updated weights for policy 0, policy_version 19412 (0.0009) +[2023-10-09 04:56:23,368][60143] Updated weights for policy 0, policy_version 19422 (0.0008) +[2023-10-09 04:56:24,588][60144] Updated weights for policy 1, policy_version 19622 (0.0008) +[2023-10-09 04:56:24,963][60144] Updated weights for policy 1, policy_version 19632 (0.0009) +[2023-10-09 04:56:25,328][60144] Updated weights for policy 1, policy_version 19642 (0.0007) +[2023-10-09 04:56:26,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 40009728. Throughput: 0: 1699.9, 1: 1738.8. Samples: 10008232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:56:26,053][59242] Avg episode reward: [(0, '21.040'), (1, '25.900')] +[2023-10-09 04:56:27,248][60143] Updated weights for policy 0, policy_version 19432 (0.0007) +[2023-10-09 04:56:27,612][60143] Updated weights for policy 0, policy_version 19442 (0.0009) +[2023-10-09 04:56:27,976][60143] Updated weights for policy 0, policy_version 19452 (0.0010) +[2023-10-09 04:56:29,225][60144] Updated weights for policy 1, policy_version 19652 (0.0008) +[2023-10-09 04:56:29,588][60144] Updated weights for policy 1, policy_version 19662 (0.0007) +[2023-10-09 04:56:29,953][60144] Updated weights for policy 1, policy_version 19672 (0.0007) +[2023-10-09 04:56:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40075264. Throughput: 0: 1712.3, 1: 1707.3. Samples: 10028288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:56:31,053][59242] Avg episode reward: [(0, '21.980'), (1, '24.880')] +[2023-10-09 04:56:32,017][60143] Updated weights for policy 0, policy_version 19462 (0.0008) +[2023-10-09 04:56:32,383][60143] Updated weights for policy 0, policy_version 19472 (0.0007) +[2023-10-09 04:56:32,760][60143] Updated weights for policy 0, policy_version 19482 (0.0008) +[2023-10-09 04:56:33,927][60144] Updated weights for policy 1, policy_version 19682 (0.0008) +[2023-10-09 04:56:34,296][60144] Updated weights for policy 1, policy_version 19692 (0.0010) +[2023-10-09 04:56:34,662][60144] Updated weights for policy 1, policy_version 19702 (0.0011) +[2023-10-09 04:56:35,029][60144] Updated weights for policy 1, policy_version 19712 (0.0009) +[2023-10-09 04:56:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40140800. Throughput: 0: 1687.3, 1: 1739.8. Samples: 10038976. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:56:36,053][59242] Avg episode reward: [(0, '24.250'), (1, '24.440')] +[2023-10-09 04:56:36,797][60143] Updated weights for policy 0, policy_version 19492 (0.0008) +[2023-10-09 04:56:37,164][60143] Updated weights for policy 0, policy_version 19502 (0.0010) +[2023-10-09 04:56:37,528][60143] Updated weights for policy 0, policy_version 19512 (0.0008) +[2023-10-09 04:56:39,052][60144] Updated weights for policy 1, policy_version 19722 (0.0007) +[2023-10-09 04:56:39,421][60144] Updated weights for policy 1, policy_version 19732 (0.0007) +[2023-10-09 04:56:39,775][60144] Updated weights for policy 1, policy_version 19742 (0.0009) +[2023-10-09 04:56:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40206336. Throughput: 0: 1711.1, 1: 1717.7. Samples: 10059126. Policy #0 lag: (min: 26.0, avg: 28.8, max: 53.0) +[2023-10-09 04:56:41,053][59242] Avg episode reward: [(0, '25.170'), (1, '24.440')] +[2023-10-09 04:56:41,593][60143] Updated weights for policy 0, policy_version 19522 (0.0007) +[2023-10-09 04:56:41,999][60143] Updated weights for policy 0, policy_version 19532 (0.0010) +[2023-10-09 04:56:42,360][60143] Updated weights for policy 0, policy_version 19542 (0.0009) +[2023-10-09 04:56:42,725][60143] Updated weights for policy 0, policy_version 19552 (0.0007) +[2023-10-09 04:56:43,523][60144] Updated weights for policy 1, policy_version 19752 (0.0008) +[2023-10-09 04:56:43,897][60144] Updated weights for policy 1, policy_version 19762 (0.0007) +[2023-10-09 04:56:44,255][60144] Updated weights for policy 1, policy_version 19772 (0.0008) +[2023-10-09 04:56:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 40271872. Throughput: 0: 1710.6, 1: 1712.0. Samples: 10080040. 
Policy #0 lag: (min: 26.0, avg: 28.8, max: 53.0) +[2023-10-09 04:56:46,052][59242] Avg episode reward: [(0, '25.370'), (1, '25.500')] +[2023-10-09 04:56:46,503][60143] Updated weights for policy 0, policy_version 19562 (0.0007) +[2023-10-09 04:56:46,869][60143] Updated weights for policy 0, policy_version 19572 (0.0008) +[2023-10-09 04:56:47,234][60143] Updated weights for policy 0, policy_version 19582 (0.0007) +[2023-10-09 04:56:48,086][60144] Updated weights for policy 1, policy_version 19782 (0.0009) +[2023-10-09 04:56:48,447][60144] Updated weights for policy 1, policy_version 19792 (0.0008) +[2023-10-09 04:56:48,824][60144] Updated weights for policy 1, policy_version 19802 (0.0009) +[2023-10-09 04:56:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40337408. Throughput: 0: 1694.3, 1: 1727.1. Samples: 10090080. Policy #0 lag: (min: 26.0, avg: 28.8, max: 53.0) +[2023-10-09 04:56:51,052][59242] Avg episode reward: [(0, '25.230'), (1, '24.850')] +[2023-10-09 04:56:51,099][60143] Updated weights for policy 0, policy_version 19592 (0.0011) +[2023-10-09 04:56:51,469][60143] Updated weights for policy 0, policy_version 19602 (0.0008) +[2023-10-09 04:56:51,834][60143] Updated weights for policy 0, policy_version 19612 (0.0010) +[2023-10-09 04:56:52,662][60144] Updated weights for policy 1, policy_version 19812 (0.0007) +[2023-10-09 04:56:53,028][60144] Updated weights for policy 1, policy_version 19822 (0.0008) +[2023-10-09 04:56:53,402][60144] Updated weights for policy 1, policy_version 19832 (0.0008) +[2023-10-09 04:56:56,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40402944. Throughput: 0: 1711.3, 1: 1710.0. Samples: 10110736. 
Policy #0 lag: (min: 26.0, avg: 28.8, max: 53.0) +[2023-10-09 04:56:56,054][59242] Avg episode reward: [(0, '24.090'), (1, '24.330')] +[2023-10-09 04:56:56,102][60143] Updated weights for policy 0, policy_version 19622 (0.0009) +[2023-10-09 04:56:56,467][60143] Updated weights for policy 0, policy_version 19632 (0.0008) +[2023-10-09 04:56:56,850][60143] Updated weights for policy 0, policy_version 19642 (0.0009) +[2023-10-09 04:56:57,195][60144] Updated weights for policy 1, policy_version 19842 (0.0007) +[2023-10-09 04:56:57,565][60144] Updated weights for policy 1, policy_version 19852 (0.0007) +[2023-10-09 04:56:57,924][60144] Updated weights for policy 1, policy_version 19862 (0.0007) +[2023-10-09 04:56:58,299][60144] Updated weights for policy 1, policy_version 19872 (0.0009) +[2023-10-09 04:57:00,864][60143] Updated weights for policy 0, policy_version 19652 (0.0008) +[2023-10-09 04:57:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40468480. Throughput: 0: 1707.5, 1: 1731.1. Samples: 10131982. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:01,053][59242] Avg episode reward: [(0, '24.980'), (1, '24.480')] +[2023-10-09 04:57:01,240][60143] Updated weights for policy 0, policy_version 19662 (0.0009) +[2023-10-09 04:57:01,603][60143] Updated weights for policy 0, policy_version 19672 (0.0007) +[2023-10-09 04:57:02,484][60144] Updated weights for policy 1, policy_version 19882 (0.0010) +[2023-10-09 04:57:02,864][60144] Updated weights for policy 1, policy_version 19892 (0.0008) +[2023-10-09 04:57:03,227][60144] Updated weights for policy 1, policy_version 19902 (0.0007) +[2023-10-09 04:57:05,408][60143] Updated weights for policy 0, policy_version 19682 (0.0011) +[2023-10-09 04:57:05,780][60143] Updated weights for policy 0, policy_version 19692 (0.0010) +[2023-10-09 04:57:06,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40534016. 
Throughput: 0: 1708.8, 1: 1709.8. Samples: 10141236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:06,053][59242] Avg episode reward: [(0, '25.240'), (1, '24.480')] +[2023-10-09 04:57:06,160][60143] Updated weights for policy 0, policy_version 19702 (0.0009) +[2023-10-09 04:57:06,526][60143] Updated weights for policy 0, policy_version 19712 (0.0008) +[2023-10-09 04:57:07,300][60144] Updated weights for policy 1, policy_version 19912 (0.0007) +[2023-10-09 04:57:07,669][60144] Updated weights for policy 1, policy_version 19922 (0.0010) +[2023-10-09 04:57:08,038][60144] Updated weights for policy 1, policy_version 19932 (0.0011) +[2023-10-09 04:57:10,535][60143] Updated weights for policy 0, policy_version 19722 (0.0010) +[2023-10-09 04:57:10,907][60143] Updated weights for policy 0, policy_version 19732 (0.0010) +[2023-10-09 04:57:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40599552. Throughput: 0: 1709.9, 1: 1707.1. Samples: 10161998. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:11,053][59242] Avg episode reward: [(0, '25.870'), (1, '25.360')] +[2023-10-09 04:57:11,273][60143] Updated weights for policy 0, policy_version 19742 (0.0009) +[2023-10-09 04:57:12,013][60144] Updated weights for policy 1, policy_version 19942 (0.0009) +[2023-10-09 04:57:12,383][60144] Updated weights for policy 1, policy_version 19952 (0.0011) +[2023-10-09 04:57:12,762][60144] Updated weights for policy 1, policy_version 19962 (0.0010) +[2023-10-09 04:57:15,491][60143] Updated weights for policy 0, policy_version 19752 (0.0009) +[2023-10-09 04:57:15,867][60143] Updated weights for policy 0, policy_version 19762 (0.0008) +[2023-10-09 04:57:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 40665088. Throughput: 0: 1699.7, 1: 1733.5. Samples: 10182780. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:16,053][59242] Avg episode reward: [(0, '26.030'), (1, '25.030')] +[2023-10-09 04:57:16,234][60143] Updated weights for policy 0, policy_version 19772 (0.0008) +[2023-10-09 04:57:16,772][60144] Updated weights for policy 1, policy_version 19972 (0.0007) +[2023-10-09 04:57:17,149][60144] Updated weights for policy 1, policy_version 19982 (0.0008) +[2023-10-09 04:57:17,514][60144] Updated weights for policy 1, policy_version 19992 (0.0011) +[2023-10-09 04:57:20,250][60143] Updated weights for policy 0, policy_version 19782 (0.0008) +[2023-10-09 04:57:20,611][60143] Updated weights for policy 0, policy_version 19792 (0.0008) +[2023-10-09 04:57:20,987][60143] Updated weights for policy 0, policy_version 19802 (0.0008) +[2023-10-09 04:57:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 40730624. Throughput: 0: 1709.8, 1: 1703.4. Samples: 10192570. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 04:57:21,052][59242] Avg episode reward: [(0, '27.240'), (1, '24.110')] +[2023-10-09 04:57:21,199][59934] Saving new best policy, reward=27.240! +[2023-10-09 04:57:21,568][60144] Updated weights for policy 1, policy_version 20002 (0.0010) +[2023-10-09 04:57:21,943][60144] Updated weights for policy 1, policy_version 20012 (0.0008) +[2023-10-09 04:57:22,304][60144] Updated weights for policy 1, policy_version 20022 (0.0009) +[2023-10-09 04:57:22,672][60144] Updated weights for policy 1, policy_version 20032 (0.0008) +[2023-10-09 04:57:24,766][60143] Updated weights for policy 0, policy_version 19812 (0.0008) +[2023-10-09 04:57:25,143][60143] Updated weights for policy 0, policy_version 19822 (0.0009) +[2023-10-09 04:57:25,511][60143] Updated weights for policy 0, policy_version 19832 (0.0011) +[2023-10-09 04:57:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 40828928. Throughput: 0: 1714.7, 1: 1727.2. 
Samples: 10214014. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 04:57:26,053][59242] Avg episode reward: [(0, '27.110'), (1, '24.370')] +[2023-10-09 04:57:26,446][60144] Updated weights for policy 1, policy_version 20042 (0.0010) +[2023-10-09 04:57:26,809][60144] Updated weights for policy 1, policy_version 20052 (0.0009) +[2023-10-09 04:57:27,173][60144] Updated weights for policy 1, policy_version 20062 (0.0011) +[2023-10-09 04:57:29,631][60143] Updated weights for policy 0, policy_version 19842 (0.0011) +[2023-10-09 04:57:30,051][60143] Updated weights for policy 0, policy_version 19852 (0.0008) +[2023-10-09 04:57:30,421][60143] Updated weights for policy 0, policy_version 19862 (0.0009) +[2023-10-09 04:57:30,786][60143] Updated weights for policy 0, policy_version 19872 (0.0009) +[2023-10-09 04:57:31,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 40894464. Throughput: 0: 1691.6, 1: 1734.9. Samples: 10234232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:31,053][59242] Avg episode reward: [(0, '25.570'), (1, '25.000')] +[2023-10-09 04:57:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000019872_20348928.pth... +[2023-10-09 04:57:31,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000018272_18710528.pth +[2023-10-09 04:57:31,129][60144] Updated weights for policy 1, policy_version 20072 (0.0009) +[2023-10-09 04:57:31,496][60144] Updated weights for policy 1, policy_version 20082 (0.0007) +[2023-10-09 04:57:31,862][60144] Updated weights for policy 1, policy_version 20092 (0.0008) +[2023-10-09 04:57:32,007][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000020096_20578304.pth... 
+[2023-10-09 04:57:32,045][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000018464_18907136.pth +[2023-10-09 04:57:34,811][60143] Updated weights for policy 0, policy_version 19882 (0.0010) +[2023-10-09 04:57:35,177][60143] Updated weights for policy 0, policy_version 19892 (0.0009) +[2023-10-09 04:57:35,543][60143] Updated weights for policy 0, policy_version 19902 (0.0008) +[2023-10-09 04:57:35,879][60144] Updated weights for policy 1, policy_version 20102 (0.0008) +[2023-10-09 04:57:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 40960000. Throughput: 0: 1712.0, 1: 1717.1. Samples: 10244394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:36,053][59242] Avg episode reward: [(0, '25.940'), (1, '23.910')] +[2023-10-09 04:57:36,242][60144] Updated weights for policy 1, policy_version 20112 (0.0008) +[2023-10-09 04:57:36,606][60144] Updated weights for policy 1, policy_version 20122 (0.0008) +[2023-10-09 04:57:39,526][60143] Updated weights for policy 0, policy_version 19912 (0.0009) +[2023-10-09 04:57:39,889][60143] Updated weights for policy 0, policy_version 19922 (0.0008) +[2023-10-09 04:57:40,258][60143] Updated weights for policy 0, policy_version 19932 (0.0008) +[2023-10-09 04:57:40,535][60144] Updated weights for policy 1, policy_version 20132 (0.0009) +[2023-10-09 04:57:40,909][60144] Updated weights for policy 1, policy_version 20142 (0.0009) +[2023-10-09 04:57:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 41025536. Throughput: 0: 1704.6, 1: 1725.0. Samples: 10265068. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:41,053][59242] Avg episode reward: [(0, '25.880'), (1, '23.850')] +[2023-10-09 04:57:41,269][60144] Updated weights for policy 1, policy_version 20152 (0.0009) +[2023-10-09 04:57:44,192][60143] Updated weights for policy 0, policy_version 19942 (0.0008) +[2023-10-09 04:57:44,561][60143] Updated weights for policy 0, policy_version 19952 (0.0007) +[2023-10-09 04:57:44,939][60143] Updated weights for policy 0, policy_version 19962 (0.0007) +[2023-10-09 04:57:45,173][60144] Updated weights for policy 1, policy_version 20162 (0.0008) +[2023-10-09 04:57:45,531][60144] Updated weights for policy 1, policy_version 20172 (0.0010) +[2023-10-09 04:57:45,898][60144] Updated weights for policy 1, policy_version 20182 (0.0010) +[2023-10-09 04:57:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 41091072. Throughput: 0: 1685.1, 1: 1714.5. Samples: 10284962. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:46,053][59242] Avg episode reward: [(0, '24.730'), (1, '25.210')] +[2023-10-09 04:57:46,262][60144] Updated weights for policy 1, policy_version 20192 (0.0010) +[2023-10-09 04:57:48,769][60143] Updated weights for policy 0, policy_version 19972 (0.0007) +[2023-10-09 04:57:49,137][60143] Updated weights for policy 0, policy_version 19982 (0.0007) +[2023-10-09 04:57:49,504][60143] Updated weights for policy 0, policy_version 19992 (0.0007) +[2023-10-09 04:57:50,324][60144] Updated weights for policy 1, policy_version 20202 (0.0009) +[2023-10-09 04:57:50,686][60144] Updated weights for policy 1, policy_version 20212 (0.0008) +[2023-10-09 04:57:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 41156608. Throughput: 0: 1717.0, 1: 1728.5. Samples: 10296284. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:51,053][59242] Avg episode reward: [(0, '24.240'), (1, '25.680')] +[2023-10-09 04:57:51,055][60144] Updated weights for policy 1, policy_version 20222 (0.0012) +[2023-10-09 04:57:53,596][60143] Updated weights for policy 0, policy_version 20002 (0.0009) +[2023-10-09 04:57:53,966][60143] Updated weights for policy 0, policy_version 20012 (0.0009) +[2023-10-09 04:57:54,343][60143] Updated weights for policy 0, policy_version 20022 (0.0008) +[2023-10-09 04:57:54,710][60143] Updated weights for policy 0, policy_version 20032 (0.0008) +[2023-10-09 04:57:54,926][60144] Updated weights for policy 1, policy_version 20232 (0.0008) +[2023-10-09 04:57:55,287][60144] Updated weights for policy 1, policy_version 20242 (0.0007) +[2023-10-09 04:57:55,649][60144] Updated weights for policy 1, policy_version 20252 (0.0009) +[2023-10-09 04:57:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 41254912. Throughput: 0: 1695.2, 1: 1737.9. Samples: 10316486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:57:56,053][59242] Avg episode reward: [(0, '22.640'), (1, '26.010')] +[2023-10-09 04:57:58,917][60143] Updated weights for policy 0, policy_version 20042 (0.0008) +[2023-10-09 04:57:59,295][60143] Updated weights for policy 0, policy_version 20052 (0.0009) +[2023-10-09 04:57:59,662][60144] Updated weights for policy 1, policy_version 20262 (0.0009) +[2023-10-09 04:57:59,669][60143] Updated weights for policy 0, policy_version 20062 (0.0008) +[2023-10-09 04:58:00,041][60144] Updated weights for policy 1, policy_version 20272 (0.0010) +[2023-10-09 04:58:00,404][60144] Updated weights for policy 1, policy_version 20282 (0.0008) +[2023-10-09 04:58:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 41320448. Throughput: 0: 1698.5, 1: 1714.4. Samples: 10336362. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:01,053][59242] Avg episode reward: [(0, '23.280'), (1, '25.640')] +[2023-10-09 04:58:03,653][60143] Updated weights for policy 0, policy_version 20072 (0.0010) +[2023-10-09 04:58:04,028][60143] Updated weights for policy 0, policy_version 20082 (0.0008) +[2023-10-09 04:58:04,360][60144] Updated weights for policy 1, policy_version 20292 (0.0008) +[2023-10-09 04:58:04,391][60143] Updated weights for policy 0, policy_version 20092 (0.0008) +[2023-10-09 04:58:04,734][60144] Updated weights for policy 1, policy_version 20302 (0.0007) +[2023-10-09 04:58:05,102][60144] Updated weights for policy 1, policy_version 20312 (0.0008) +[2023-10-09 04:58:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 41385984. Throughput: 0: 1713.5, 1: 1736.0. Samples: 10347798. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:06,053][59242] Avg episode reward: [(0, '23.510'), (1, '26.870')] +[2023-10-09 04:58:08,262][60143] Updated weights for policy 0, policy_version 20102 (0.0008) +[2023-10-09 04:58:08,636][60143] Updated weights for policy 0, policy_version 20112 (0.0009) +[2023-10-09 04:58:09,001][60143] Updated weights for policy 0, policy_version 20122 (0.0009) +[2023-10-09 04:58:09,277][60144] Updated weights for policy 1, policy_version 20322 (0.0010) +[2023-10-09 04:58:09,639][60144] Updated weights for policy 1, policy_version 20332 (0.0009) +[2023-10-09 04:58:10,019][60144] Updated weights for policy 1, policy_version 20342 (0.0009) +[2023-10-09 04:58:10,383][60144] Updated weights for policy 1, policy_version 20352 (0.0010) +[2023-10-09 04:58:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 41451520. Throughput: 0: 1684.9, 1: 1726.8. Samples: 10367540. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 04:58:11,053][59242] Avg episode reward: [(0, '22.550'), (1, '27.610')] +[2023-10-09 04:58:11,055][60003] Saving new best policy, reward=27.610! +[2023-10-09 04:58:12,939][60143] Updated weights for policy 0, policy_version 20132 (0.0008) +[2023-10-09 04:58:13,303][60143] Updated weights for policy 0, policy_version 20142 (0.0010) +[2023-10-09 04:58:13,683][60143] Updated weights for policy 0, policy_version 20152 (0.0009) +[2023-10-09 04:58:14,300][60144] Updated weights for policy 1, policy_version 20362 (0.0009) +[2023-10-09 04:58:14,677][60144] Updated weights for policy 1, policy_version 20372 (0.0009) +[2023-10-09 04:58:15,046][60144] Updated weights for policy 1, policy_version 20382 (0.0009) +[2023-10-09 04:58:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 41517056. Throughput: 0: 1716.1, 1: 1702.1. Samples: 10388050. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 04:58:16,053][59242] Avg episode reward: [(0, '21.420'), (1, '27.480')] +[2023-10-09 04:58:17,703][60143] Updated weights for policy 0, policy_version 20162 (0.0009) +[2023-10-09 04:58:18,119][60143] Updated weights for policy 0, policy_version 20172 (0.0010) +[2023-10-09 04:58:18,485][60143] Updated weights for policy 0, policy_version 20182 (0.0007) +[2023-10-09 04:58:18,745][60144] Updated weights for policy 1, policy_version 20392 (0.0008) +[2023-10-09 04:58:18,854][60143] Updated weights for policy 0, policy_version 20192 (0.0007) +[2023-10-09 04:58:19,104][60144] Updated weights for policy 1, policy_version 20402 (0.0010) +[2023-10-09 04:58:19,478][60144] Updated weights for policy 1, policy_version 20412 (0.0007) +[2023-10-09 04:58:21,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 41582592. Throughput: 0: 1699.2, 1: 1732.4. Samples: 10398814. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 04:58:21,052][59242] Avg episode reward: [(0, '22.090'), (1, '26.840')] +[2023-10-09 04:58:22,747][60143] Updated weights for policy 0, policy_version 20202 (0.0007) +[2023-10-09 04:58:23,109][60143] Updated weights for policy 0, policy_version 20212 (0.0010) +[2023-10-09 04:58:23,386][60144] Updated weights for policy 1, policy_version 20422 (0.0008) +[2023-10-09 04:58:23,485][60143] Updated weights for policy 0, policy_version 20222 (0.0007) +[2023-10-09 04:58:23,754][60144] Updated weights for policy 1, policy_version 20432 (0.0010) +[2023-10-09 04:58:24,122][60144] Updated weights for policy 1, policy_version 20442 (0.0010) +[2023-10-09 04:58:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 41648128. Throughput: 0: 1694.7, 1: 1713.8. Samples: 10418448. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 04:58:26,053][59242] Avg episode reward: [(0, '23.190'), (1, '27.030')] +[2023-10-09 04:58:27,363][60143] Updated weights for policy 0, policy_version 20232 (0.0009) +[2023-10-09 04:58:27,734][60143] Updated weights for policy 0, policy_version 20242 (0.0008) +[2023-10-09 04:58:27,991][60144] Updated weights for policy 1, policy_version 20452 (0.0009) +[2023-10-09 04:58:28,112][60143] Updated weights for policy 0, policy_version 20252 (0.0008) +[2023-10-09 04:58:28,350][60144] Updated weights for policy 1, policy_version 20462 (0.0008) +[2023-10-09 04:58:28,723][60144] Updated weights for policy 1, policy_version 20472 (0.0009) +[2023-10-09 04:58:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 41713664. Throughput: 0: 1718.4, 1: 1728.7. Samples: 10440078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:31,053][59242] Avg episode reward: [(0, '22.620'), (1, '28.190')] +[2023-10-09 04:58:31,059][60003] Saving new best policy, reward=28.190! 
+[2023-10-09 04:58:32,087][60143] Updated weights for policy 0, policy_version 20262 (0.0009) +[2023-10-09 04:58:32,457][60143] Updated weights for policy 0, policy_version 20272 (0.0010) +[2023-10-09 04:58:32,677][60144] Updated weights for policy 1, policy_version 20482 (0.0007) +[2023-10-09 04:58:32,829][60143] Updated weights for policy 0, policy_version 20282 (0.0007) +[2023-10-09 04:58:33,047][60144] Updated weights for policy 1, policy_version 20492 (0.0007) +[2023-10-09 04:58:33,416][60144] Updated weights for policy 1, policy_version 20502 (0.0007) +[2023-10-09 04:58:33,790][60144] Updated weights for policy 1, policy_version 20512 (0.0011) +[2023-10-09 04:58:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 41779200. Throughput: 0: 1679.8, 1: 1725.2. Samples: 10449510. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:36,052][59242] Avg episode reward: [(0, '22.610'), (1, '27.270')] +[2023-10-09 04:58:36,893][60143] Updated weights for policy 0, policy_version 20292 (0.0009) +[2023-10-09 04:58:37,258][60143] Updated weights for policy 0, policy_version 20302 (0.0010) +[2023-10-09 04:58:37,643][60143] Updated weights for policy 0, policy_version 20312 (0.0007) +[2023-10-09 04:58:37,686][60144] Updated weights for policy 1, policy_version 20522 (0.0008) +[2023-10-09 04:58:38,062][60144] Updated weights for policy 1, policy_version 20532 (0.0008) +[2023-10-09 04:58:38,435][60144] Updated weights for policy 1, policy_version 20542 (0.0010) +[2023-10-09 04:58:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 41844736. Throughput: 0: 1705.2, 1: 1715.6. Samples: 10470422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:41,053][59242] Avg episode reward: [(0, '22.820'), (1, '26.730')] +[2023-10-09 04:58:41,778][60143] Updated weights for policy 0, policy_version 20322 (0.0007) +[2023-10-09 04:58:42,153][60143] Updated weights for policy 0, policy_version 20332 (0.0008) +[2023-10-09 04:58:42,512][60144] Updated weights for policy 1, policy_version 20552 (0.0009) +[2023-10-09 04:58:42,517][60143] Updated weights for policy 0, policy_version 20342 (0.0007) +[2023-10-09 04:58:42,882][60144] Updated weights for policy 1, policy_version 20562 (0.0008) +[2023-10-09 04:58:42,893][60143] Updated weights for policy 0, policy_version 20352 (0.0008) +[2023-10-09 04:58:43,254][60144] Updated weights for policy 1, policy_version 20572 (0.0009) +[2023-10-09 04:58:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 41910272. Throughput: 0: 1704.8, 1: 1737.2. Samples: 10491254. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:58:46,053][59242] Avg episode reward: [(0, '23.430'), (1, '26.690')] +[2023-10-09 04:58:46,949][60143] Updated weights for policy 0, policy_version 20362 (0.0008) +[2023-10-09 04:58:47,230][60144] Updated weights for policy 1, policy_version 20582 (0.0009) +[2023-10-09 04:58:47,320][60143] Updated weights for policy 0, policy_version 20372 (0.0010) +[2023-10-09 04:58:47,599][60144] Updated weights for policy 1, policy_version 20592 (0.0009) +[2023-10-09 04:58:47,699][60143] Updated weights for policy 0, policy_version 20382 (0.0008) +[2023-10-09 04:58:47,970][60144] Updated weights for policy 1, policy_version 20602 (0.0008) +[2023-10-09 04:58:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 41975808. Throughput: 0: 1678.3, 1: 1714.2. Samples: 10500458. 
Policy #0 lag: (min: 18.0, avg: 22.6, max: 50.0) +[2023-10-09 04:58:51,053][59242] Avg episode reward: [(0, '24.600'), (1, '26.550')] +[2023-10-09 04:58:51,718][60144] Updated weights for policy 1, policy_version 20612 (0.0008) +[2023-10-09 04:58:51,790][60143] Updated weights for policy 0, policy_version 20392 (0.0010) +[2023-10-09 04:58:52,090][60144] Updated weights for policy 1, policy_version 20622 (0.0007) +[2023-10-09 04:58:52,158][60143] Updated weights for policy 0, policy_version 20402 (0.0010) +[2023-10-09 04:58:52,463][60144] Updated weights for policy 1, policy_version 20632 (0.0007) +[2023-10-09 04:58:52,522][60143] Updated weights for policy 0, policy_version 20412 (0.0011) +[2023-10-09 04:58:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42041344. Throughput: 0: 1700.1, 1: 1725.9. Samples: 10521708. Policy #0 lag: (min: 18.0, avg: 22.6, max: 50.0) +[2023-10-09 04:58:56,052][59242] Avg episode reward: [(0, '24.760'), (1, '26.690')] +[2023-10-09 04:58:56,419][60144] Updated weights for policy 1, policy_version 20642 (0.0008) +[2023-10-09 04:58:56,727][60143] Updated weights for policy 0, policy_version 20422 (0.0009) +[2023-10-09 04:58:56,794][60144] Updated weights for policy 1, policy_version 20652 (0.0008) +[2023-10-09 04:58:57,102][60143] Updated weights for policy 0, policy_version 20432 (0.0008) +[2023-10-09 04:58:57,150][60144] Updated weights for policy 1, policy_version 20662 (0.0007) +[2023-10-09 04:58:57,470][60143] Updated weights for policy 0, policy_version 20442 (0.0008) +[2023-10-09 04:58:57,524][60144] Updated weights for policy 1, policy_version 20672 (0.0007) +[2023-10-09 04:59:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42106880. Throughput: 0: 1688.2, 1: 1746.1. Samples: 10542596. 
Policy #0 lag: (min: 18.0, avg: 22.6, max: 50.0) +[2023-10-09 04:59:01,053][59242] Avg episode reward: [(0, '24.470'), (1, '27.190')] +[2023-10-09 04:59:01,518][60144] Updated weights for policy 1, policy_version 20682 (0.0008) +[2023-10-09 04:59:01,519][60143] Updated weights for policy 0, policy_version 20452 (0.0010) +[2023-10-09 04:59:01,876][60144] Updated weights for policy 1, policy_version 20692 (0.0008) +[2023-10-09 04:59:01,877][60143] Updated weights for policy 0, policy_version 20462 (0.0009) +[2023-10-09 04:59:02,246][60144] Updated weights for policy 1, policy_version 20702 (0.0008) +[2023-10-09 04:59:02,250][60143] Updated weights for policy 0, policy_version 20472 (0.0008) +[2023-10-09 04:59:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42172416. Throughput: 0: 1683.2, 1: 1714.1. Samples: 10551692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:59:06,053][59242] Avg episode reward: [(0, '25.030'), (1, '26.330')] +[2023-10-09 04:59:06,249][60143] Updated weights for policy 0, policy_version 20482 (0.0008) +[2023-10-09 04:59:06,371][60144] Updated weights for policy 1, policy_version 20712 (0.0008) +[2023-10-09 04:59:06,627][60143] Updated weights for policy 0, policy_version 20492 (0.0008) +[2023-10-09 04:59:06,738][60144] Updated weights for policy 1, policy_version 20722 (0.0007) +[2023-10-09 04:59:07,001][60143] Updated weights for policy 0, policy_version 20502 (0.0009) +[2023-10-09 04:59:07,101][60144] Updated weights for policy 1, policy_version 20732 (0.0007) +[2023-10-09 04:59:07,372][60143] Updated weights for policy 0, policy_version 20512 (0.0009) +[2023-10-09 04:59:10,885][60144] Updated weights for policy 1, policy_version 20742 (0.0008) +[2023-10-09 04:59:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42237952. Throughput: 0: 1689.9, 1: 1738.4. Samples: 10572722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:59:11,053][59242] Avg episode reward: [(0, '25.300'), (1, '26.090')] +[2023-10-09 04:59:11,258][60144] Updated weights for policy 1, policy_version 20752 (0.0007) +[2023-10-09 04:59:11,347][60143] Updated weights for policy 0, policy_version 20522 (0.0007) +[2023-10-09 04:59:11,616][60144] Updated weights for policy 1, policy_version 20762 (0.0007) +[2023-10-09 04:59:11,710][60143] Updated weights for policy 0, policy_version 20532 (0.0008) +[2023-10-09 04:59:12,079][60143] Updated weights for policy 0, policy_version 20542 (0.0009) +[2023-10-09 04:59:15,644][60144] Updated weights for policy 1, policy_version 20772 (0.0010) +[2023-10-09 04:59:16,010][60144] Updated weights for policy 1, policy_version 20782 (0.0007) +[2023-10-09 04:59:16,048][60143] Updated weights for policy 0, policy_version 20552 (0.0008) +[2023-10-09 04:59:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42303488. Throughput: 0: 1687.0, 1: 1726.5. Samples: 10593688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:59:16,053][59242] Avg episode reward: [(0, '25.710'), (1, '27.670')] +[2023-10-09 04:59:16,372][60144] Updated weights for policy 1, policy_version 20792 (0.0008) +[2023-10-09 04:59:16,420][60143] Updated weights for policy 0, policy_version 20562 (0.0007) +[2023-10-09 04:59:16,784][60143] Updated weights for policy 0, policy_version 20572 (0.0007) +[2023-10-09 04:59:20,426][60144] Updated weights for policy 1, policy_version 20802 (0.0009) +[2023-10-09 04:59:20,788][60144] Updated weights for policy 1, policy_version 20812 (0.0009) +[2023-10-09 04:59:20,789][60143] Updated weights for policy 0, policy_version 20582 (0.0009) +[2023-10-09 04:59:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42369024. Throughput: 0: 1688.1, 1: 1721.2. Samples: 10602932. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 04:59:21,053][59242] Avg episode reward: [(0, '26.060'), (1, '25.270')] +[2023-10-09 04:59:21,160][60143] Updated weights for policy 0, policy_version 20592 (0.0008) +[2023-10-09 04:59:21,164][60144] Updated weights for policy 1, policy_version 20822 (0.0008) +[2023-10-09 04:59:21,533][60143] Updated weights for policy 0, policy_version 20602 (0.0009) +[2023-10-09 04:59:21,536][60144] Updated weights for policy 1, policy_version 20832 (0.0009) +[2023-10-09 04:59:25,484][60143] Updated weights for policy 0, policy_version 20612 (0.0010) +[2023-10-09 04:59:25,486][60144] Updated weights for policy 1, policy_version 20842 (0.0010) +[2023-10-09 04:59:25,841][60144] Updated weights for policy 1, policy_version 20852 (0.0008) +[2023-10-09 04:59:25,859][60143] Updated weights for policy 0, policy_version 20622 (0.0008) +[2023-10-09 04:59:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42434560. Throughput: 0: 1692.0, 1: 1727.0. Samples: 10624278. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 55.0) +[2023-10-09 04:59:26,053][59242] Avg episode reward: [(0, '25.610'), (1, '24.630')] +[2023-10-09 04:59:26,211][60144] Updated weights for policy 1, policy_version 20862 (0.0007) +[2023-10-09 04:59:26,223][60143] Updated weights for policy 0, policy_version 20632 (0.0007) +[2023-10-09 04:59:30,206][60144] Updated weights for policy 1, policy_version 20872 (0.0008) +[2023-10-09 04:59:30,290][60143] Updated weights for policy 0, policy_version 20642 (0.0010) +[2023-10-09 04:59:30,588][60144] Updated weights for policy 1, policy_version 20882 (0.0009) +[2023-10-09 04:59:30,660][60143] Updated weights for policy 0, policy_version 20652 (0.0009) +[2023-10-09 04:59:30,947][60144] Updated weights for policy 1, policy_version 20892 (0.0008) +[2023-10-09 04:59:31,029][60143] Updated weights for policy 0, policy_version 20662 (0.0008) +[2023-10-09 04:59:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 42500096. Throughput: 0: 1693.4, 1: 1713.2. Samples: 10644554. Policy #0 lag: (min: 31.0, avg: 32.4, max: 55.0) +[2023-10-09 04:59:31,053][59242] Avg episode reward: [(0, '26.170'), (1, '25.050')] +[2023-10-09 04:59:31,094][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000020896_21397504.pth... +[2023-10-09 04:59:31,123][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000019264_19726336.pth +[2023-10-09 04:59:31,395][60143] Updated weights for policy 0, policy_version 20672 (0.0008) +[2023-10-09 04:59:31,395][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000020672_21168128.pth... 
+[2023-10-09 04:59:31,435][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000019072_19529728.pth +[2023-10-09 04:59:35,050][60144] Updated weights for policy 1, policy_version 20902 (0.0009) +[2023-10-09 04:59:35,286][60143] Updated weights for policy 0, policy_version 20682 (0.0008) +[2023-10-09 04:59:35,409][60144] Updated weights for policy 1, policy_version 20912 (0.0008) +[2023-10-09 04:59:35,651][60143] Updated weights for policy 0, policy_version 20692 (0.0010) +[2023-10-09 04:59:35,773][60144] Updated weights for policy 1, policy_version 20922 (0.0009) +[2023-10-09 04:59:36,023][60143] Updated weights for policy 0, policy_version 20702 (0.0008) +[2023-10-09 04:59:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 42598400. Throughput: 0: 1701.3, 1: 1727.2. Samples: 10654740. Policy #0 lag: (min: 31.0, avg: 32.4, max: 55.0) +[2023-10-09 04:59:36,053][59242] Avg episode reward: [(0, '27.680'), (1, '25.300')] +[2023-10-09 04:59:36,088][59934] Saving new best policy, reward=27.680! +[2023-10-09 04:59:39,836][60144] Updated weights for policy 1, policy_version 20932 (0.0008) +[2023-10-09 04:59:39,936][60143] Updated weights for policy 0, policy_version 20712 (0.0007) +[2023-10-09 04:59:40,194][60144] Updated weights for policy 1, policy_version 20942 (0.0007) +[2023-10-09 04:59:40,302][60143] Updated weights for policy 0, policy_version 20722 (0.0008) +[2023-10-09 04:59:40,562][60144] Updated weights for policy 1, policy_version 20952 (0.0008) +[2023-10-09 04:59:40,673][60143] Updated weights for policy 0, policy_version 20732 (0.0009) +[2023-10-09 04:59:41,052][59242] Fps is (10 sec: 19661.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 42696704. Throughput: 0: 1706.3, 1: 1717.4. Samples: 10675776. 
Policy #0 lag: (min: 0.0, avg: 21.7, max: 32.0) +[2023-10-09 04:59:41,053][59242] Avg episode reward: [(0, '28.790'), (1, '24.310')] +[2023-10-09 04:59:41,054][59934] Saving new best policy, reward=28.790! +[2023-10-09 04:59:44,405][60144] Updated weights for policy 1, policy_version 20962 (0.0009) +[2023-10-09 04:59:44,773][60144] Updated weights for policy 1, policy_version 20972 (0.0008) +[2023-10-09 04:59:44,793][60143] Updated weights for policy 0, policy_version 20742 (0.0008) +[2023-10-09 04:59:45,149][60144] Updated weights for policy 1, policy_version 20982 (0.0008) +[2023-10-09 04:59:45,155][60143] Updated weights for policy 0, policy_version 20752 (0.0009) +[2023-10-09 04:59:45,512][60144] Updated weights for policy 1, policy_version 20992 (0.0008) +[2023-10-09 04:59:45,527][60143] Updated weights for policy 0, policy_version 20762 (0.0012) +[2023-10-09 04:59:46,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 42762240. Throughput: 0: 1688.8, 1: 1690.0. Samples: 10694642. Policy #0 lag: (min: 0.0, avg: 21.7, max: 32.0) +[2023-10-09 04:59:46,052][59242] Avg episode reward: [(0, '27.090'), (1, '24.940')] +[2023-10-09 04:59:49,577][60144] Updated weights for policy 1, policy_version 21002 (0.0008) +[2023-10-09 04:59:49,596][60143] Updated weights for policy 0, policy_version 20772 (0.0011) +[2023-10-09 04:59:49,939][60144] Updated weights for policy 1, policy_version 21012 (0.0008) +[2023-10-09 04:59:49,974][60143] Updated weights for policy 0, policy_version 20782 (0.0008) +[2023-10-09 04:59:50,306][60144] Updated weights for policy 1, policy_version 21022 (0.0009) +[2023-10-09 04:59:50,334][60143] Updated weights for policy 0, policy_version 20792 (0.0008) +[2023-10-09 04:59:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 42827776. Throughput: 0: 1709.1, 1: 1721.0. Samples: 10706044. 
Policy #0 lag: (min: 0.0, avg: 21.7, max: 32.0) +[2023-10-09 04:59:51,053][59242] Avg episode reward: [(0, '26.780'), (1, '24.480')] +[2023-10-09 04:59:54,092][60144] Updated weights for policy 1, policy_version 21032 (0.0008) +[2023-10-09 04:59:54,334][60143] Updated weights for policy 0, policy_version 20802 (0.0009) +[2023-10-09 04:59:54,465][60144] Updated weights for policy 1, policy_version 21042 (0.0009) +[2023-10-09 04:59:54,739][60143] Updated weights for policy 0, policy_version 20812 (0.0009) +[2023-10-09 04:59:54,828][60144] Updated weights for policy 1, policy_version 21052 (0.0009) +[2023-10-09 04:59:55,122][60143] Updated weights for policy 0, policy_version 20822 (0.0008) +[2023-10-09 04:59:55,488][60143] Updated weights for policy 0, policy_version 20832 (0.0008) +[2023-10-09 04:59:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 42893312. Throughput: 0: 1705.3, 1: 1705.2. Samples: 10726194. Policy #0 lag: (min: 20.0, avg: 26.8, max: 52.0) +[2023-10-09 04:59:56,053][59242] Avg episode reward: [(0, '27.980'), (1, '25.300')] +[2023-10-09 04:59:58,821][60144] Updated weights for policy 1, policy_version 21062 (0.0009) +[2023-10-09 04:59:59,189][60144] Updated weights for policy 1, policy_version 21072 (0.0009) +[2023-10-09 04:59:59,488][60143] Updated weights for policy 0, policy_version 20842 (0.0009) +[2023-10-09 04:59:59,552][60144] Updated weights for policy 1, policy_version 21082 (0.0007) +[2023-10-09 04:59:59,858][60143] Updated weights for policy 0, policy_version 20852 (0.0008) +[2023-10-09 05:00:00,232][60143] Updated weights for policy 0, policy_version 20862 (0.0012) +[2023-10-09 05:00:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 42958848. Throughput: 0: 1681.6, 1: 1699.3. Samples: 10745828. 
Policy #0 lag: (min: 20.0, avg: 26.8, max: 52.0) +[2023-10-09 05:00:01,052][59242] Avg episode reward: [(0, '27.040'), (1, '25.310')] +[2023-10-09 05:00:03,684][60144] Updated weights for policy 1, policy_version 21092 (0.0009) +[2023-10-09 05:00:04,058][60144] Updated weights for policy 1, policy_version 21102 (0.0011) +[2023-10-09 05:00:04,411][60143] Updated weights for policy 0, policy_version 20872 (0.0008) +[2023-10-09 05:00:04,422][60144] Updated weights for policy 1, policy_version 21112 (0.0008) +[2023-10-09 05:00:04,783][60143] Updated weights for policy 0, policy_version 20882 (0.0009) +[2023-10-09 05:00:05,157][60143] Updated weights for policy 0, policy_version 20892 (0.0009) +[2023-10-09 05:00:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 43024384. Throughput: 0: 1713.0, 1: 1722.1. Samples: 10757512. Policy #0 lag: (min: 20.0, avg: 26.8, max: 52.0) +[2023-10-09 05:00:06,053][59242] Avg episode reward: [(0, '25.680'), (1, '25.530')] +[2023-10-09 05:00:08,437][60144] Updated weights for policy 1, policy_version 21122 (0.0007) +[2023-10-09 05:00:08,803][60144] Updated weights for policy 1, policy_version 21132 (0.0008) +[2023-10-09 05:00:09,169][60144] Updated weights for policy 1, policy_version 21142 (0.0008) +[2023-10-09 05:00:09,309][60143] Updated weights for policy 0, policy_version 20902 (0.0009) +[2023-10-09 05:00:09,542][60144] Updated weights for policy 1, policy_version 21152 (0.0008) +[2023-10-09 05:00:09,680][60143] Updated weights for policy 0, policy_version 20912 (0.0008) +[2023-10-09 05:00:10,058][60143] Updated weights for policy 0, policy_version 20922 (0.0008) +[2023-10-09 05:00:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 43089920. Throughput: 0: 1689.9, 1: 1691.5. Samples: 10776440. 
Policy #0 lag: (min: 20.0, avg: 26.8, max: 52.0) +[2023-10-09 05:00:11,053][59242] Avg episode reward: [(0, '25.790'), (1, '25.650')] +[2023-10-09 05:00:13,582][60144] Updated weights for policy 1, policy_version 21162 (0.0010) +[2023-10-09 05:00:13,951][60144] Updated weights for policy 1, policy_version 21172 (0.0007) +[2023-10-09 05:00:14,053][60143] Updated weights for policy 0, policy_version 20932 (0.0009) +[2023-10-09 05:00:14,321][60144] Updated weights for policy 1, policy_version 21182 (0.0009) +[2023-10-09 05:00:14,423][60143] Updated weights for policy 0, policy_version 20942 (0.0007) +[2023-10-09 05:00:14,784][60143] Updated weights for policy 0, policy_version 20952 (0.0008) +[2023-10-09 05:00:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 43155456. Throughput: 0: 1675.4, 1: 1709.4. Samples: 10796870. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:00:16,053][59242] Avg episode reward: [(0, '26.260'), (1, '26.300')] +[2023-10-09 05:00:18,301][60144] Updated weights for policy 1, policy_version 21192 (0.0007) +[2023-10-09 05:00:18,682][60144] Updated weights for policy 1, policy_version 21202 (0.0007) +[2023-10-09 05:00:18,759][60143] Updated weights for policy 0, policy_version 20962 (0.0008) +[2023-10-09 05:00:19,051][60144] Updated weights for policy 1, policy_version 21212 (0.0010) +[2023-10-09 05:00:19,130][60143] Updated weights for policy 0, policy_version 20972 (0.0008) +[2023-10-09 05:00:19,496][60143] Updated weights for policy 0, policy_version 20982 (0.0007) +[2023-10-09 05:00:19,866][60143] Updated weights for policy 0, policy_version 20992 (0.0007) +[2023-10-09 05:00:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 43220992. Throughput: 0: 1698.4, 1: 1706.8. Samples: 10807976. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:00:21,053][59242] Avg episode reward: [(0, '26.660'), (1, '25.970')] +[2023-10-09 05:00:22,775][60144] Updated weights for policy 1, policy_version 21222 (0.0008) +[2023-10-09 05:00:23,150][60144] Updated weights for policy 1, policy_version 21232 (0.0011) +[2023-10-09 05:00:23,516][60144] Updated weights for policy 1, policy_version 21242 (0.0009) +[2023-10-09 05:00:23,782][60143] Updated weights for policy 0, policy_version 21002 (0.0008) +[2023-10-09 05:00:24,161][60143] Updated weights for policy 0, policy_version 21012 (0.0011) +[2023-10-09 05:00:24,533][60143] Updated weights for policy 0, policy_version 21022 (0.0009) +[2023-10-09 05:00:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 43286528. Throughput: 0: 1674.0, 1: 1700.3. Samples: 10827618. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:00:26,052][59242] Avg episode reward: [(0, '25.710'), (1, '25.400')] +[2023-10-09 05:00:27,557][60144] Updated weights for policy 1, policy_version 21252 (0.0009) +[2023-10-09 05:00:27,930][60144] Updated weights for policy 1, policy_version 21262 (0.0007) +[2023-10-09 05:00:28,298][60144] Updated weights for policy 1, policy_version 21272 (0.0010) +[2023-10-09 05:00:28,514][60143] Updated weights for policy 0, policy_version 21032 (0.0008) +[2023-10-09 05:00:28,883][60143] Updated weights for policy 0, policy_version 21042 (0.0008) +[2023-10-09 05:00:29,246][60143] Updated weights for policy 0, policy_version 21052 (0.0008) +[2023-10-09 05:00:31,053][59242] Fps is (10 sec: 13106.6, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 43352064. Throughput: 0: 1690.4, 1: 1723.0. Samples: 10848244. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:00:31,054][59242] Avg episode reward: [(0, '26.300'), (1, '25.030')] +[2023-10-09 05:00:32,420][60144] Updated weights for policy 1, policy_version 21282 (0.0009) +[2023-10-09 05:00:32,783][60144] Updated weights for policy 1, policy_version 21292 (0.0011) +[2023-10-09 05:00:33,162][60144] Updated weights for policy 1, policy_version 21302 (0.0007) +[2023-10-09 05:00:33,189][60143] Updated weights for policy 0, policy_version 21062 (0.0007) +[2023-10-09 05:00:33,527][60144] Updated weights for policy 1, policy_version 21312 (0.0008) +[2023-10-09 05:00:33,554][60143] Updated weights for policy 0, policy_version 21072 (0.0009) +[2023-10-09 05:00:33,922][60143] Updated weights for policy 0, policy_version 21082 (0.0007) +[2023-10-09 05:00:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 43417600. Throughput: 0: 1691.7, 1: 1692.4. Samples: 10858332. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:00:36,053][59242] Avg episode reward: [(0, '26.700'), (1, '25.580')] +[2023-10-09 05:00:37,425][60144] Updated weights for policy 1, policy_version 21322 (0.0009) +[2023-10-09 05:00:37,784][60144] Updated weights for policy 1, policy_version 21332 (0.0008) +[2023-10-09 05:00:37,897][60143] Updated weights for policy 0, policy_version 21092 (0.0007) +[2023-10-09 05:00:38,162][60144] Updated weights for policy 1, policy_version 21342 (0.0009) +[2023-10-09 05:00:38,260][60143] Updated weights for policy 0, policy_version 21102 (0.0008) +[2023-10-09 05:00:38,635][60143] Updated weights for policy 0, policy_version 21112 (0.0010) +[2023-10-09 05:00:41,052][59242] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 43483136. Throughput: 0: 1682.2, 1: 1707.2. Samples: 10878714. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:00:41,053][59242] Avg episode reward: [(0, '27.440'), (1, '26.350')] +[2023-10-09 05:00:41,956][60144] Updated weights for policy 1, policy_version 21352 (0.0007) +[2023-10-09 05:00:42,321][60144] Updated weights for policy 1, policy_version 21362 (0.0007) +[2023-10-09 05:00:42,688][60144] Updated weights for policy 1, policy_version 21372 (0.0008) +[2023-10-09 05:00:42,848][60143] Updated weights for policy 0, policy_version 21122 (0.0010) +[2023-10-09 05:00:43,249][60143] Updated weights for policy 0, policy_version 21132 (0.0009) +[2023-10-09 05:00:43,621][60143] Updated weights for policy 0, policy_version 21142 (0.0008) +[2023-10-09 05:00:43,992][60143] Updated weights for policy 0, policy_version 21152 (0.0008) +[2023-10-09 05:00:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 43548672. Throughput: 0: 1699.2, 1: 1721.2. Samples: 10899750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:00:46,053][59242] Avg episode reward: [(0, '27.000'), (1, '26.300')] +[2023-10-09 05:00:46,532][60144] Updated weights for policy 1, policy_version 21382 (0.0007) +[2023-10-09 05:00:46,902][60144] Updated weights for policy 1, policy_version 21392 (0.0007) +[2023-10-09 05:00:47,272][60144] Updated weights for policy 1, policy_version 21402 (0.0008) +[2023-10-09 05:00:47,845][60143] Updated weights for policy 0, policy_version 21162 (0.0009) +[2023-10-09 05:00:48,214][60143] Updated weights for policy 0, policy_version 21172 (0.0008) +[2023-10-09 05:00:48,581][60143] Updated weights for policy 0, policy_version 21182 (0.0009) +[2023-10-09 05:00:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 43614208. Throughput: 0: 1676.4, 1: 1700.4. Samples: 10909468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:00:51,052][59242] Avg episode reward: [(0, '26.380'), (1, '26.030')] +[2023-10-09 05:00:51,105][60144] Updated weights for policy 1, policy_version 21412 (0.0009) +[2023-10-09 05:00:51,472][60144] Updated weights for policy 1, policy_version 21422 (0.0007) +[2023-10-09 05:00:51,838][60144] Updated weights for policy 1, policy_version 21432 (0.0008) +[2023-10-09 05:00:52,710][60143] Updated weights for policy 0, policy_version 21192 (0.0009) +[2023-10-09 05:00:53,098][60143] Updated weights for policy 0, policy_version 21202 (0.0007) +[2023-10-09 05:00:53,470][60143] Updated weights for policy 0, policy_version 21212 (0.0012) +[2023-10-09 05:00:55,878][60144] Updated weights for policy 1, policy_version 21442 (0.0009) +[2023-10-09 05:00:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 43679744. Throughput: 0: 1687.2, 1: 1735.0. Samples: 10930436. Policy #0 lag: (min: 11.0, avg: 19.6, max: 43.0) +[2023-10-09 05:00:56,052][59242] Avg episode reward: [(0, '25.240'), (1, '27.060')] +[2023-10-09 05:00:56,250][60144] Updated weights for policy 1, policy_version 21452 (0.0010) +[2023-10-09 05:00:56,624][60144] Updated weights for policy 1, policy_version 21462 (0.0009) +[2023-10-09 05:00:56,993][60144] Updated weights for policy 1, policy_version 21472 (0.0010) +[2023-10-09 05:00:57,379][60143] Updated weights for policy 0, policy_version 21222 (0.0009) +[2023-10-09 05:00:57,752][60143] Updated weights for policy 0, policy_version 21232 (0.0008) +[2023-10-09 05:00:58,119][60143] Updated weights for policy 0, policy_version 21242 (0.0009) +[2023-10-09 05:01:00,901][60144] Updated weights for policy 1, policy_version 21482 (0.0009) +[2023-10-09 05:01:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 43745280. Throughput: 0: 1709.5, 1: 1729.2. Samples: 10951612. 
Policy #0 lag: (min: 11.0, avg: 19.6, max: 43.0) +[2023-10-09 05:01:01,053][59242] Avg episode reward: [(0, '25.320'), (1, '26.930')] +[2023-10-09 05:01:01,281][60144] Updated weights for policy 1, policy_version 21492 (0.0007) +[2023-10-09 05:01:01,642][60144] Updated weights for policy 1, policy_version 21502 (0.0007) +[2023-10-09 05:01:02,101][60143] Updated weights for policy 0, policy_version 21252 (0.0009) +[2023-10-09 05:01:02,479][60143] Updated weights for policy 0, policy_version 21262 (0.0007) +[2023-10-09 05:01:02,838][60143] Updated weights for policy 0, policy_version 21272 (0.0010) +[2023-10-09 05:01:05,672][60144] Updated weights for policy 1, policy_version 21512 (0.0008) +[2023-10-09 05:01:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 43810816. Throughput: 0: 1680.3, 1: 1721.5. Samples: 10961058. Policy #0 lag: (min: 11.0, avg: 19.6, max: 43.0) +[2023-10-09 05:01:06,052][59242] Avg episode reward: [(0, '24.850'), (1, '25.640')] +[2023-10-09 05:01:06,055][60144] Updated weights for policy 1, policy_version 21522 (0.0007) +[2023-10-09 05:01:06,416][60144] Updated weights for policy 1, policy_version 21532 (0.0008) +[2023-10-09 05:01:06,865][60143] Updated weights for policy 0, policy_version 21282 (0.0012) +[2023-10-09 05:01:07,229][60143] Updated weights for policy 0, policy_version 21292 (0.0008) +[2023-10-09 05:01:07,599][60143] Updated weights for policy 0, policy_version 21302 (0.0008) +[2023-10-09 05:01:07,971][60143] Updated weights for policy 0, policy_version 21312 (0.0008) +[2023-10-09 05:01:10,364][60144] Updated weights for policy 1, policy_version 21542 (0.0008) +[2023-10-09 05:01:10,724][60144] Updated weights for policy 1, policy_version 21552 (0.0008) +[2023-10-09 05:01:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 43876352. Throughput: 0: 1707.2, 1: 1732.0. Samples: 10982382. 
Policy #0 lag: (min: 11.0, avg: 19.6, max: 43.0) +[2023-10-09 05:01:11,053][59242] Avg episode reward: [(0, '24.750'), (1, '26.700')] +[2023-10-09 05:01:11,092][60144] Updated weights for policy 1, policy_version 21562 (0.0008) +[2023-10-09 05:01:12,012][60143] Updated weights for policy 0, policy_version 21322 (0.0007) +[2023-10-09 05:01:12,385][60143] Updated weights for policy 0, policy_version 21332 (0.0007) +[2023-10-09 05:01:12,761][60143] Updated weights for policy 0, policy_version 21342 (0.0010) +[2023-10-09 05:01:15,107][60144] Updated weights for policy 1, policy_version 21572 (0.0008) +[2023-10-09 05:01:15,466][60144] Updated weights for policy 1, policy_version 21582 (0.0009) +[2023-10-09 05:01:15,839][60144] Updated weights for policy 1, policy_version 21592 (0.0008) +[2023-10-09 05:01:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 43941888. Throughput: 0: 1714.9, 1: 1721.0. Samples: 11002858. Policy #0 lag: (min: 27.0, avg: 33.7, max: 59.0) +[2023-10-09 05:01:16,053][59242] Avg episode reward: [(0, '24.550'), (1, '26.670')] +[2023-10-09 05:01:16,708][60143] Updated weights for policy 0, policy_version 21352 (0.0011) +[2023-10-09 05:01:17,081][60143] Updated weights for policy 0, policy_version 21362 (0.0010) +[2023-10-09 05:01:17,449][60143] Updated weights for policy 0, policy_version 21372 (0.0009) +[2023-10-09 05:01:19,864][60144] Updated weights for policy 1, policy_version 21602 (0.0008) +[2023-10-09 05:01:20,233][60144] Updated weights for policy 1, policy_version 21612 (0.0007) +[2023-10-09 05:01:20,597][60144] Updated weights for policy 1, policy_version 21622 (0.0007) +[2023-10-09 05:01:20,965][60144] Updated weights for policy 1, policy_version 21632 (0.0007) +[2023-10-09 05:01:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44040192. Throughput: 0: 1692.2, 1: 1738.8. Samples: 11012728. 
Policy #0 lag: (min: 27.0, avg: 33.7, max: 59.0) +[2023-10-09 05:01:21,053][59242] Avg episode reward: [(0, '25.500'), (1, '26.400')] +[2023-10-09 05:01:21,419][60143] Updated weights for policy 0, policy_version 21382 (0.0008) +[2023-10-09 05:01:21,793][60143] Updated weights for policy 0, policy_version 21392 (0.0008) +[2023-10-09 05:01:22,164][60143] Updated weights for policy 0, policy_version 21402 (0.0007) +[2023-10-09 05:01:24,857][60144] Updated weights for policy 1, policy_version 21642 (0.0009) +[2023-10-09 05:01:25,223][60144] Updated weights for policy 1, policy_version 21652 (0.0009) +[2023-10-09 05:01:25,589][60144] Updated weights for policy 1, policy_version 21662 (0.0009) +[2023-10-09 05:01:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44105728. Throughput: 0: 1707.0, 1: 1740.9. Samples: 11033868. Policy #0 lag: (min: 27.0, avg: 33.7, max: 59.0) +[2023-10-09 05:01:26,053][59242] Avg episode reward: [(0, '25.490'), (1, '26.240')] +[2023-10-09 05:01:26,064][60143] Updated weights for policy 0, policy_version 21412 (0.0008) +[2023-10-09 05:01:26,439][60143] Updated weights for policy 0, policy_version 21422 (0.0010) +[2023-10-09 05:01:26,809][60143] Updated weights for policy 0, policy_version 21432 (0.0009) +[2023-10-09 05:01:29,423][60144] Updated weights for policy 1, policy_version 21672 (0.0007) +[2023-10-09 05:01:29,787][60144] Updated weights for policy 1, policy_version 21682 (0.0007) +[2023-10-09 05:01:30,161][60144] Updated weights for policy 1, policy_version 21692 (0.0011) +[2023-10-09 05:01:30,747][60143] Updated weights for policy 0, policy_version 21442 (0.0007) +[2023-10-09 05:01:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 44171264. Throughput: 0: 1716.8, 1: 1715.6. Samples: 11054210. 
Policy #0 lag: (min: 27.0, avg: 33.7, max: 59.0) +[2023-10-09 05:01:31,053][59242] Avg episode reward: [(0, '25.390'), (1, '24.870')] +[2023-10-09 05:01:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000021696_22216704.pth... +[2023-10-09 05:01:31,089][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000020096_20578304.pth +[2023-10-09 05:01:31,153][60143] Updated weights for policy 0, policy_version 21452 (0.0009) +[2023-10-09 05:01:31,532][60143] Updated weights for policy 0, policy_version 21462 (0.0008) +[2023-10-09 05:01:31,896][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000021472_21987328.pth... +[2023-10-09 05:01:31,900][60143] Updated weights for policy 0, policy_version 21472 (0.0009) +[2023-10-09 05:01:31,934][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000019872_20348928.pth +[2023-10-09 05:01:33,921][60144] Updated weights for policy 1, policy_version 21702 (0.0009) +[2023-10-09 05:01:34,292][60144] Updated weights for policy 1, policy_version 21712 (0.0009) +[2023-10-09 05:01:34,668][60144] Updated weights for policy 1, policy_version 21722 (0.0011) +[2023-10-09 05:01:35,887][60143] Updated weights for policy 0, policy_version 21482 (0.0010) +[2023-10-09 05:01:36,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44236800. Throughput: 0: 1709.0, 1: 1747.0. Samples: 11064986. 
Policy #0 lag: (min: 17.0, avg: 24.2, max: 49.0) +[2023-10-09 05:01:36,053][59242] Avg episode reward: [(0, '24.310'), (1, '24.870')] +[2023-10-09 05:01:36,256][60143] Updated weights for policy 0, policy_version 21492 (0.0010) +[2023-10-09 05:01:36,632][60143] Updated weights for policy 0, policy_version 21502 (0.0010) +[2023-10-09 05:01:38,637][60144] Updated weights for policy 1, policy_version 21732 (0.0008) +[2023-10-09 05:01:39,002][60144] Updated weights for policy 1, policy_version 21742 (0.0008) +[2023-10-09 05:01:39,377][60144] Updated weights for policy 1, policy_version 21752 (0.0009) +[2023-10-09 05:01:40,782][60143] Updated weights for policy 0, policy_version 21512 (0.0007) +[2023-10-09 05:01:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44302336. Throughput: 0: 1716.0, 1: 1719.0. Samples: 11085012. Policy #0 lag: (min: 17.0, avg: 24.2, max: 49.0) +[2023-10-09 05:01:41,053][59242] Avg episode reward: [(0, '23.290'), (1, '24.920')] +[2023-10-09 05:01:41,151][60143] Updated weights for policy 0, policy_version 21522 (0.0008) +[2023-10-09 05:01:41,516][60143] Updated weights for policy 0, policy_version 21532 (0.0010) +[2023-10-09 05:01:43,357][60144] Updated weights for policy 1, policy_version 21762 (0.0007) +[2023-10-09 05:01:43,723][60144] Updated weights for policy 1, policy_version 21772 (0.0007) +[2023-10-09 05:01:44,086][60144] Updated weights for policy 1, policy_version 21782 (0.0007) +[2023-10-09 05:01:44,454][60144] Updated weights for policy 1, policy_version 21792 (0.0008) +[2023-10-09 05:01:45,438][60143] Updated weights for policy 0, policy_version 21542 (0.0009) +[2023-10-09 05:01:45,804][60143] Updated weights for policy 0, policy_version 21552 (0.0008) +[2023-10-09 05:01:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 44367872. Throughput: 0: 1708.2, 1: 1720.0. Samples: 11105880. 
Policy #0 lag: (min: 17.0, avg: 24.2, max: 49.0) +[2023-10-09 05:01:46,053][59242] Avg episode reward: [(0, '23.430'), (1, '24.690')] +[2023-10-09 05:01:46,181][60143] Updated weights for policy 0, policy_version 21562 (0.0008) +[2023-10-09 05:01:48,382][60144] Updated weights for policy 1, policy_version 21802 (0.0011) +[2023-10-09 05:01:48,747][60144] Updated weights for policy 1, policy_version 21812 (0.0009) +[2023-10-09 05:01:49,109][60144] Updated weights for policy 1, policy_version 21822 (0.0011) +[2023-10-09 05:01:50,110][60143] Updated weights for policy 0, policy_version 21572 (0.0008) +[2023-10-09 05:01:50,477][60143] Updated weights for policy 0, policy_version 21582 (0.0011) +[2023-10-09 05:01:50,852][60143] Updated weights for policy 0, policy_version 21592 (0.0010) +[2023-10-09 05:01:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44433408. Throughput: 0: 1718.4, 1: 1731.0. Samples: 11116278. Policy #0 lag: (min: 17.0, avg: 24.2, max: 49.0) +[2023-10-09 05:01:51,052][59242] Avg episode reward: [(0, '23.130'), (1, '24.330')] +[2023-10-09 05:01:53,175][60144] Updated weights for policy 1, policy_version 21832 (0.0009) +[2023-10-09 05:01:53,549][60144] Updated weights for policy 1, policy_version 21842 (0.0010) +[2023-10-09 05:01:53,911][60144] Updated weights for policy 1, policy_version 21852 (0.0009) +[2023-10-09 05:01:54,922][60143] Updated weights for policy 0, policy_version 21602 (0.0009) +[2023-10-09 05:01:55,299][60143] Updated weights for policy 0, policy_version 21612 (0.0008) +[2023-10-09 05:01:55,668][60143] Updated weights for policy 0, policy_version 21622 (0.0008) +[2023-10-09 05:01:56,042][60143] Updated weights for policy 0, policy_version 21632 (0.0009) +[2023-10-09 05:01:56,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 44531712. Throughput: 0: 1716.4, 1: 1712.0. Samples: 11136660. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:01:56,053][59242] Avg episode reward: [(0, '23.820'), (1, '24.940')] +[2023-10-09 05:01:57,710][60144] Updated weights for policy 1, policy_version 21862 (0.0008) +[2023-10-09 05:01:58,075][60144] Updated weights for policy 1, policy_version 21872 (0.0008) +[2023-10-09 05:01:58,447][60144] Updated weights for policy 1, policy_version 21882 (0.0010) +[2023-10-09 05:02:00,053][60143] Updated weights for policy 0, policy_version 21642 (0.0009) +[2023-10-09 05:02:00,415][60143] Updated weights for policy 0, policy_version 21652 (0.0010) +[2023-10-09 05:02:00,780][60143] Updated weights for policy 0, policy_version 21662 (0.0008) +[2023-10-09 05:02:01,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 44597248. Throughput: 0: 1695.0, 1: 1735.0. Samples: 11157206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:02:01,053][59242] Avg episode reward: [(0, '23.210'), (1, '24.770')] +[2023-10-09 05:02:02,561][60144] Updated weights for policy 1, policy_version 21892 (0.0009) +[2023-10-09 05:02:02,930][60144] Updated weights for policy 1, policy_version 21902 (0.0007) +[2023-10-09 05:02:03,298][60144] Updated weights for policy 1, policy_version 21912 (0.0010) +[2023-10-09 05:02:04,720][60143] Updated weights for policy 0, policy_version 21672 (0.0010) +[2023-10-09 05:02:05,086][60143] Updated weights for policy 0, policy_version 21682 (0.0011) +[2023-10-09 05:02:05,461][60143] Updated weights for policy 0, policy_version 21692 (0.0008) +[2023-10-09 05:02:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 44662784. Throughput: 0: 1724.0, 1: 1717.5. Samples: 11167598. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:02:06,053][59242] Avg episode reward: [(0, '21.840'), (1, '23.860')] +[2023-10-09 05:02:07,196][60144] Updated weights for policy 1, policy_version 21922 (0.0009) +[2023-10-09 05:02:07,552][60144] Updated weights for policy 1, policy_version 21932 (0.0007) +[2023-10-09 05:02:07,919][60144] Updated weights for policy 1, policy_version 21942 (0.0008) +[2023-10-09 05:02:08,284][60144] Updated weights for policy 1, policy_version 21952 (0.0008) +[2023-10-09 05:02:09,417][60143] Updated weights for policy 0, policy_version 21702 (0.0008) +[2023-10-09 05:02:09,774][60143] Updated weights for policy 0, policy_version 21712 (0.0009) +[2023-10-09 05:02:10,152][60143] Updated weights for policy 0, policy_version 21722 (0.0007) +[2023-10-09 05:02:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 44728320. Throughput: 0: 1715.2, 1: 1718.6. Samples: 11188388. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-09 05:02:11,053][59242] Avg episode reward: [(0, '22.770'), (1, '24.230')] +[2023-10-09 05:02:12,211][60144] Updated weights for policy 1, policy_version 21962 (0.0008) +[2023-10-09 05:02:12,594][60144] Updated weights for policy 1, policy_version 21972 (0.0009) +[2023-10-09 05:02:12,967][60144] Updated weights for policy 1, policy_version 21982 (0.0007) +[2023-10-09 05:02:14,218][60143] Updated weights for policy 0, policy_version 21732 (0.0009) +[2023-10-09 05:02:14,590][60143] Updated weights for policy 0, policy_version 21742 (0.0009) +[2023-10-09 05:02:14,960][60143] Updated weights for policy 0, policy_version 21752 (0.0009) +[2023-10-09 05:02:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 44793856. Throughput: 0: 1690.4, 1: 1745.1. Samples: 11208810. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-09 05:02:16,053][59242] Avg episode reward: [(0, '23.860'), (1, '24.620')] +[2023-10-09 05:02:16,821][60144] Updated weights for policy 1, policy_version 21992 (0.0008) +[2023-10-09 05:02:17,189][60144] Updated weights for policy 1, policy_version 22002 (0.0007) +[2023-10-09 05:02:17,547][60144] Updated weights for policy 1, policy_version 22012 (0.0009) +[2023-10-09 05:02:19,016][60143] Updated weights for policy 0, policy_version 21762 (0.0010) +[2023-10-09 05:02:19,414][60143] Updated weights for policy 0, policy_version 21772 (0.0007) +[2023-10-09 05:02:19,778][60143] Updated weights for policy 0, policy_version 21782 (0.0007) +[2023-10-09 05:02:20,153][60143] Updated weights for policy 0, policy_version 21792 (0.0009) +[2023-10-09 05:02:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 44859392. Throughput: 0: 1723.7, 1: 1708.2. Samples: 11219424. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-09 05:02:21,052][59242] Avg episode reward: [(0, '24.790'), (1, '23.530')] +[2023-10-09 05:02:21,447][60144] Updated weights for policy 1, policy_version 22022 (0.0008) +[2023-10-09 05:02:21,820][60144] Updated weights for policy 1, policy_version 22032 (0.0008) +[2023-10-09 05:02:22,183][60144] Updated weights for policy 1, policy_version 22042 (0.0008) +[2023-10-09 05:02:24,100][60143] Updated weights for policy 0, policy_version 21802 (0.0007) +[2023-10-09 05:02:24,469][60143] Updated weights for policy 0, policy_version 21812 (0.0007) +[2023-10-09 05:02:24,837][60143] Updated weights for policy 0, policy_version 21822 (0.0009) +[2023-10-09 05:02:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44924928. Throughput: 0: 1705.1, 1: 1734.2. Samples: 11239782. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-09 05:02:26,053][59242] Avg episode reward: [(0, '24.380'), (1, '23.900')] +[2023-10-09 05:02:26,225][60144] Updated weights for policy 1, policy_version 22052 (0.0007) +[2023-10-09 05:02:26,586][60144] Updated weights for policy 1, policy_version 22062 (0.0007) +[2023-10-09 05:02:26,952][60144] Updated weights for policy 1, policy_version 22072 (0.0008) +[2023-10-09 05:02:28,673][60143] Updated weights for policy 0, policy_version 21832 (0.0009) +[2023-10-09 05:02:29,045][60143] Updated weights for policy 0, policy_version 21842 (0.0008) +[2023-10-09 05:02:29,407][60143] Updated weights for policy 0, policy_version 21852 (0.0009) +[2023-10-09 05:02:30,773][60144] Updated weights for policy 1, policy_version 22082 (0.0010) +[2023-10-09 05:02:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 44990464. Throughput: 0: 1699.8, 1: 1737.1. Samples: 11260542. Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) +[2023-10-09 05:02:31,053][59242] Avg episode reward: [(0, '24.920'), (1, '25.040')] +[2023-10-09 05:02:31,135][60144] Updated weights for policy 1, policy_version 22092 (0.0009) +[2023-10-09 05:02:31,504][60144] Updated weights for policy 1, policy_version 22102 (0.0007) +[2023-10-09 05:02:31,874][60144] Updated weights for policy 1, policy_version 22112 (0.0010) +[2023-10-09 05:02:33,405][60143] Updated weights for policy 0, policy_version 21862 (0.0007) +[2023-10-09 05:02:33,772][60143] Updated weights for policy 0, policy_version 21872 (0.0007) +[2023-10-09 05:02:34,146][60143] Updated weights for policy 0, policy_version 21882 (0.0009) +[2023-10-09 05:02:35,772][60144] Updated weights for policy 1, policy_version 22122 (0.0007) +[2023-10-09 05:02:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 45056000. Throughput: 0: 1717.2, 1: 1720.8. Samples: 11270984. 
Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) +[2023-10-09 05:02:36,052][59242] Avg episode reward: [(0, '24.760'), (1, '24.090')] +[2023-10-09 05:02:36,139][60144] Updated weights for policy 1, policy_version 22132 (0.0008) +[2023-10-09 05:02:36,509][60144] Updated weights for policy 1, policy_version 22142 (0.0008) +[2023-10-09 05:02:38,135][60143] Updated weights for policy 0, policy_version 21892 (0.0007) +[2023-10-09 05:02:38,504][60143] Updated weights for policy 0, policy_version 21902 (0.0007) +[2023-10-09 05:02:38,877][60143] Updated weights for policy 0, policy_version 21912 (0.0008) +[2023-10-09 05:02:40,644][60144] Updated weights for policy 1, policy_version 22152 (0.0008) +[2023-10-09 05:02:41,009][60144] Updated weights for policy 1, policy_version 22162 (0.0009) +[2023-10-09 05:02:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 45121536. Throughput: 0: 1695.3, 1: 1741.6. Samples: 11291318. Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) +[2023-10-09 05:02:41,052][59242] Avg episode reward: [(0, '24.630'), (1, '23.770')] +[2023-10-09 05:02:41,384][60144] Updated weights for policy 1, policy_version 22172 (0.0010) +[2023-10-09 05:02:42,710][60143] Updated weights for policy 0, policy_version 21922 (0.0009) +[2023-10-09 05:02:43,079][60143] Updated weights for policy 0, policy_version 21932 (0.0009) +[2023-10-09 05:02:43,458][60143] Updated weights for policy 0, policy_version 21942 (0.0009) +[2023-10-09 05:02:43,822][60143] Updated weights for policy 0, policy_version 21952 (0.0009) +[2023-10-09 05:02:45,297][60144] Updated weights for policy 1, policy_version 22182 (0.0009) +[2023-10-09 05:02:45,662][60144] Updated weights for policy 1, policy_version 22192 (0.0011) +[2023-10-09 05:02:46,039][60144] Updated weights for policy 1, policy_version 22202 (0.0010) +[2023-10-09 05:02:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45187072. 
Throughput: 0: 1717.8, 1: 1717.7. Samples: 11311802. Policy #0 lag: (min: 13.0, avg: 20.8, max: 45.0) +[2023-10-09 05:02:46,053][59242] Avg episode reward: [(0, '24.960'), (1, '23.670')] +[2023-10-09 05:02:47,738][60143] Updated weights for policy 0, policy_version 21962 (0.0009) +[2023-10-09 05:02:48,107][60143] Updated weights for policy 0, policy_version 21972 (0.0009) +[2023-10-09 05:02:48,487][60143] Updated weights for policy 0, policy_version 21982 (0.0010) +[2023-10-09 05:02:50,089][60144] Updated weights for policy 1, policy_version 22212 (0.0008) +[2023-10-09 05:02:50,461][60144] Updated weights for policy 1, policy_version 22222 (0.0008) +[2023-10-09 05:02:50,834][60144] Updated weights for policy 1, policy_version 22232 (0.0007) +[2023-10-09 05:02:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 45252608. Throughput: 0: 1696.0, 1: 1732.8. Samples: 11321892. Policy #0 lag: (min: 18.0, avg: 20.6, max: 49.0) +[2023-10-09 05:02:51,053][59242] Avg episode reward: [(0, '24.980'), (1, '24.200')] +[2023-10-09 05:02:52,546][60143] Updated weights for policy 0, policy_version 21992 (0.0009) +[2023-10-09 05:02:52,925][60143] Updated weights for policy 0, policy_version 22002 (0.0008) +[2023-10-09 05:02:53,299][60143] Updated weights for policy 0, policy_version 22012 (0.0009) +[2023-10-09 05:02:54,773][60144] Updated weights for policy 1, policy_version 22242 (0.0007) +[2023-10-09 05:02:55,133][60144] Updated weights for policy 1, policy_version 22252 (0.0007) +[2023-10-09 05:02:55,504][60144] Updated weights for policy 1, policy_version 22262 (0.0009) +[2023-10-09 05:02:55,865][60144] Updated weights for policy 1, policy_version 22272 (0.0008) +[2023-10-09 05:02:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45350912. Throughput: 0: 1704.3, 1: 1731.9. Samples: 11343014. 
Policy #0 lag: (min: 18.0, avg: 20.6, max: 49.0) +[2023-10-09 05:02:56,053][59242] Avg episode reward: [(0, '24.800'), (1, '24.200')] +[2023-10-09 05:02:57,100][60143] Updated weights for policy 0, policy_version 22022 (0.0007) +[2023-10-09 05:02:57,481][60143] Updated weights for policy 0, policy_version 22032 (0.0007) +[2023-10-09 05:02:57,859][60143] Updated weights for policy 0, policy_version 22042 (0.0008) +[2023-10-09 05:02:59,807][60144] Updated weights for policy 1, policy_version 22282 (0.0007) +[2023-10-09 05:03:00,182][60144] Updated weights for policy 1, policy_version 22292 (0.0007) +[2023-10-09 05:03:00,546][60144] Updated weights for policy 1, policy_version 22302 (0.0008) +[2023-10-09 05:03:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45416448. Throughput: 0: 1733.2, 1: 1704.2. Samples: 11363494. Policy #0 lag: (min: 18.0, avg: 20.6, max: 49.0) +[2023-10-09 05:03:01,053][59242] Avg episode reward: [(0, '25.630'), (1, '24.540')] +[2023-10-09 05:03:01,761][60143] Updated weights for policy 0, policy_version 22052 (0.0009) +[2023-10-09 05:03:02,131][60143] Updated weights for policy 0, policy_version 22062 (0.0008) +[2023-10-09 05:03:02,505][60143] Updated weights for policy 0, policy_version 22072 (0.0007) +[2023-10-09 05:03:04,321][60144] Updated weights for policy 1, policy_version 22312 (0.0007) +[2023-10-09 05:03:04,689][60144] Updated weights for policy 1, policy_version 22322 (0.0007) +[2023-10-09 05:03:05,064][60144] Updated weights for policy 1, policy_version 22332 (0.0008) +[2023-10-09 05:03:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45481984. Throughput: 0: 1699.5, 1: 1739.3. Samples: 11374168. 
Policy #0 lag: (min: 18.0, avg: 20.6, max: 49.0) +[2023-10-09 05:03:06,053][59242] Avg episode reward: [(0, '26.440'), (1, '23.860')] +[2023-10-09 05:03:06,472][60143] Updated weights for policy 0, policy_version 22082 (0.0007) +[2023-10-09 05:03:06,831][60143] Updated weights for policy 0, policy_version 22092 (0.0007) +[2023-10-09 05:03:07,210][60143] Updated weights for policy 0, policy_version 22102 (0.0007) +[2023-10-09 05:03:07,586][60143] Updated weights for policy 0, policy_version 22112 (0.0010) +[2023-10-09 05:03:09,157][60144] Updated weights for policy 1, policy_version 22342 (0.0010) +[2023-10-09 05:03:09,528][60144] Updated weights for policy 1, policy_version 22352 (0.0008) +[2023-10-09 05:03:09,896][60144] Updated weights for policy 1, policy_version 22362 (0.0008) +[2023-10-09 05:03:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45547520. Throughput: 0: 1717.0, 1: 1721.2. Samples: 11394502. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-09 05:03:11,053][59242] Avg episode reward: [(0, '27.280'), (1, '23.490')] +[2023-10-09 05:03:11,701][60143] Updated weights for policy 0, policy_version 22122 (0.0010) +[2023-10-09 05:03:12,068][60143] Updated weights for policy 0, policy_version 22132 (0.0007) +[2023-10-09 05:03:12,428][60143] Updated weights for policy 0, policy_version 22142 (0.0009) +[2023-10-09 05:03:13,722][60144] Updated weights for policy 1, policy_version 22372 (0.0008) +[2023-10-09 05:03:14,078][60144] Updated weights for policy 1, policy_version 22382 (0.0007) +[2023-10-09 05:03:14,447][60144] Updated weights for policy 1, policy_version 22392 (0.0007) +[2023-10-09 05:03:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 45613056. Throughput: 0: 1728.8, 1: 1708.7. Samples: 11415228. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-09 05:03:16,052][59242] Avg episode reward: [(0, '27.030'), (1, '24.640')] +[2023-10-09 05:03:16,282][60143] Updated weights for policy 0, policy_version 22152 (0.0008) +[2023-10-09 05:03:16,652][60143] Updated weights for policy 0, policy_version 22162 (0.0009) +[2023-10-09 05:03:17,013][60143] Updated weights for policy 0, policy_version 22172 (0.0008) +[2023-10-09 05:03:18,460][60144] Updated weights for policy 1, policy_version 22402 (0.0008) +[2023-10-09 05:03:18,825][60144] Updated weights for policy 1, policy_version 22412 (0.0007) +[2023-10-09 05:03:19,180][60144] Updated weights for policy 1, policy_version 22422 (0.0007) +[2023-10-09 05:03:19,551][60144] Updated weights for policy 1, policy_version 22432 (0.0007) +[2023-10-09 05:03:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45678592. Throughput: 0: 1701.4, 1: 1736.1. Samples: 11425672. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-09 05:03:21,052][59242] Avg episode reward: [(0, '27.130'), (1, '24.970')] +[2023-10-09 05:03:21,089][60143] Updated weights for policy 0, policy_version 22182 (0.0009) +[2023-10-09 05:03:21,463][60143] Updated weights for policy 0, policy_version 22192 (0.0011) +[2023-10-09 05:03:21,827][60143] Updated weights for policy 0, policy_version 22202 (0.0010) +[2023-10-09 05:03:23,584][60144] Updated weights for policy 1, policy_version 22442 (0.0011) +[2023-10-09 05:03:23,946][60144] Updated weights for policy 1, policy_version 22452 (0.0009) +[2023-10-09 05:03:24,312][60144] Updated weights for policy 1, policy_version 22462 (0.0010) +[2023-10-09 05:03:25,613][60143] Updated weights for policy 0, policy_version 22212 (0.0007) +[2023-10-09 05:03:25,991][60143] Updated weights for policy 0, policy_version 22222 (0.0007) +[2023-10-09 05:03:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 45744128. 
Throughput: 0: 1725.5, 1: 1704.6. Samples: 11445672. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-09 05:03:26,053][59242] Avg episode reward: [(0, '27.090'), (1, '24.690')] +[2023-10-09 05:03:26,359][60143] Updated weights for policy 0, policy_version 22232 (0.0007) +[2023-10-09 05:03:28,334][60144] Updated weights for policy 1, policy_version 22472 (0.0008) +[2023-10-09 05:03:28,716][60144] Updated weights for policy 1, policy_version 22482 (0.0010) +[2023-10-09 05:03:29,081][60144] Updated weights for policy 1, policy_version 22492 (0.0007) +[2023-10-09 05:03:30,203][60143] Updated weights for policy 0, policy_version 22242 (0.0007) +[2023-10-09 05:03:30,572][60143] Updated weights for policy 0, policy_version 22252 (0.0009) +[2023-10-09 05:03:30,944][60143] Updated weights for policy 0, policy_version 22262 (0.0011) +[2023-10-09 05:03:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 45809664. Throughput: 0: 1716.9, 1: 1724.3. Samples: 11466658. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:31,053][59242] Avg episode reward: [(0, '26.150'), (1, '25.010')] +[2023-10-09 05:03:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000022496_23035904.pth... +[2023-10-09 05:03:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000020896_21397504.pth +[2023-10-09 05:03:31,315][60143] Updated weights for policy 0, policy_version 22272 (0.0009) +[2023-10-09 05:03:31,315][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000022272_22806528.pth... 
+[2023-10-09 05:03:31,354][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000020672_21168128.pth +[2023-10-09 05:03:32,939][60144] Updated weights for policy 1, policy_version 22502 (0.0008) +[2023-10-09 05:03:33,312][60144] Updated weights for policy 1, policy_version 22512 (0.0008) +[2023-10-09 05:03:33,673][60144] Updated weights for policy 1, policy_version 22522 (0.0007) +[2023-10-09 05:03:35,216][60143] Updated weights for policy 0, policy_version 22282 (0.0011) +[2023-10-09 05:03:35,580][60143] Updated weights for policy 0, policy_version 22292 (0.0009) +[2023-10-09 05:03:35,958][60143] Updated weights for policy 0, policy_version 22302 (0.0010) +[2023-10-09 05:03:36,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 45907968. Throughput: 0: 1725.4, 1: 1720.7. Samples: 11476964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:36,054][59242] Avg episode reward: [(0, '25.350'), (1, '25.120')] +[2023-10-09 05:03:37,605][60144] Updated weights for policy 1, policy_version 22532 (0.0007) +[2023-10-09 05:03:37,975][60144] Updated weights for policy 1, policy_version 22542 (0.0007) +[2023-10-09 05:03:38,334][60144] Updated weights for policy 1, policy_version 22552 (0.0010) +[2023-10-09 05:03:40,086][60143] Updated weights for policy 0, policy_version 22312 (0.0010) +[2023-10-09 05:03:40,448][60143] Updated weights for policy 0, policy_version 22322 (0.0010) +[2023-10-09 05:03:40,814][60143] Updated weights for policy 0, policy_version 22332 (0.0010) +[2023-10-09 05:03:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 45973504. Throughput: 0: 1730.9, 1: 1710.8. Samples: 11497892. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:41,053][59242] Avg episode reward: [(0, '26.490'), (1, '26.160')] +[2023-10-09 05:03:42,181][60144] Updated weights for policy 1, policy_version 22562 (0.0008) +[2023-10-09 05:03:42,553][60144] Updated weights for policy 1, policy_version 22572 (0.0008) +[2023-10-09 05:03:42,926][60144] Updated weights for policy 1, policy_version 22582 (0.0008) +[2023-10-09 05:03:43,287][60144] Updated weights for policy 1, policy_version 22592 (0.0009) +[2023-10-09 05:03:44,842][60143] Updated weights for policy 0, policy_version 22342 (0.0007) +[2023-10-09 05:03:45,212][60143] Updated weights for policy 0, policy_version 22352 (0.0009) +[2023-10-09 05:03:45,578][60143] Updated weights for policy 0, policy_version 22362 (0.0009) +[2023-10-09 05:03:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 46039040. Throughput: 0: 1699.6, 1: 1739.1. Samples: 11518236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:46,053][59242] Avg episode reward: [(0, '26.650'), (1, '26.620')] +[2023-10-09 05:03:47,198][60144] Updated weights for policy 1, policy_version 22602 (0.0009) +[2023-10-09 05:03:47,564][60144] Updated weights for policy 1, policy_version 22612 (0.0008) +[2023-10-09 05:03:47,922][60144] Updated weights for policy 1, policy_version 22622 (0.0008) +[2023-10-09 05:03:49,698][60143] Updated weights for policy 0, policy_version 22372 (0.0009) +[2023-10-09 05:03:50,068][60143] Updated weights for policy 0, policy_version 22382 (0.0008) +[2023-10-09 05:03:50,435][60143] Updated weights for policy 0, policy_version 22392 (0.0010) +[2023-10-09 05:03:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 46104576. Throughput: 0: 1722.4, 1: 1704.0. Samples: 11528352. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:51,053][59242] Avg episode reward: [(0, '26.630'), (1, '26.020')] +[2023-10-09 05:03:51,806][60144] Updated weights for policy 1, policy_version 22632 (0.0008) +[2023-10-09 05:03:52,178][60144] Updated weights for policy 1, policy_version 22642 (0.0008) +[2023-10-09 05:03:52,548][60144] Updated weights for policy 1, policy_version 22652 (0.0007) +[2023-10-09 05:03:54,468][60143] Updated weights for policy 0, policy_version 22402 (0.0008) +[2023-10-09 05:03:54,840][60143] Updated weights for policy 0, policy_version 22412 (0.0010) +[2023-10-09 05:03:55,205][60143] Updated weights for policy 0, policy_version 22422 (0.0009) +[2023-10-09 05:03:55,578][60143] Updated weights for policy 0, policy_version 22432 (0.0010) +[2023-10-09 05:03:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 46170112. Throughput: 0: 1726.2, 1: 1725.6. Samples: 11549834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:03:56,053][59242] Avg episode reward: [(0, '26.290'), (1, '25.820')] +[2023-10-09 05:03:56,413][60144] Updated weights for policy 1, policy_version 22662 (0.0007) +[2023-10-09 05:03:56,779][60144] Updated weights for policy 1, policy_version 22672 (0.0008) +[2023-10-09 05:03:57,153][60144] Updated weights for policy 1, policy_version 22682 (0.0007) +[2023-10-09 05:03:59,608][60143] Updated weights for policy 0, policy_version 22442 (0.0009) +[2023-10-09 05:03:59,989][60143] Updated weights for policy 0, policy_version 22452 (0.0011) +[2023-10-09 05:04:00,356][60143] Updated weights for policy 0, policy_version 22462 (0.0009) +[2023-10-09 05:04:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 46235648. Throughput: 0: 1694.9, 1: 1737.6. Samples: 11569694. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:01,053][59242] Avg episode reward: [(0, '27.130'), (1, '25.970')] +[2023-10-09 05:04:01,166][60144] Updated weights for policy 1, policy_version 22692 (0.0009) +[2023-10-09 05:04:01,534][60144] Updated weights for policy 1, policy_version 22702 (0.0008) +[2023-10-09 05:04:01,899][60144] Updated weights for policy 1, policy_version 22712 (0.0008) +[2023-10-09 05:04:04,140][60143] Updated weights for policy 0, policy_version 22472 (0.0008) +[2023-10-09 05:04:04,519][60143] Updated weights for policy 0, policy_version 22482 (0.0007) +[2023-10-09 05:04:04,890][60143] Updated weights for policy 0, policy_version 22492 (0.0008) +[2023-10-09 05:04:05,835][60144] Updated weights for policy 1, policy_version 22722 (0.0009) +[2023-10-09 05:04:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 46301184. Throughput: 0: 1727.2, 1: 1709.8. Samples: 11580334. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 05:04:06,053][59242] Avg episode reward: [(0, '26.800'), (1, '26.520')] +[2023-10-09 05:04:06,205][60144] Updated weights for policy 1, policy_version 22732 (0.0008) +[2023-10-09 05:04:06,576][60144] Updated weights for policy 1, policy_version 22742 (0.0007) +[2023-10-09 05:04:06,949][60144] Updated weights for policy 1, policy_version 22752 (0.0008) +[2023-10-09 05:04:08,853][60143] Updated weights for policy 0, policy_version 22502 (0.0008) +[2023-10-09 05:04:09,213][60143] Updated weights for policy 0, policy_version 22512 (0.0008) +[2023-10-09 05:04:09,594][60143] Updated weights for policy 0, policy_version 22522 (0.0007) +[2023-10-09 05:04:10,897][60144] Updated weights for policy 1, policy_version 22762 (0.0008) +[2023-10-09 05:04:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 46366720. Throughput: 0: 1705.1, 1: 1744.2. Samples: 11600890. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 05:04:11,053][59242] Avg episode reward: [(0, '27.780'), (1, '25.980')] +[2023-10-09 05:04:11,255][60144] Updated weights for policy 1, policy_version 22772 (0.0009) +[2023-10-09 05:04:11,622][60144] Updated weights for policy 1, policy_version 22782 (0.0008) +[2023-10-09 05:04:13,495][60143] Updated weights for policy 0, policy_version 22532 (0.0007) +[2023-10-09 05:04:13,859][60143] Updated weights for policy 0, policy_version 22542 (0.0007) +[2023-10-09 05:04:14,234][60143] Updated weights for policy 0, policy_version 22552 (0.0008) +[2023-10-09 05:04:15,697][60144] Updated weights for policy 1, policy_version 22792 (0.0010) +[2023-10-09 05:04:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 46432256. Throughput: 0: 1708.8, 1: 1735.1. Samples: 11621636. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 05:04:16,053][59242] Avg episode reward: [(0, '27.310'), (1, '25.650')] +[2023-10-09 05:04:16,070][60144] Updated weights for policy 1, policy_version 22802 (0.0009) +[2023-10-09 05:04:16,438][60144] Updated weights for policy 1, policy_version 22812 (0.0007) +[2023-10-09 05:04:18,393][60143] Updated weights for policy 0, policy_version 22562 (0.0010) +[2023-10-09 05:04:18,772][60143] Updated weights for policy 0, policy_version 22572 (0.0010) +[2023-10-09 05:04:19,141][60143] Updated weights for policy 0, policy_version 22582 (0.0009) +[2023-10-09 05:04:19,495][60143] Updated weights for policy 0, policy_version 22592 (0.0011) +[2023-10-09 05:04:20,302][60144] Updated weights for policy 1, policy_version 22822 (0.0010) +[2023-10-09 05:04:20,672][60144] Updated weights for policy 1, policy_version 22832 (0.0011) +[2023-10-09 05:04:21,040][60144] Updated weights for policy 1, policy_version 22842 (0.0010) +[2023-10-09 05:04:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 46497792. 
Throughput: 0: 1714.9, 1: 1730.0. Samples: 11631980. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-09 05:04:21,052][59242] Avg episode reward: [(0, '26.610'), (1, '26.250')] +[2023-10-09 05:04:23,596][60143] Updated weights for policy 0, policy_version 22602 (0.0009) +[2023-10-09 05:04:23,957][60143] Updated weights for policy 0, policy_version 22612 (0.0009) +[2023-10-09 05:04:24,330][60143] Updated weights for policy 0, policy_version 22622 (0.0007) +[2023-10-09 05:04:25,083][60144] Updated weights for policy 1, policy_version 22852 (0.0010) +[2023-10-09 05:04:25,453][60144] Updated weights for policy 1, policy_version 22862 (0.0008) +[2023-10-09 05:04:25,823][60144] Updated weights for policy 1, policy_version 22872 (0.0009) +[2023-10-09 05:04:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 46563328. Throughput: 0: 1685.0, 1: 1741.7. Samples: 11652096. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:26,053][59242] Avg episode reward: [(0, '26.250'), (1, '26.560')] +[2023-10-09 05:04:28,286][60143] Updated weights for policy 0, policy_version 22632 (0.0007) +[2023-10-09 05:04:28,658][60143] Updated weights for policy 0, policy_version 22642 (0.0009) +[2023-10-09 05:04:29,022][60143] Updated weights for policy 0, policy_version 22652 (0.0009) +[2023-10-09 05:04:29,805][60144] Updated weights for policy 1, policy_version 22882 (0.0008) +[2023-10-09 05:04:30,168][60144] Updated weights for policy 1, policy_version 22892 (0.0008) +[2023-10-09 05:04:30,533][60144] Updated weights for policy 1, policy_version 22902 (0.0008) +[2023-10-09 05:04:30,901][60144] Updated weights for policy 1, policy_version 22912 (0.0009) +[2023-10-09 05:04:31,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 46661632. Throughput: 0: 1712.1, 1: 1715.9. Samples: 11672496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:31,052][59242] Avg episode reward: [(0, '27.190'), (1, '26.870')] +[2023-10-09 05:04:32,847][60143] Updated weights for policy 0, policy_version 22662 (0.0010) +[2023-10-09 05:04:33,217][60143] Updated weights for policy 0, policy_version 22672 (0.0010) +[2023-10-09 05:04:33,592][60143] Updated weights for policy 0, policy_version 22682 (0.0009) +[2023-10-09 05:04:34,758][60144] Updated weights for policy 1, policy_version 22922 (0.0010) +[2023-10-09 05:04:35,130][60144] Updated weights for policy 1, policy_version 22932 (0.0007) +[2023-10-09 05:04:35,499][60144] Updated weights for policy 1, policy_version 22942 (0.0008) +[2023-10-09 05:04:36,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 46727168. Throughput: 0: 1698.0, 1: 1741.0. Samples: 11683108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:36,053][59242] Avg episode reward: [(0, '25.880'), (1, '26.490')] +[2023-10-09 05:04:37,642][60143] Updated weights for policy 0, policy_version 22692 (0.0009) +[2023-10-09 05:04:38,001][60143] Updated weights for policy 0, policy_version 22702 (0.0008) +[2023-10-09 05:04:38,371][60143] Updated weights for policy 0, policy_version 22712 (0.0008) +[2023-10-09 05:04:39,395][60144] Updated weights for policy 1, policy_version 22952 (0.0007) +[2023-10-09 05:04:39,766][60144] Updated weights for policy 1, policy_version 22962 (0.0007) +[2023-10-09 05:04:40,143][60144] Updated weights for policy 1, policy_version 22972 (0.0008) +[2023-10-09 05:04:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 46792704. Throughput: 0: 1686.0, 1: 1730.7. Samples: 11703588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:41,053][59242] Avg episode reward: [(0, '27.870'), (1, '28.650')] +[2023-10-09 05:04:41,055][60003] Saving new best policy, reward=28.650! 
+[2023-10-09 05:04:42,307][60143] Updated weights for policy 0, policy_version 22722 (0.0008) +[2023-10-09 05:04:42,679][60143] Updated weights for policy 0, policy_version 22732 (0.0007) +[2023-10-09 05:04:43,055][60143] Updated weights for policy 0, policy_version 22742 (0.0007) +[2023-10-09 05:04:43,417][60143] Updated weights for policy 0, policy_version 22752 (0.0008) +[2023-10-09 05:04:43,931][60144] Updated weights for policy 1, policy_version 22982 (0.0008) +[2023-10-09 05:04:44,294][60144] Updated weights for policy 1, policy_version 22992 (0.0008) +[2023-10-09 05:04:44,671][60144] Updated weights for policy 1, policy_version 23002 (0.0009) +[2023-10-09 05:04:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 46858240. Throughput: 0: 1722.2, 1: 1712.1. Samples: 11724240. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:46,052][59242] Avg episode reward: [(0, '27.340'), (1, '26.920')] +[2023-10-09 05:04:47,507][60143] Updated weights for policy 0, policy_version 22762 (0.0009) +[2023-10-09 05:04:47,869][60143] Updated weights for policy 0, policy_version 22772 (0.0008) +[2023-10-09 05:04:48,236][60143] Updated weights for policy 0, policy_version 22782 (0.0010) +[2023-10-09 05:04:48,425][60144] Updated weights for policy 1, policy_version 23012 (0.0008) +[2023-10-09 05:04:48,793][60144] Updated weights for policy 1, policy_version 23022 (0.0009) +[2023-10-09 05:04:49,159][60144] Updated weights for policy 1, policy_version 23032 (0.0010) +[2023-10-09 05:04:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 46923776. Throughput: 0: 1685.5, 1: 1742.8. Samples: 11734610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:51,053][59242] Avg episode reward: [(0, '27.550'), (1, '25.560')] +[2023-10-09 05:04:52,166][60143] Updated weights for policy 0, policy_version 22792 (0.0010) +[2023-10-09 05:04:52,534][60143] Updated weights for policy 0, policy_version 22802 (0.0009) +[2023-10-09 05:04:52,909][60143] Updated weights for policy 0, policy_version 22812 (0.0007) +[2023-10-09 05:04:53,070][60144] Updated weights for policy 1, policy_version 23042 (0.0010) +[2023-10-09 05:04:53,442][60144] Updated weights for policy 1, policy_version 23052 (0.0007) +[2023-10-09 05:04:53,807][60144] Updated weights for policy 1, policy_version 23062 (0.0008) +[2023-10-09 05:04:54,171][60144] Updated weights for policy 1, policy_version 23072 (0.0008) +[2023-10-09 05:04:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 46989312. Throughput: 0: 1706.1, 1: 1715.2. Samples: 11754848. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:04:56,053][59242] Avg episode reward: [(0, '25.350'), (1, '26.000')] +[2023-10-09 05:04:56,919][60143] Updated weights for policy 0, policy_version 22822 (0.0007) +[2023-10-09 05:04:57,298][60143] Updated weights for policy 0, policy_version 22832 (0.0007) +[2023-10-09 05:04:57,671][60143] Updated weights for policy 0, policy_version 22842 (0.0007) +[2023-10-09 05:04:58,094][60144] Updated weights for policy 1, policy_version 23082 (0.0009) +[2023-10-09 05:04:58,460][60144] Updated weights for policy 1, policy_version 23092 (0.0009) +[2023-10-09 05:04:58,826][60144] Updated weights for policy 1, policy_version 23102 (0.0011) +[2023-10-09 05:05:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 47054848. Throughput: 0: 1713.1, 1: 1725.3. Samples: 11776364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:05:01,052][59242] Avg episode reward: [(0, '26.040'), (1, '26.480')] +[2023-10-09 05:05:01,638][60143] Updated weights for policy 0, policy_version 22852 (0.0008) +[2023-10-09 05:05:01,996][60143] Updated weights for policy 0, policy_version 22862 (0.0011) +[2023-10-09 05:05:02,368][60143] Updated weights for policy 0, policy_version 22872 (0.0008) +[2023-10-09 05:05:02,836][60144] Updated weights for policy 1, policy_version 23112 (0.0011) +[2023-10-09 05:05:03,215][60144] Updated weights for policy 1, policy_version 23122 (0.0008) +[2023-10-09 05:05:03,589][60144] Updated weights for policy 1, policy_version 23132 (0.0007) +[2023-10-09 05:05:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 47120384. Throughput: 0: 1693.4, 1: 1726.6. Samples: 11785880. Policy #0 lag: (min: 20.0, avg: 44.0, max: 48.0) +[2023-10-09 05:05:06,053][59242] Avg episode reward: [(0, '24.950'), (1, '25.370')] +[2023-10-09 05:05:06,383][60143] Updated weights for policy 0, policy_version 22882 (0.0008) +[2023-10-09 05:05:06,748][60143] Updated weights for policy 0, policy_version 22892 (0.0007) +[2023-10-09 05:05:07,125][60143] Updated weights for policy 0, policy_version 22902 (0.0008) +[2023-10-09 05:05:07,487][60143] Updated weights for policy 0, policy_version 22912 (0.0007) +[2023-10-09 05:05:07,585][60144] Updated weights for policy 1, policy_version 23142 (0.0009) +[2023-10-09 05:05:07,954][60144] Updated weights for policy 1, policy_version 23152 (0.0008) +[2023-10-09 05:05:08,322][60144] Updated weights for policy 1, policy_version 23162 (0.0009) +[2023-10-09 05:05:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 47185920. Throughput: 0: 1725.1, 1: 1713.3. Samples: 11806824. 
Policy #0 lag: (min: 20.0, avg: 44.0, max: 48.0) +[2023-10-09 05:05:11,053][59242] Avg episode reward: [(0, '24.890'), (1, '27.310')] +[2023-10-09 05:05:11,475][60143] Updated weights for policy 0, policy_version 22922 (0.0008) +[2023-10-09 05:05:11,843][60143] Updated weights for policy 0, policy_version 22932 (0.0008) +[2023-10-09 05:05:12,200][60144] Updated weights for policy 1, policy_version 23172 (0.0008) +[2023-10-09 05:05:12,220][60143] Updated weights for policy 0, policy_version 22942 (0.0007) +[2023-10-09 05:05:12,563][60144] Updated weights for policy 1, policy_version 23182 (0.0007) +[2023-10-09 05:05:12,927][60144] Updated weights for policy 1, policy_version 23192 (0.0009) +[2023-10-09 05:05:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 47251456. Throughput: 0: 1720.0, 1: 1738.2. Samples: 11828118. Policy #0 lag: (min: 20.0, avg: 44.0, max: 48.0) +[2023-10-09 05:05:16,053][59242] Avg episode reward: [(0, '25.370'), (1, '27.160')] +[2023-10-09 05:05:16,116][60143] Updated weights for policy 0, policy_version 22952 (0.0008) +[2023-10-09 05:05:16,494][60143] Updated weights for policy 0, policy_version 22962 (0.0008) +[2023-10-09 05:05:16,821][60144] Updated weights for policy 1, policy_version 23202 (0.0009) +[2023-10-09 05:05:16,870][60143] Updated weights for policy 0, policy_version 22972 (0.0009) +[2023-10-09 05:05:17,192][60144] Updated weights for policy 1, policy_version 23212 (0.0009) +[2023-10-09 05:05:17,568][60144] Updated weights for policy 1, policy_version 23222 (0.0009) +[2023-10-09 05:05:17,927][60144] Updated weights for policy 1, policy_version 23232 (0.0010) +[2023-10-09 05:05:20,907][60143] Updated weights for policy 0, policy_version 22982 (0.0008) +[2023-10-09 05:05:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 47316992. Throughput: 0: 1711.9, 1: 1717.9. Samples: 11837448. 
Policy #0 lag: (min: 20.0, avg: 44.0, max: 48.0) +[2023-10-09 05:05:21,053][59242] Avg episode reward: [(0, '25.670'), (1, '27.550')] +[2023-10-09 05:05:21,281][60143] Updated weights for policy 0, policy_version 22992 (0.0007) +[2023-10-09 05:05:21,642][60143] Updated weights for policy 0, policy_version 23002 (0.0007) +[2023-10-09 05:05:21,912][60144] Updated weights for policy 1, policy_version 23242 (0.0008) +[2023-10-09 05:05:22,277][60144] Updated weights for policy 1, policy_version 23252 (0.0011) +[2023-10-09 05:05:22,650][60144] Updated weights for policy 1, policy_version 23262 (0.0009) +[2023-10-09 05:05:25,741][60143] Updated weights for policy 0, policy_version 23012 (0.0008) +[2023-10-09 05:05:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 47382528. Throughput: 0: 1720.6, 1: 1723.5. Samples: 11858570. Policy #0 lag: (min: 6.0, avg: 6.0, max: 9.0) +[2023-10-09 05:05:26,053][59242] Avg episode reward: [(0, '25.510'), (1, '26.670')] +[2023-10-09 05:05:26,111][60143] Updated weights for policy 0, policy_version 23022 (0.0007) +[2023-10-09 05:05:26,481][60143] Updated weights for policy 0, policy_version 23032 (0.0007) +[2023-10-09 05:05:26,643][60144] Updated weights for policy 1, policy_version 23272 (0.0009) +[2023-10-09 05:05:27,005][60144] Updated weights for policy 1, policy_version 23282 (0.0009) +[2023-10-09 05:05:27,372][60144] Updated weights for policy 1, policy_version 23292 (0.0010) +[2023-10-09 05:05:30,457][60143] Updated weights for policy 0, policy_version 23042 (0.0010) +[2023-10-09 05:05:30,828][60143] Updated weights for policy 0, policy_version 23052 (0.0009) +[2023-10-09 05:05:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 47448064. Throughput: 0: 1710.9, 1: 1744.1. Samples: 11879714. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 9.0) +[2023-10-09 05:05:31,053][59242] Avg episode reward: [(0, '24.550'), (1, '24.870')] +[2023-10-09 05:05:31,194][60143] Updated weights for policy 0, policy_version 23062 (0.0009) +[2023-10-09 05:05:31,245][60144] Updated weights for policy 1, policy_version 23302 (0.0008) +[2023-10-09 05:05:31,561][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000023072_23625728.pth... +[2023-10-09 05:05:31,565][60143] Updated weights for policy 0, policy_version 23072 (0.0009) +[2023-10-09 05:05:31,595][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000021472_21987328.pth +[2023-10-09 05:05:31,608][60144] Updated weights for policy 1, policy_version 23312 (0.0009) +[2023-10-09 05:05:31,989][60144] Updated weights for policy 1, policy_version 23322 (0.0010) +[2023-10-09 05:05:32,200][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000023328_23887872.pth... +[2023-10-09 05:05:32,240][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000021696_22216704.pth +[2023-10-09 05:05:35,529][60143] Updated weights for policy 0, policy_version 23082 (0.0009) +[2023-10-09 05:05:35,847][60144] Updated weights for policy 1, policy_version 23332 (0.0009) +[2023-10-09 05:05:35,902][60143] Updated weights for policy 0, policy_version 23092 (0.0009) +[2023-10-09 05:05:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 47513600. Throughput: 0: 1719.5, 1: 1713.7. Samples: 11889104. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 9.0) +[2023-10-09 05:05:36,053][59242] Avg episode reward: [(0, '25.450'), (1, '23.930')] +[2023-10-09 05:05:36,217][60144] Updated weights for policy 1, policy_version 23342 (0.0007) +[2023-10-09 05:05:36,276][60143] Updated weights for policy 0, policy_version 23102 (0.0009) +[2023-10-09 05:05:36,575][60144] Updated weights for policy 1, policy_version 23352 (0.0007) +[2023-10-09 05:05:40,219][60143] Updated weights for policy 0, policy_version 23112 (0.0007) +[2023-10-09 05:05:40,567][60144] Updated weights for policy 1, policy_version 23362 (0.0009) +[2023-10-09 05:05:40,588][60143] Updated weights for policy 0, policy_version 23122 (0.0008) +[2023-10-09 05:05:40,922][60144] Updated weights for policy 1, policy_version 23372 (0.0009) +[2023-10-09 05:05:40,960][60143] Updated weights for policy 0, policy_version 23132 (0.0008) +[2023-10-09 05:05:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 47579136. Throughput: 0: 1713.2, 1: 1741.0. Samples: 11910286. Policy #0 lag: (min: 6.0, avg: 6.0, max: 9.0) +[2023-10-09 05:05:41,053][59242] Avg episode reward: [(0, '24.030'), (1, '24.990')] +[2023-10-09 05:05:41,300][60144] Updated weights for policy 1, policy_version 23382 (0.0011) +[2023-10-09 05:05:41,664][60144] Updated weights for policy 1, policy_version 23392 (0.0009) +[2023-10-09 05:05:44,981][60143] Updated weights for policy 0, policy_version 23142 (0.0007) +[2023-10-09 05:05:45,342][60143] Updated weights for policy 0, policy_version 23152 (0.0010) +[2023-10-09 05:05:45,563][60144] Updated weights for policy 1, policy_version 23402 (0.0012) +[2023-10-09 05:05:45,714][60143] Updated weights for policy 0, policy_version 23162 (0.0008) +[2023-10-09 05:05:45,930][60144] Updated weights for policy 1, policy_version 23412 (0.0008) +[2023-10-09 05:05:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 47677440. 
Throughput: 0: 1693.7, 1: 1735.9. Samples: 11930694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:05:46,053][59242] Avg episode reward: [(0, '24.150'), (1, '24.440')] +[2023-10-09 05:05:46,294][60144] Updated weights for policy 1, policy_version 23422 (0.0010) +[2023-10-09 05:05:49,731][60143] Updated weights for policy 0, policy_version 23172 (0.0007) +[2023-10-09 05:05:50,109][60143] Updated weights for policy 0, policy_version 23182 (0.0009) +[2023-10-09 05:05:50,307][60144] Updated weights for policy 1, policy_version 23432 (0.0008) +[2023-10-09 05:05:50,488][60143] Updated weights for policy 0, policy_version 23192 (0.0009) +[2023-10-09 05:05:50,680][60144] Updated weights for policy 1, policy_version 23442 (0.0009) +[2023-10-09 05:05:51,044][60144] Updated weights for policy 1, policy_version 23452 (0.0008) +[2023-10-09 05:05:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 47742976. Throughput: 0: 1710.9, 1: 1739.2. Samples: 11941132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:05:51,053][59242] Avg episode reward: [(0, '23.410'), (1, '24.010')] +[2023-10-09 05:05:54,405][60143] Updated weights for policy 0, policy_version 23202 (0.0009) +[2023-10-09 05:05:54,779][60143] Updated weights for policy 0, policy_version 23212 (0.0008) +[2023-10-09 05:05:55,013][60144] Updated weights for policy 1, policy_version 23462 (0.0008) +[2023-10-09 05:05:55,148][60143] Updated weights for policy 0, policy_version 23222 (0.0007) +[2023-10-09 05:05:55,379][60144] Updated weights for policy 1, policy_version 23472 (0.0009) +[2023-10-09 05:05:55,519][60143] Updated weights for policy 0, policy_version 23232 (0.0008) +[2023-10-09 05:05:55,745][60144] Updated weights for policy 1, policy_version 23482 (0.0008) +[2023-10-09 05:05:56,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 47841280. Throughput: 0: 1703.3, 1: 1746.4. Samples: 11962060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:05:56,053][59242] Avg episode reward: [(0, '24.400'), (1, '24.060')] +[2023-10-09 05:05:59,471][60143] Updated weights for policy 0, policy_version 23242 (0.0008) +[2023-10-09 05:05:59,762][60144] Updated weights for policy 1, policy_version 23492 (0.0009) +[2023-10-09 05:05:59,831][60143] Updated weights for policy 0, policy_version 23252 (0.0010) +[2023-10-09 05:06:00,134][60144] Updated weights for policy 1, policy_version 23502 (0.0008) +[2023-10-09 05:06:00,210][60143] Updated weights for policy 0, policy_version 23262 (0.0009) +[2023-10-09 05:06:00,498][60144] Updated weights for policy 1, policy_version 23512 (0.0007) +[2023-10-09 05:06:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 47906816. Throughput: 0: 1682.0, 1: 1715.7. Samples: 11981014. Policy #0 lag: (min: 0.0, avg: 27.3, max: 32.0) +[2023-10-09 05:06:01,053][59242] Avg episode reward: [(0, '23.290'), (1, '24.810')] +[2023-10-09 05:06:04,250][60143] Updated weights for policy 0, policy_version 23272 (0.0008) +[2023-10-09 05:06:04,496][60144] Updated weights for policy 1, policy_version 23522 (0.0007) +[2023-10-09 05:06:04,627][60143] Updated weights for policy 0, policy_version 23282 (0.0008) +[2023-10-09 05:06:04,868][60144] Updated weights for policy 1, policy_version 23532 (0.0008) +[2023-10-09 05:06:05,003][60143] Updated weights for policy 0, policy_version 23292 (0.0008) +[2023-10-09 05:06:05,227][60144] Updated weights for policy 1, policy_version 23542 (0.0008) +[2023-10-09 05:06:05,601][60144] Updated weights for policy 1, policy_version 23552 (0.0008) +[2023-10-09 05:06:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 47972352. Throughput: 0: 1713.1, 1: 1735.9. Samples: 11992652. 
Policy #0 lag: (min: 0.0, avg: 27.3, max: 32.0) +[2023-10-09 05:06:06,053][59242] Avg episode reward: [(0, '23.000'), (1, '24.900')] +[2023-10-09 05:06:09,061][60143] Updated weights for policy 0, policy_version 23302 (0.0009) +[2023-10-09 05:06:09,445][60143] Updated weights for policy 0, policy_version 23312 (0.0008) +[2023-10-09 05:06:09,552][60144] Updated weights for policy 1, policy_version 23562 (0.0008) +[2023-10-09 05:06:09,815][60143] Updated weights for policy 0, policy_version 23322 (0.0009) +[2023-10-09 05:06:09,917][60144] Updated weights for policy 1, policy_version 23572 (0.0007) +[2023-10-09 05:06:10,288][60144] Updated weights for policy 1, policy_version 23582 (0.0010) +[2023-10-09 05:06:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 48037888. Throughput: 0: 1696.7, 1: 1725.5. Samples: 12012570. Policy #0 lag: (min: 0.0, avg: 27.3, max: 32.0) +[2023-10-09 05:06:11,053][59242] Avg episode reward: [(0, '23.510'), (1, '24.350')] +[2023-10-09 05:06:13,830][60143] Updated weights for policy 0, policy_version 23332 (0.0008) +[2023-10-09 05:06:14,203][60143] Updated weights for policy 0, policy_version 23342 (0.0009) +[2023-10-09 05:06:14,277][60144] Updated weights for policy 1, policy_version 23592 (0.0008) +[2023-10-09 05:06:14,581][60143] Updated weights for policy 0, policy_version 23352 (0.0008) +[2023-10-09 05:06:14,641][60144] Updated weights for policy 1, policy_version 23602 (0.0008) +[2023-10-09 05:06:15,013][60144] Updated weights for policy 1, policy_version 23612 (0.0007) +[2023-10-09 05:06:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 48103424. Throughput: 0: 1687.5, 1: 1701.3. Samples: 12032210. 
Policy #0 lag: (min: 0.0, avg: 27.3, max: 32.0) +[2023-10-09 05:06:16,052][59242] Avg episode reward: [(0, '24.500'), (1, '24.790')] +[2023-10-09 05:06:18,752][60143] Updated weights for policy 0, policy_version 23362 (0.0008) +[2023-10-09 05:06:18,995][60144] Updated weights for policy 1, policy_version 23622 (0.0009) +[2023-10-09 05:06:19,111][60143] Updated weights for policy 0, policy_version 23372 (0.0008) +[2023-10-09 05:06:19,356][60144] Updated weights for policy 1, policy_version 23632 (0.0007) +[2023-10-09 05:06:19,483][60143] Updated weights for policy 0, policy_version 23382 (0.0009) +[2023-10-09 05:06:19,719][60144] Updated weights for policy 1, policy_version 23642 (0.0007) +[2023-10-09 05:06:19,865][60143] Updated weights for policy 0, policy_version 23392 (0.0009) +[2023-10-09 05:06:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 48168960. Throughput: 0: 1707.9, 1: 1730.4. Samples: 12043828. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-09 05:06:21,053][59242] Avg episode reward: [(0, '24.490'), (1, '26.260')] +[2023-10-09 05:06:23,571][60144] Updated weights for policy 1, policy_version 23652 (0.0008) +[2023-10-09 05:06:23,923][60144] Updated weights for policy 1, policy_version 23662 (0.0008) +[2023-10-09 05:06:24,042][60143] Updated weights for policy 0, policy_version 23402 (0.0008) +[2023-10-09 05:06:24,286][60144] Updated weights for policy 1, policy_version 23672 (0.0008) +[2023-10-09 05:06:24,410][60143] Updated weights for policy 0, policy_version 23412 (0.0010) +[2023-10-09 05:06:24,780][60143] Updated weights for policy 0, policy_version 23422 (0.0009) +[2023-10-09 05:06:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 48234496. Throughput: 0: 1684.0, 1: 1701.5. Samples: 12062632. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-09 05:06:26,053][59242] Avg episode reward: [(0, '25.610'), (1, '25.040')] +[2023-10-09 05:06:28,293][60144] Updated weights for policy 1, policy_version 23682 (0.0008) +[2023-10-09 05:06:28,658][60143] Updated weights for policy 0, policy_version 23432 (0.0008) +[2023-10-09 05:06:28,662][60144] Updated weights for policy 1, policy_version 23692 (0.0008) +[2023-10-09 05:06:29,034][60144] Updated weights for policy 1, policy_version 23702 (0.0008) +[2023-10-09 05:06:29,036][60143] Updated weights for policy 0, policy_version 23442 (0.0008) +[2023-10-09 05:06:29,399][60144] Updated weights for policy 1, policy_version 23712 (0.0008) +[2023-10-09 05:06:29,405][60143] Updated weights for policy 0, policy_version 23452 (0.0008) +[2023-10-09 05:06:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 48300032. Throughput: 0: 1692.8, 1: 1704.4. Samples: 12083570. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-09 05:06:31,053][59242] Avg episode reward: [(0, '24.420'), (1, '27.030')] +[2023-10-09 05:06:33,254][60143] Updated weights for policy 0, policy_version 23462 (0.0008) +[2023-10-09 05:06:33,286][60144] Updated weights for policy 1, policy_version 23722 (0.0007) +[2023-10-09 05:06:33,614][60143] Updated weights for policy 0, policy_version 23472 (0.0008) +[2023-10-09 05:06:33,648][60144] Updated weights for policy 1, policy_version 23732 (0.0009) +[2023-10-09 05:06:33,986][60143] Updated weights for policy 0, policy_version 23482 (0.0008) +[2023-10-09 05:06:34,021][60144] Updated weights for policy 1, policy_version 23742 (0.0009) +[2023-10-09 05:06:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 48365568. Throughput: 0: 1695.9, 1: 1712.4. Samples: 12094506. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) +[2023-10-09 05:06:36,053][59242] Avg episode reward: [(0, '24.820'), (1, '26.030')] +[2023-10-09 05:06:37,868][60144] Updated weights for policy 1, policy_version 23752 (0.0007) +[2023-10-09 05:06:38,000][60143] Updated weights for policy 0, policy_version 23492 (0.0007) +[2023-10-09 05:06:38,227][60144] Updated weights for policy 1, policy_version 23762 (0.0008) +[2023-10-09 05:06:38,376][60143] Updated weights for policy 0, policy_version 23502 (0.0009) +[2023-10-09 05:06:38,588][60144] Updated weights for policy 1, policy_version 23772 (0.0009) +[2023-10-09 05:06:38,736][60143] Updated weights for policy 0, policy_version 23512 (0.0009) +[2023-10-09 05:06:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 48431104. Throughput: 0: 1679.0, 1: 1701.8. Samples: 12114196. Policy #0 lag: (min: 8.0, avg: 33.5, max: 40.0) +[2023-10-09 05:06:41,053][59242] Avg episode reward: [(0, '24.800'), (1, '26.530')] +[2023-10-09 05:06:42,789][60143] Updated weights for policy 0, policy_version 23522 (0.0007) +[2023-10-09 05:06:42,802][60144] Updated weights for policy 1, policy_version 23782 (0.0008) +[2023-10-09 05:06:43,163][60143] Updated weights for policy 0, policy_version 23532 (0.0008) +[2023-10-09 05:06:43,184][60144] Updated weights for policy 1, policy_version 23792 (0.0009) +[2023-10-09 05:06:43,530][60143] Updated weights for policy 0, policy_version 23542 (0.0008) +[2023-10-09 05:06:43,556][60144] Updated weights for policy 1, policy_version 23802 (0.0009) +[2023-10-09 05:06:43,900][60143] Updated weights for policy 0, policy_version 23552 (0.0008) +[2023-10-09 05:06:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 48496640. Throughput: 0: 1699.7, 1: 1725.4. Samples: 12135142. 
Policy #0 lag: (min: 8.0, avg: 33.5, max: 40.0) +[2023-10-09 05:06:46,053][59242] Avg episode reward: [(0, '24.070'), (1, '26.420')] +[2023-10-09 05:06:47,326][60144] Updated weights for policy 1, policy_version 23812 (0.0009) +[2023-10-09 05:06:47,697][60144] Updated weights for policy 1, policy_version 23822 (0.0008) +[2023-10-09 05:06:47,728][60143] Updated weights for policy 0, policy_version 23562 (0.0007) +[2023-10-09 05:06:48,054][60144] Updated weights for policy 1, policy_version 23832 (0.0009) +[2023-10-09 05:06:48,099][60143] Updated weights for policy 0, policy_version 23572 (0.0008) +[2023-10-09 05:06:48,470][60143] Updated weights for policy 0, policy_version 23582 (0.0010) +[2023-10-09 05:06:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 48562176. Throughput: 0: 1672.6, 1: 1705.6. Samples: 12144668. Policy #0 lag: (min: 8.0, avg: 33.5, max: 40.0) +[2023-10-09 05:06:51,052][59242] Avg episode reward: [(0, '24.590'), (1, '26.430')] +[2023-10-09 05:06:51,985][60144] Updated weights for policy 1, policy_version 23842 (0.0008) +[2023-10-09 05:06:52,346][60144] Updated weights for policy 1, policy_version 23852 (0.0010) +[2023-10-09 05:06:52,604][60143] Updated weights for policy 0, policy_version 23592 (0.0008) +[2023-10-09 05:06:52,711][60144] Updated weights for policy 1, policy_version 23862 (0.0009) +[2023-10-09 05:06:52,978][60143] Updated weights for policy 0, policy_version 23602 (0.0008) +[2023-10-09 05:06:53,071][60144] Updated weights for policy 1, policy_version 23872 (0.0009) +[2023-10-09 05:06:53,346][60143] Updated weights for policy 0, policy_version 23612 (0.0009) +[2023-10-09 05:06:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48627712. Throughput: 0: 1686.1, 1: 1719.9. Samples: 12165840. 
Policy #0 lag: (min: 8.0, avg: 33.5, max: 40.0) +[2023-10-09 05:06:56,053][59242] Avg episode reward: [(0, '25.140'), (1, '25.800')] +[2023-10-09 05:06:57,170][60144] Updated weights for policy 1, policy_version 23882 (0.0008) +[2023-10-09 05:06:57,401][60143] Updated weights for policy 0, policy_version 23622 (0.0008) +[2023-10-09 05:06:57,534][60144] Updated weights for policy 1, policy_version 23892 (0.0009) +[2023-10-09 05:06:57,765][60143] Updated weights for policy 0, policy_version 23632 (0.0010) +[2023-10-09 05:06:57,913][60144] Updated weights for policy 1, policy_version 23902 (0.0009) +[2023-10-09 05:06:58,134][60143] Updated weights for policy 0, policy_version 23642 (0.0008) +[2023-10-09 05:07:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48693248. Throughput: 0: 1699.7, 1: 1742.0. Samples: 12187088. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:01,053][59242] Avg episode reward: [(0, '25.610'), (1, '25.090')] +[2023-10-09 05:07:01,846][60144] Updated weights for policy 1, policy_version 23912 (0.0007) +[2023-10-09 05:07:02,198][60143] Updated weights for policy 0, policy_version 23652 (0.0009) +[2023-10-09 05:07:02,204][60144] Updated weights for policy 1, policy_version 23922 (0.0007) +[2023-10-09 05:07:02,562][60143] Updated weights for policy 0, policy_version 23662 (0.0008) +[2023-10-09 05:07:02,572][60144] Updated weights for policy 1, policy_version 23932 (0.0007) +[2023-10-09 05:07:02,930][60143] Updated weights for policy 0, policy_version 23672 (0.0007) +[2023-10-09 05:07:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48758784. Throughput: 0: 1676.3, 1: 1713.2. Samples: 12196354. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:06,053][59242] Avg episode reward: [(0, '25.880'), (1, '25.680')] +[2023-10-09 05:07:06,490][60144] Updated weights for policy 1, policy_version 23942 (0.0008) +[2023-10-09 05:07:06,832][60143] Updated weights for policy 0, policy_version 23682 (0.0008) +[2023-10-09 05:07:06,866][60144] Updated weights for policy 1, policy_version 23952 (0.0008) +[2023-10-09 05:07:07,207][60143] Updated weights for policy 0, policy_version 23692 (0.0008) +[2023-10-09 05:07:07,232][60144] Updated weights for policy 1, policy_version 23962 (0.0007) +[2023-10-09 05:07:07,571][60143] Updated weights for policy 0, policy_version 23702 (0.0008) +[2023-10-09 05:07:07,949][60143] Updated weights for policy 0, policy_version 23712 (0.0009) +[2023-10-09 05:07:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48824320. Throughput: 0: 1701.5, 1: 1738.6. Samples: 12217436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:11,053][59242] Avg episode reward: [(0, '26.070'), (1, '26.160')] +[2023-10-09 05:07:11,386][60144] Updated weights for policy 1, policy_version 23972 (0.0008) +[2023-10-09 05:07:11,758][60144] Updated weights for policy 1, policy_version 23982 (0.0008) +[2023-10-09 05:07:12,129][60144] Updated weights for policy 1, policy_version 23992 (0.0009) +[2023-10-09 05:07:12,267][60143] Updated weights for policy 0, policy_version 23722 (0.0007) +[2023-10-09 05:07:12,650][60143] Updated weights for policy 0, policy_version 23732 (0.0007) +[2023-10-09 05:07:13,019][60143] Updated weights for policy 0, policy_version 23742 (0.0009) +[2023-10-09 05:07:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48889856. Throughput: 0: 1709.2, 1: 1729.5. Samples: 12238312. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:16,052][59242] Avg episode reward: [(0, '25.820'), (1, '26.850')] +[2023-10-09 05:07:16,099][60144] Updated weights for policy 1, policy_version 24002 (0.0010) +[2023-10-09 05:07:16,473][60144] Updated weights for policy 1, policy_version 24012 (0.0009) +[2023-10-09 05:07:16,599][60143] Updated weights for policy 0, policy_version 23752 (0.0009) +[2023-10-09 05:07:16,842][60144] Updated weights for policy 1, policy_version 24022 (0.0007) +[2023-10-09 05:07:16,973][60143] Updated weights for policy 0, policy_version 23762 (0.0009) +[2023-10-09 05:07:17,206][60144] Updated weights for policy 1, policy_version 24032 (0.0008) +[2023-10-09 05:07:17,349][60143] Updated weights for policy 0, policy_version 23772 (0.0008) +[2023-10-09 05:07:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 48955392. Throughput: 0: 1691.1, 1: 1710.6. Samples: 12247582. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:21,052][59242] Avg episode reward: [(0, '25.280'), (1, '26.790')] +[2023-10-09 05:07:21,057][60144] Updated weights for policy 1, policy_version 24042 (0.0007) +[2023-10-09 05:07:21,401][60143] Updated weights for policy 0, policy_version 23782 (0.0007) +[2023-10-09 05:07:21,429][60144] Updated weights for policy 1, policy_version 24052 (0.0009) +[2023-10-09 05:07:21,772][60143] Updated weights for policy 0, policy_version 23792 (0.0007) +[2023-10-09 05:07:21,795][60144] Updated weights for policy 1, policy_version 24062 (0.0007) +[2023-10-09 05:07:22,143][60143] Updated weights for policy 0, policy_version 23802 (0.0009) +[2023-10-09 05:07:25,833][60144] Updated weights for policy 1, policy_version 24072 (0.0011) +[2023-10-09 05:07:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 49020928. Throughput: 0: 1709.8, 1: 1727.1. Samples: 12268854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:26,053][59242] Avg episode reward: [(0, '24.230'), (1, '27.010')] +[2023-10-09 05:07:26,146][60143] Updated weights for policy 0, policy_version 23812 (0.0008) +[2023-10-09 05:07:26,200][60144] Updated weights for policy 1, policy_version 24082 (0.0008) +[2023-10-09 05:07:26,514][60143] Updated weights for policy 0, policy_version 23822 (0.0007) +[2023-10-09 05:07:26,565][60144] Updated weights for policy 1, policy_version 24092 (0.0008) +[2023-10-09 05:07:26,892][60143] Updated weights for policy 0, policy_version 23832 (0.0008) +[2023-10-09 05:07:30,463][60144] Updated weights for policy 1, policy_version 24102 (0.0009) +[2023-10-09 05:07:30,841][60144] Updated weights for policy 1, policy_version 24112 (0.0008) +[2023-10-09 05:07:30,963][60143] Updated weights for policy 0, policy_version 23842 (0.0009) +[2023-10-09 05:07:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 49086464. Throughput: 0: 1710.5, 1: 1721.4. Samples: 12289578. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:31,053][59242] Avg episode reward: [(0, '25.320'), (1, '25.930')] +[2023-10-09 05:07:31,209][60144] Updated weights for policy 1, policy_version 24122 (0.0007) +[2023-10-09 05:07:31,342][60143] Updated weights for policy 0, policy_version 23852 (0.0009) +[2023-10-09 05:07:31,424][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000024128_24707072.pth... 
+[2023-10-09 05:07:31,462][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000022496_23035904.pth +[2023-10-09 05:07:31,466][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000024128_24707072.pth +[2023-10-09 05:07:31,697][60143] Updated weights for policy 0, policy_version 23862 (0.0011) +[2023-10-09 05:07:32,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000023872_24444928.pth... +[2023-10-09 05:07:32,068][60143] Updated weights for policy 0, policy_version 23872 (0.0010) +[2023-10-09 05:07:32,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000022272_22806528.pth +[2023-10-09 05:07:32,103][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000023872_24444928.pth +[2023-10-09 05:07:35,134][60144] Updated weights for policy 1, policy_version 24132 (0.0008) +[2023-10-09 05:07:35,497][60144] Updated weights for policy 1, policy_version 24142 (0.0009) +[2023-10-09 05:07:35,863][60144] Updated weights for policy 1, policy_version 24152 (0.0008) +[2023-10-09 05:07:36,018][60143] Updated weights for policy 0, policy_version 23882 (0.0007) +[2023-10-09 05:07:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 49152000. Throughput: 0: 1704.3, 1: 1727.3. Samples: 12299090. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:07:36,053][59242] Avg episode reward: [(0, '24.990'), (1, '25.950')] +[2023-10-09 05:07:36,385][60143] Updated weights for policy 0, policy_version 23892 (0.0009) +[2023-10-09 05:07:36,752][60143] Updated weights for policy 0, policy_version 23902 (0.0007) +[2023-10-09 05:07:39,850][60144] Updated weights for policy 1, policy_version 24162 (0.0010) +[2023-10-09 05:07:40,215][60144] Updated weights for policy 1, policy_version 24172 (0.0008) +[2023-10-09 05:07:40,582][60144] Updated weights for policy 1, policy_version 24182 (0.0007) +[2023-10-09 05:07:40,633][60143] Updated weights for policy 0, policy_version 23912 (0.0008) +[2023-10-09 05:07:40,944][60144] Updated weights for policy 1, policy_version 24192 (0.0007) +[2023-10-09 05:07:41,000][60143] Updated weights for policy 0, policy_version 23922 (0.0009) +[2023-10-09 05:07:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 49250304. Throughput: 0: 1713.6, 1: 1721.6. Samples: 12320426. Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) +[2023-10-09 05:07:41,053][59242] Avg episode reward: [(0, '24.540'), (1, '25.810')] +[2023-10-09 05:07:41,374][60143] Updated weights for policy 0, policy_version 23932 (0.0011) +[2023-10-09 05:07:44,909][60144] Updated weights for policy 1, policy_version 24202 (0.0007) +[2023-10-09 05:07:45,282][60144] Updated weights for policy 1, policy_version 24212 (0.0009) +[2023-10-09 05:07:45,478][60143] Updated weights for policy 0, policy_version 23942 (0.0008) +[2023-10-09 05:07:45,646][60144] Updated weights for policy 1, policy_version 24222 (0.0009) +[2023-10-09 05:07:45,842][60143] Updated weights for policy 0, policy_version 23952 (0.0009) +[2023-10-09 05:07:46,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 49315840. Throughput: 0: 1703.9, 1: 1697.0. Samples: 12340130. 
Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) +[2023-10-09 05:07:46,053][59242] Avg episode reward: [(0, '25.810'), (1, '25.530')] +[2023-10-09 05:07:46,220][60143] Updated weights for policy 0, policy_version 23962 (0.0012) +[2023-10-09 05:07:49,611][60144] Updated weights for policy 1, policy_version 24232 (0.0008) +[2023-10-09 05:07:49,985][60144] Updated weights for policy 1, policy_version 24242 (0.0011) +[2023-10-09 05:07:50,228][60143] Updated weights for policy 0, policy_version 23972 (0.0008) +[2023-10-09 05:07:50,350][60144] Updated weights for policy 1, policy_version 24252 (0.0008) +[2023-10-09 05:07:50,597][60143] Updated weights for policy 0, policy_version 23982 (0.0007) +[2023-10-09 05:07:50,957][60143] Updated weights for policy 0, policy_version 23992 (0.0010) +[2023-10-09 05:07:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 49381376. Throughput: 0: 1709.5, 1: 1723.2. Samples: 12350826. Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) +[2023-10-09 05:07:51,053][59242] Avg episode reward: [(0, '25.050'), (1, '25.450')] +[2023-10-09 05:07:54,392][60144] Updated weights for policy 1, policy_version 24262 (0.0008) +[2023-10-09 05:07:54,764][60144] Updated weights for policy 1, policy_version 24272 (0.0007) +[2023-10-09 05:07:54,903][60143] Updated weights for policy 0, policy_version 24002 (0.0009) +[2023-10-09 05:07:55,126][60144] Updated weights for policy 1, policy_version 24282 (0.0008) +[2023-10-09 05:07:55,278][60143] Updated weights for policy 0, policy_version 24012 (0.0008) +[2023-10-09 05:07:55,648][60143] Updated weights for policy 0, policy_version 24022 (0.0010) +[2023-10-09 05:07:56,013][60143] Updated weights for policy 0, policy_version 24032 (0.0007) +[2023-10-09 05:07:56,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 49479680. Throughput: 0: 1714.9, 1: 1712.5. Samples: 12371670. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:07:56,052][59242] Avg episode reward: [(0, '25.300'), (1, '26.530')] +[2023-10-09 05:07:59,238][60144] Updated weights for policy 1, policy_version 24292 (0.0009) +[2023-10-09 05:07:59,598][60144] Updated weights for policy 1, policy_version 24302 (0.0007) +[2023-10-09 05:07:59,976][60144] Updated weights for policy 1, policy_version 24312 (0.0008) +[2023-10-09 05:08:00,103][60143] Updated weights for policy 0, policy_version 24042 (0.0008) +[2023-10-09 05:08:00,478][60143] Updated weights for policy 0, policy_version 24052 (0.0008) +[2023-10-09 05:08:00,843][60143] Updated weights for policy 0, policy_version 24062 (0.0008) +[2023-10-09 05:08:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 49545216. Throughput: 0: 1695.7, 1: 1699.0. Samples: 12391076. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:08:01,053][59242] Avg episode reward: [(0, '25.190'), (1, '27.240')] +[2023-10-09 05:08:03,797][60144] Updated weights for policy 1, policy_version 24322 (0.0007) +[2023-10-09 05:08:04,168][60144] Updated weights for policy 1, policy_version 24332 (0.0008) +[2023-10-09 05:08:04,534][60144] Updated weights for policy 1, policy_version 24342 (0.0010) +[2023-10-09 05:08:04,788][60143] Updated weights for policy 0, policy_version 24072 (0.0008) +[2023-10-09 05:08:04,908][60144] Updated weights for policy 1, policy_version 24352 (0.0010) +[2023-10-09 05:08:05,165][60143] Updated weights for policy 0, policy_version 24082 (0.0010) +[2023-10-09 05:08:05,532][60143] Updated weights for policy 0, policy_version 24092 (0.0009) +[2023-10-09 05:08:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 49610752. Throughput: 0: 1713.9, 1: 1733.4. Samples: 12402712. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:08:06,053][59242] Avg episode reward: [(0, '24.500'), (1, '27.530')] +[2023-10-09 05:08:08,920][60144] Updated weights for policy 1, policy_version 24362 (0.0008) +[2023-10-09 05:08:09,283][60144] Updated weights for policy 1, policy_version 24372 (0.0007) +[2023-10-09 05:08:09,401][60143] Updated weights for policy 0, policy_version 24102 (0.0007) +[2023-10-09 05:08:09,654][60144] Updated weights for policy 1, policy_version 24382 (0.0008) +[2023-10-09 05:08:09,773][60143] Updated weights for policy 0, policy_version 24112 (0.0007) +[2023-10-09 05:08:10,142][60143] Updated weights for policy 0, policy_version 24122 (0.0008) +[2023-10-09 05:08:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 49676288. Throughput: 0: 1715.0, 1: 1703.2. Samples: 12422670. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:08:11,052][59242] Avg episode reward: [(0, '23.810'), (1, '28.310')] +[2023-10-09 05:08:13,418][60144] Updated weights for policy 1, policy_version 24392 (0.0008) +[2023-10-09 05:08:13,790][60144] Updated weights for policy 1, policy_version 24402 (0.0009) +[2023-10-09 05:08:14,156][60144] Updated weights for policy 1, policy_version 24412 (0.0008) +[2023-10-09 05:08:14,214][60143] Updated weights for policy 0, policy_version 24132 (0.0007) +[2023-10-09 05:08:14,591][60143] Updated weights for policy 0, policy_version 24142 (0.0008) +[2023-10-09 05:08:14,958][60143] Updated weights for policy 0, policy_version 24152 (0.0008) +[2023-10-09 05:08:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 49741824. Throughput: 0: 1694.7, 1: 1711.8. Samples: 12442868. 
Policy #0 lag: (min: 5.0, avg: 6.9, max: 27.0) +[2023-10-09 05:08:16,053][59242] Avg episode reward: [(0, '23.650'), (1, '28.060')] +[2023-10-09 05:08:17,995][60144] Updated weights for policy 1, policy_version 24422 (0.0007) +[2023-10-09 05:08:18,381][60144] Updated weights for policy 1, policy_version 24432 (0.0007) +[2023-10-09 05:08:18,752][60144] Updated weights for policy 1, policy_version 24442 (0.0007) +[2023-10-09 05:08:18,970][60143] Updated weights for policy 0, policy_version 24162 (0.0008) +[2023-10-09 05:08:19,342][60143] Updated weights for policy 0, policy_version 24172 (0.0009) +[2023-10-09 05:08:19,717][60143] Updated weights for policy 0, policy_version 24182 (0.0009) +[2023-10-09 05:08:20,078][60143] Updated weights for policy 0, policy_version 24192 (0.0010) +[2023-10-09 05:08:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 49807360. Throughput: 0: 1729.0, 1: 1715.4. Samples: 12454090. Policy #0 lag: (min: 5.0, avg: 6.9, max: 27.0) +[2023-10-09 05:08:21,053][59242] Avg episode reward: [(0, '23.860'), (1, '28.890')] +[2023-10-09 05:08:21,055][60003] Saving new best policy, reward=28.890! +[2023-10-09 05:08:22,711][60144] Updated weights for policy 1, policy_version 24452 (0.0009) +[2023-10-09 05:08:23,064][60144] Updated weights for policy 1, policy_version 24462 (0.0008) +[2023-10-09 05:08:23,431][60144] Updated weights for policy 1, policy_version 24472 (0.0007) +[2023-10-09 05:08:24,058][60143] Updated weights for policy 0, policy_version 24202 (0.0009) +[2023-10-09 05:08:24,425][60143] Updated weights for policy 0, policy_version 24212 (0.0007) +[2023-10-09 05:08:24,800][60143] Updated weights for policy 0, policy_version 24222 (0.0010) +[2023-10-09 05:08:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 49872896. Throughput: 0: 1707.5, 1: 1708.8. Samples: 12474158. 
Policy #0 lag: (min: 5.0, avg: 6.9, max: 27.0) +[2023-10-09 05:08:26,053][59242] Avg episode reward: [(0, '23.500'), (1, '28.210')] +[2023-10-09 05:08:27,407][60144] Updated weights for policy 1, policy_version 24482 (0.0008) +[2023-10-09 05:08:27,775][60144] Updated weights for policy 1, policy_version 24492 (0.0008) +[2023-10-09 05:08:28,143][60144] Updated weights for policy 1, policy_version 24502 (0.0007) +[2023-10-09 05:08:28,508][60144] Updated weights for policy 1, policy_version 24512 (0.0007) +[2023-10-09 05:08:28,738][60143] Updated weights for policy 0, policy_version 24232 (0.0008) +[2023-10-09 05:08:29,104][60143] Updated weights for policy 0, policy_version 24242 (0.0007) +[2023-10-09 05:08:29,482][60143] Updated weights for policy 0, policy_version 24252 (0.0007) +[2023-10-09 05:08:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 49938432. Throughput: 0: 1705.6, 1: 1735.3. Samples: 12494970. Policy #0 lag: (min: 5.0, avg: 6.9, max: 27.0) +[2023-10-09 05:08:31,053][59242] Avg episode reward: [(0, '23.590'), (1, '27.770')] +[2023-10-09 05:08:32,440][60144] Updated weights for policy 1, policy_version 24522 (0.0007) +[2023-10-09 05:08:32,807][60144] Updated weights for policy 1, policy_version 24532 (0.0007) +[2023-10-09 05:08:33,175][60144] Updated weights for policy 1, policy_version 24542 (0.0007) +[2023-10-09 05:08:33,478][60143] Updated weights for policy 0, policy_version 24262 (0.0009) +[2023-10-09 05:08:33,856][60143] Updated weights for policy 0, policy_version 24272 (0.0008) +[2023-10-09 05:08:34,226][60143] Updated weights for policy 0, policy_version 24282 (0.0007) +[2023-10-09 05:08:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 50003968. Throughput: 0: 1725.4, 1: 1711.6. Samples: 12505492. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:08:36,053][59242] Avg episode reward: [(0, '24.870'), (1, '26.050')] +[2023-10-09 05:08:37,108][60144] Updated weights for policy 1, policy_version 24552 (0.0007) +[2023-10-09 05:08:37,481][60144] Updated weights for policy 1, policy_version 24562 (0.0008) +[2023-10-09 05:08:37,849][60144] Updated weights for policy 1, policy_version 24572 (0.0008) +[2023-10-09 05:08:38,265][60143] Updated weights for policy 0, policy_version 24292 (0.0009) +[2023-10-09 05:08:38,634][60143] Updated weights for policy 0, policy_version 24302 (0.0007) +[2023-10-09 05:08:39,000][60143] Updated weights for policy 0, policy_version 24312 (0.0007) +[2023-10-09 05:08:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50069504. Throughput: 0: 1695.0, 1: 1722.0. Samples: 12525438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:08:41,053][59242] Avg episode reward: [(0, '24.280'), (1, '26.110')] +[2023-10-09 05:08:41,846][60144] Updated weights for policy 1, policy_version 24582 (0.0008) +[2023-10-09 05:08:42,219][60144] Updated weights for policy 1, policy_version 24592 (0.0009) +[2023-10-09 05:08:42,596][60144] Updated weights for policy 1, policy_version 24602 (0.0010) +[2023-10-09 05:08:42,958][60143] Updated weights for policy 0, policy_version 24322 (0.0008) +[2023-10-09 05:08:43,334][60143] Updated weights for policy 0, policy_version 24332 (0.0008) +[2023-10-09 05:08:43,700][60143] Updated weights for policy 0, policy_version 24342 (0.0010) +[2023-10-09 05:08:44,067][60143] Updated weights for policy 0, policy_version 24352 (0.0011) +[2023-10-09 05:08:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 50135040. Throughput: 0: 1713.9, 1: 1745.0. Samples: 12546726. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:08:46,053][59242] Avg episode reward: [(0, '24.110'), (1, '27.010')] +[2023-10-09 05:08:46,503][60144] Updated weights for policy 1, policy_version 24612 (0.0010) +[2023-10-09 05:08:46,878][60144] Updated weights for policy 1, policy_version 24622 (0.0007) +[2023-10-09 05:08:47,240][60144] Updated weights for policy 1, policy_version 24632 (0.0009) +[2023-10-09 05:08:47,999][60143] Updated weights for policy 0, policy_version 24362 (0.0009) +[2023-10-09 05:08:48,378][60143] Updated weights for policy 0, policy_version 24372 (0.0009) +[2023-10-09 05:08:48,750][60143] Updated weights for policy 0, policy_version 24382 (0.0008) +[2023-10-09 05:08:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50200576. Throughput: 0: 1702.9, 1: 1713.0. Samples: 12556430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:08:51,053][59242] Avg episode reward: [(0, '23.290'), (1, '28.510')] +[2023-10-09 05:08:51,185][60144] Updated weights for policy 1, policy_version 24642 (0.0011) +[2023-10-09 05:08:51,559][60144] Updated weights for policy 1, policy_version 24652 (0.0009) +[2023-10-09 05:08:51,925][60144] Updated weights for policy 1, policy_version 24662 (0.0008) +[2023-10-09 05:08:52,294][60144] Updated weights for policy 1, policy_version 24672 (0.0008) +[2023-10-09 05:08:52,753][60143] Updated weights for policy 0, policy_version 24392 (0.0010) +[2023-10-09 05:08:53,134][60143] Updated weights for policy 0, policy_version 24402 (0.0010) +[2023-10-09 05:08:53,503][60143] Updated weights for policy 0, policy_version 24412 (0.0011) +[2023-10-09 05:08:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 50266112. Throughput: 0: 1692.5, 1: 1742.4. Samples: 12577242. 
Policy #0 lag: (min: 22.0, avg: 45.8, max: 48.0) +[2023-10-09 05:08:56,053][59242] Avg episode reward: [(0, '23.400'), (1, '26.910')] +[2023-10-09 05:08:56,147][60144] Updated weights for policy 1, policy_version 24682 (0.0008) +[2023-10-09 05:08:56,514][60144] Updated weights for policy 1, policy_version 24692 (0.0007) +[2023-10-09 05:08:56,880][60144] Updated weights for policy 1, policy_version 24702 (0.0007) +[2023-10-09 05:08:57,585][60143] Updated weights for policy 0, policy_version 24422 (0.0010) +[2023-10-09 05:08:57,957][60143] Updated weights for policy 0, policy_version 24432 (0.0007) +[2023-10-09 05:08:58,329][60143] Updated weights for policy 0, policy_version 24442 (0.0009) +[2023-10-09 05:09:00,807][60144] Updated weights for policy 1, policy_version 24712 (0.0010) +[2023-10-09 05:09:01,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 50331648. Throughput: 0: 1714.9, 1: 1743.2. Samples: 12598480. Policy #0 lag: (min: 22.0, avg: 45.8, max: 48.0) +[2023-10-09 05:09:01,053][59242] Avg episode reward: [(0, '24.320'), (1, '26.570')] +[2023-10-09 05:09:01,173][60144] Updated weights for policy 1, policy_version 24722 (0.0008) +[2023-10-09 05:09:01,537][60144] Updated weights for policy 1, policy_version 24732 (0.0007) +[2023-10-09 05:09:02,484][60143] Updated weights for policy 0, policy_version 24452 (0.0009) +[2023-10-09 05:09:02,855][60143] Updated weights for policy 0, policy_version 24462 (0.0010) +[2023-10-09 05:09:03,228][60143] Updated weights for policy 0, policy_version 24472 (0.0010) +[2023-10-09 05:09:05,458][60144] Updated weights for policy 1, policy_version 24742 (0.0008) +[2023-10-09 05:09:05,838][60144] Updated weights for policy 1, policy_version 24752 (0.0010) +[2023-10-09 05:09:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 50397184. Throughput: 0: 1686.1, 1: 1736.9. Samples: 12608126. 
Policy #0 lag: (min: 22.0, avg: 45.8, max: 48.0) +[2023-10-09 05:09:06,053][59242] Avg episode reward: [(0, '24.770'), (1, '25.920')] +[2023-10-09 05:09:06,207][60144] Updated weights for policy 1, policy_version 24762 (0.0009) +[2023-10-09 05:09:06,989][60143] Updated weights for policy 0, policy_version 24482 (0.0008) +[2023-10-09 05:09:07,351][60143] Updated weights for policy 0, policy_version 24492 (0.0007) +[2023-10-09 05:09:07,729][60143] Updated weights for policy 0, policy_version 24502 (0.0009) +[2023-10-09 05:09:08,096][60143] Updated weights for policy 0, policy_version 24512 (0.0009) +[2023-10-09 05:09:10,159][60144] Updated weights for policy 1, policy_version 24772 (0.0007) +[2023-10-09 05:09:10,530][60144] Updated weights for policy 1, policy_version 24782 (0.0007) +[2023-10-09 05:09:10,893][60144] Updated weights for policy 1, policy_version 24792 (0.0009) +[2023-10-09 05:09:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 50462720. Throughput: 0: 1704.1, 1: 1745.5. Samples: 12629390. Policy #0 lag: (min: 22.0, avg: 45.8, max: 48.0) +[2023-10-09 05:09:11,053][59242] Avg episode reward: [(0, '24.110'), (1, '26.290')] +[2023-10-09 05:09:11,978][60143] Updated weights for policy 0, policy_version 24522 (0.0009) +[2023-10-09 05:09:12,340][60143] Updated weights for policy 0, policy_version 24532 (0.0009) +[2023-10-09 05:09:12,715][60143] Updated weights for policy 0, policy_version 24542 (0.0007) +[2023-10-09 05:09:14,700][60144] Updated weights for policy 1, policy_version 24802 (0.0008) +[2023-10-09 05:09:15,060][60144] Updated weights for policy 1, policy_version 24812 (0.0007) +[2023-10-09 05:09:15,432][60144] Updated weights for policy 1, policy_version 24822 (0.0008) +[2023-10-09 05:09:15,799][60144] Updated weights for policy 1, policy_version 24832 (0.0009) +[2023-10-09 05:09:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 50561024. 
Throughput: 0: 1715.5, 1: 1723.5. Samples: 12649726. Policy #0 lag: (min: 3.0, avg: 10.5, max: 35.0) +[2023-10-09 05:09:16,053][59242] Avg episode reward: [(0, '24.570'), (1, '26.260')] +[2023-10-09 05:09:16,711][60143] Updated weights for policy 0, policy_version 24552 (0.0007) +[2023-10-09 05:09:17,093][60143] Updated weights for policy 0, policy_version 24562 (0.0008) +[2023-10-09 05:09:17,469][60143] Updated weights for policy 0, policy_version 24572 (0.0010) +[2023-10-09 05:09:19,569][60144] Updated weights for policy 1, policy_version 24842 (0.0009) +[2023-10-09 05:09:19,939][60144] Updated weights for policy 1, policy_version 24852 (0.0009) +[2023-10-09 05:09:20,305][60144] Updated weights for policy 1, policy_version 24862 (0.0008) +[2023-10-09 05:09:21,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 50626560. Throughput: 0: 1687.4, 1: 1749.9. Samples: 12660170. Policy #0 lag: (min: 3.0, avg: 10.5, max: 35.0) +[2023-10-09 05:09:21,052][59242] Avg episode reward: [(0, '24.530'), (1, '26.050')] +[2023-10-09 05:09:21,264][60143] Updated weights for policy 0, policy_version 24582 (0.0009) +[2023-10-09 05:09:21,639][60143] Updated weights for policy 0, policy_version 24592 (0.0010) +[2023-10-09 05:09:22,011][60143] Updated weights for policy 0, policy_version 24602 (0.0009) +[2023-10-09 05:09:24,428][60144] Updated weights for policy 1, policy_version 24872 (0.0010) +[2023-10-09 05:09:24,790][60144] Updated weights for policy 1, policy_version 24882 (0.0010) +[2023-10-09 05:09:25,155][60144] Updated weights for policy 1, policy_version 24892 (0.0010) +[2023-10-09 05:09:25,990][60143] Updated weights for policy 0, policy_version 24612 (0.0010) +[2023-10-09 05:09:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50692096. Throughput: 0: 1719.2, 1: 1733.1. Samples: 12680790. 
Policy #0 lag: (min: 3.0, avg: 10.5, max: 35.0) +[2023-10-09 05:09:26,053][59242] Avg episode reward: [(0, '24.180'), (1, '26.930')] +[2023-10-09 05:09:26,351][60143] Updated weights for policy 0, policy_version 24622 (0.0009) +[2023-10-09 05:09:26,726][60143] Updated weights for policy 0, policy_version 24632 (0.0007) +[2023-10-09 05:09:29,213][60144] Updated weights for policy 1, policy_version 24902 (0.0010) +[2023-10-09 05:09:29,585][60144] Updated weights for policy 1, policy_version 24912 (0.0010) +[2023-10-09 05:09:29,955][60144] Updated weights for policy 1, policy_version 24922 (0.0011) +[2023-10-09 05:09:30,851][60143] Updated weights for policy 0, policy_version 24642 (0.0009) +[2023-10-09 05:09:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50757632. Throughput: 0: 1715.0, 1: 1713.3. Samples: 12701002. Policy #0 lag: (min: 3.0, avg: 10.5, max: 35.0) +[2023-10-09 05:09:31,053][59242] Avg episode reward: [(0, '24.290'), (1, '26.670')] +[2023-10-09 05:09:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000024928_25526272.pth... +[2023-10-09 05:09:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000023328_23887872.pth +[2023-10-09 05:09:31,226][60143] Updated weights for policy 0, policy_version 24652 (0.0007) +[2023-10-09 05:09:31,583][60143] Updated weights for policy 0, policy_version 24662 (0.0008) +[2023-10-09 05:09:31,951][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000024672_25264128.pth... 
+[2023-10-09 05:09:31,954][60143] Updated weights for policy 0, policy_version 24672 (0.0007) +[2023-10-09 05:09:31,982][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000023072_23625728.pth +[2023-10-09 05:09:33,932][60144] Updated weights for policy 1, policy_version 24932 (0.0008) +[2023-10-09 05:09:34,298][60144] Updated weights for policy 1, policy_version 24942 (0.0008) +[2023-10-09 05:09:34,670][60144] Updated weights for policy 1, policy_version 24952 (0.0012) +[2023-10-09 05:09:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 50823168. Throughput: 0: 1706.5, 1: 1746.7. Samples: 12711820. Policy #0 lag: (min: 4.0, avg: 11.9, max: 36.0) +[2023-10-09 05:09:36,053][59242] Avg episode reward: [(0, '23.860'), (1, '26.920')] +[2023-10-09 05:09:36,115][60143] Updated weights for policy 0, policy_version 24682 (0.0011) +[2023-10-09 05:09:36,494][60143] Updated weights for policy 0, policy_version 24692 (0.0008) +[2023-10-09 05:09:36,864][60143] Updated weights for policy 0, policy_version 24702 (0.0009) +[2023-10-09 05:09:38,674][60144] Updated weights for policy 1, policy_version 24962 (0.0010) +[2023-10-09 05:09:39,038][60144] Updated weights for policy 1, policy_version 24972 (0.0009) +[2023-10-09 05:09:39,421][60144] Updated weights for policy 1, policy_version 24982 (0.0011) +[2023-10-09 05:09:39,781][60144] Updated weights for policy 1, policy_version 24992 (0.0011) +[2023-10-09 05:09:40,636][60143] Updated weights for policy 0, policy_version 24712 (0.0009) +[2023-10-09 05:09:41,004][60143] Updated weights for policy 0, policy_version 24722 (0.0009) +[2023-10-09 05:09:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50888704. Throughput: 0: 1717.2, 1: 1719.2. Samples: 12731876. 
Policy #0 lag: (min: 4.0, avg: 11.9, max: 36.0) +[2023-10-09 05:09:41,052][59242] Avg episode reward: [(0, '24.250'), (1, '26.480')] +[2023-10-09 05:09:41,361][60143] Updated weights for policy 0, policy_version 24732 (0.0008) +[2023-10-09 05:09:43,513][60144] Updated weights for policy 1, policy_version 25002 (0.0011) +[2023-10-09 05:09:43,873][60144] Updated weights for policy 1, policy_version 25012 (0.0008) +[2023-10-09 05:09:44,234][60144] Updated weights for policy 1, policy_version 25022 (0.0011) +[2023-10-09 05:09:45,257][60143] Updated weights for policy 0, policy_version 24742 (0.0008) +[2023-10-09 05:09:45,638][60143] Updated weights for policy 0, policy_version 24752 (0.0008) +[2023-10-09 05:09:46,009][60143] Updated weights for policy 0, policy_version 24762 (0.0007) +[2023-10-09 05:09:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 50954240. Throughput: 0: 1710.4, 1: 1715.5. Samples: 12752644. Policy #0 lag: (min: 4.0, avg: 11.9, max: 36.0) +[2023-10-09 05:09:46,053][59242] Avg episode reward: [(0, '23.670'), (1, '27.350')] +[2023-10-09 05:09:48,008][60144] Updated weights for policy 1, policy_version 25032 (0.0009) +[2023-10-09 05:09:48,378][60144] Updated weights for policy 1, policy_version 25042 (0.0007) +[2023-10-09 05:09:48,752][60144] Updated weights for policy 1, policy_version 25052 (0.0009) +[2023-10-09 05:09:49,997][60143] Updated weights for policy 0, policy_version 24772 (0.0008) +[2023-10-09 05:09:50,376][60143] Updated weights for policy 0, policy_version 24782 (0.0009) +[2023-10-09 05:09:50,742][60143] Updated weights for policy 0, policy_version 24792 (0.0009) +[2023-10-09 05:09:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 51052544. Throughput: 0: 1720.1, 1: 1719.6. Samples: 12762910. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:09:51,053][59242] Avg episode reward: [(0, '24.980'), (1, '27.550')] +[2023-10-09 05:09:52,827][60144] Updated weights for policy 1, policy_version 25062 (0.0009) +[2023-10-09 05:09:53,186][60144] Updated weights for policy 1, policy_version 25072 (0.0007) +[2023-10-09 05:09:53,558][60144] Updated weights for policy 1, policy_version 25082 (0.0007) +[2023-10-09 05:09:54,861][60143] Updated weights for policy 0, policy_version 24802 (0.0010) +[2023-10-09 05:09:55,228][60143] Updated weights for policy 0, policy_version 24812 (0.0008) +[2023-10-09 05:09:55,599][60143] Updated weights for policy 0, policy_version 24822 (0.0009) +[2023-10-09 05:09:55,970][60143] Updated weights for policy 0, policy_version 24832 (0.0009) +[2023-10-09 05:09:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 51118080. Throughput: 0: 1716.7, 1: 1706.7. Samples: 12783442. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:09:56,053][59242] Avg episode reward: [(0, '25.410'), (1, '27.250')] +[2023-10-09 05:09:57,504][60144] Updated weights for policy 1, policy_version 25092 (0.0009) +[2023-10-09 05:09:57,875][60144] Updated weights for policy 1, policy_version 25102 (0.0008) +[2023-10-09 05:09:58,245][60144] Updated weights for policy 1, policy_version 25112 (0.0008) +[2023-10-09 05:09:59,878][60143] Updated weights for policy 0, policy_version 24842 (0.0008) +[2023-10-09 05:10:00,247][60143] Updated weights for policy 0, policy_version 24852 (0.0011) +[2023-10-09 05:10:00,619][60143] Updated weights for policy 0, policy_version 24862 (0.0009) +[2023-10-09 05:10:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 51183616. Throughput: 0: 1694.9, 1: 1728.4. Samples: 12803770. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:10:01,053][59242] Avg episode reward: [(0, '24.940'), (1, '26.830')] +[2023-10-09 05:10:02,349][60144] Updated weights for policy 1, policy_version 25122 (0.0008) +[2023-10-09 05:10:02,717][60144] Updated weights for policy 1, policy_version 25132 (0.0008) +[2023-10-09 05:10:03,081][60144] Updated weights for policy 1, policy_version 25142 (0.0007) +[2023-10-09 05:10:03,447][60144] Updated weights for policy 1, policy_version 25152 (0.0008) +[2023-10-09 05:10:04,570][60143] Updated weights for policy 0, policy_version 24872 (0.0010) +[2023-10-09 05:10:04,938][60143] Updated weights for policy 0, policy_version 24882 (0.0008) +[2023-10-09 05:10:05,312][60143] Updated weights for policy 0, policy_version 24892 (0.0008) +[2023-10-09 05:10:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 51249152. Throughput: 0: 1725.9, 1: 1699.2. Samples: 12814300. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:10:06,053][59242] Avg episode reward: [(0, '25.720'), (1, '25.190')] +[2023-10-09 05:10:07,521][60144] Updated weights for policy 1, policy_version 25162 (0.0009) +[2023-10-09 05:10:07,895][60144] Updated weights for policy 1, policy_version 25172 (0.0008) +[2023-10-09 05:10:08,264][60144] Updated weights for policy 1, policy_version 25182 (0.0009) +[2023-10-09 05:10:09,284][60143] Updated weights for policy 0, policy_version 24902 (0.0010) +[2023-10-09 05:10:09,645][60143] Updated weights for policy 0, policy_version 24912 (0.0008) +[2023-10-09 05:10:10,015][60143] Updated weights for policy 0, policy_version 24922 (0.0007) +[2023-10-09 05:10:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 51314688. Throughput: 0: 1710.8, 1: 1710.2. Samples: 12834738. 
Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) +[2023-10-09 05:10:11,053][59242] Avg episode reward: [(0, '25.130'), (1, '26.000')] +[2023-10-09 05:10:12,165][60144] Updated weights for policy 1, policy_version 25192 (0.0009) +[2023-10-09 05:10:12,538][60144] Updated weights for policy 1, policy_version 25202 (0.0007) +[2023-10-09 05:10:12,906][60144] Updated weights for policy 1, policy_version 25212 (0.0007) +[2023-10-09 05:10:14,242][60143] Updated weights for policy 0, policy_version 24932 (0.0009) +[2023-10-09 05:10:14,604][60143] Updated weights for policy 0, policy_version 24942 (0.0009) +[2023-10-09 05:10:14,987][60143] Updated weights for policy 0, policy_version 24952 (0.0010) +[2023-10-09 05:10:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 51380224. Throughput: 0: 1688.3, 1: 1731.0. Samples: 12854870. Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) +[2023-10-09 05:10:16,053][59242] Avg episode reward: [(0, '24.040'), (1, '27.590')] +[2023-10-09 05:10:16,780][60144] Updated weights for policy 1, policy_version 25222 (0.0008) +[2023-10-09 05:10:17,152][60144] Updated weights for policy 1, policy_version 25232 (0.0007) +[2023-10-09 05:10:17,530][60144] Updated weights for policy 1, policy_version 25242 (0.0008) +[2023-10-09 05:10:18,893][60143] Updated weights for policy 0, policy_version 24962 (0.0009) +[2023-10-09 05:10:19,274][60143] Updated weights for policy 0, policy_version 24972 (0.0009) +[2023-10-09 05:10:19,647][60143] Updated weights for policy 0, policy_version 24982 (0.0009) +[2023-10-09 05:10:20,013][60143] Updated weights for policy 0, policy_version 24992 (0.0008) +[2023-10-09 05:10:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 51445760. Throughput: 0: 1720.3, 1: 1697.7. Samples: 12865630. 
Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) +[2023-10-09 05:10:21,053][59242] Avg episode reward: [(0, '24.320'), (1, '27.150')] +[2023-10-09 05:10:21,521][60144] Updated weights for policy 1, policy_version 25252 (0.0009) +[2023-10-09 05:10:21,889][60144] Updated weights for policy 1, policy_version 25262 (0.0007) +[2023-10-09 05:10:22,262][60144] Updated weights for policy 1, policy_version 25272 (0.0007) +[2023-10-09 05:10:24,068][60143] Updated weights for policy 0, policy_version 25002 (0.0009) +[2023-10-09 05:10:24,450][60143] Updated weights for policy 0, policy_version 25012 (0.0008) +[2023-10-09 05:10:24,811][60143] Updated weights for policy 0, policy_version 25022 (0.0007) +[2023-10-09 05:10:26,020][60144] Updated weights for policy 1, policy_version 25282 (0.0008) +[2023-10-09 05:10:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 51511296. Throughput: 0: 1700.0, 1: 1722.3. Samples: 12885876. Policy #0 lag: (min: 29.0, avg: 34.4, max: 61.0) +[2023-10-09 05:10:26,052][59242] Avg episode reward: [(0, '25.520'), (1, '26.670')] +[2023-10-09 05:10:26,393][60144] Updated weights for policy 1, policy_version 25292 (0.0008) +[2023-10-09 05:10:26,764][60144] Updated weights for policy 1, policy_version 25302 (0.0007) +[2023-10-09 05:10:27,133][60144] Updated weights for policy 1, policy_version 25312 (0.0010) +[2023-10-09 05:10:28,612][60143] Updated weights for policy 0, policy_version 25032 (0.0010) +[2023-10-09 05:10:28,980][60143] Updated weights for policy 0, policy_version 25042 (0.0010) +[2023-10-09 05:10:29,350][60143] Updated weights for policy 0, policy_version 25052 (0.0009) +[2023-10-09 05:10:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 51576832. Throughput: 0: 1701.4, 1: 1727.2. Samples: 12906932. 
Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-09 05:10:31,052][59242] Avg episode reward: [(0, '24.790'), (1, '26.670')] +[2023-10-09 05:10:31,073][60144] Updated weights for policy 1, policy_version 25322 (0.0007) +[2023-10-09 05:10:31,440][60144] Updated weights for policy 1, policy_version 25332 (0.0008) +[2023-10-09 05:10:31,810][60144] Updated weights for policy 1, policy_version 25342 (0.0008) +[2023-10-09 05:10:33,294][60143] Updated weights for policy 0, policy_version 25062 (0.0009) +[2023-10-09 05:10:33,669][60143] Updated weights for policy 0, policy_version 25072 (0.0008) +[2023-10-09 05:10:34,037][60143] Updated weights for policy 0, policy_version 25082 (0.0009) +[2023-10-09 05:10:35,564][60144] Updated weights for policy 1, policy_version 25352 (0.0007) +[2023-10-09 05:10:35,920][60144] Updated weights for policy 1, policy_version 25362 (0.0007) +[2023-10-09 05:10:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 51642368. Throughput: 0: 1710.3, 1: 1718.0. Samples: 12917182. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-09 05:10:36,053][59242] Avg episode reward: [(0, '24.940'), (1, '27.490')] +[2023-10-09 05:10:36,290][60144] Updated weights for policy 1, policy_version 25372 (0.0007) +[2023-10-09 05:10:37,986][60143] Updated weights for policy 0, policy_version 25092 (0.0008) +[2023-10-09 05:10:38,366][60143] Updated weights for policy 0, policy_version 25102 (0.0009) +[2023-10-09 05:10:38,733][60143] Updated weights for policy 0, policy_version 25112 (0.0009) +[2023-10-09 05:10:40,214][60144] Updated weights for policy 1, policy_version 25382 (0.0008) +[2023-10-09 05:10:40,592][60144] Updated weights for policy 1, policy_version 25392 (0.0008) +[2023-10-09 05:10:40,961][60144] Updated weights for policy 1, policy_version 25402 (0.0011) +[2023-10-09 05:10:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 51707904. 
Throughput: 0: 1694.7, 1: 1740.0. Samples: 12938004. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-09 05:10:41,052][59242] Avg episode reward: [(0, '26.040'), (1, '26.750')] +[2023-10-09 05:10:42,773][60143] Updated weights for policy 0, policy_version 25122 (0.0007) +[2023-10-09 05:10:43,144][60143] Updated weights for policy 0, policy_version 25132 (0.0009) +[2023-10-09 05:10:43,522][60143] Updated weights for policy 0, policy_version 25142 (0.0008) +[2023-10-09 05:10:43,892][60143] Updated weights for policy 0, policy_version 25152 (0.0009) +[2023-10-09 05:10:44,991][60144] Updated weights for policy 1, policy_version 25412 (0.0010) +[2023-10-09 05:10:45,364][60144] Updated weights for policy 1, policy_version 25422 (0.0011) +[2023-10-09 05:10:45,729][60144] Updated weights for policy 1, policy_version 25432 (0.0010) +[2023-10-09 05:10:46,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 51806208. Throughput: 0: 1714.0, 1: 1719.2. Samples: 12958262. Policy #0 lag: (min: 1.0, avg: 14.4, max: 33.0) +[2023-10-09 05:10:46,052][59242] Avg episode reward: [(0, '25.920'), (1, '26.840')] +[2023-10-09 05:10:47,979][60143] Updated weights for policy 0, policy_version 25162 (0.0009) +[2023-10-09 05:10:48,348][60143] Updated weights for policy 0, policy_version 25172 (0.0008) +[2023-10-09 05:10:48,714][60143] Updated weights for policy 0, policy_version 25182 (0.0011) +[2023-10-09 05:10:49,573][60144] Updated weights for policy 1, policy_version 25442 (0.0007) +[2023-10-09 05:10:49,940][60144] Updated weights for policy 1, policy_version 25452 (0.0008) +[2023-10-09 05:10:50,308][60144] Updated weights for policy 1, policy_version 25462 (0.0007) +[2023-10-09 05:10:50,673][60144] Updated weights for policy 1, policy_version 25472 (0.0008) +[2023-10-09 05:10:51,052][59242] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 51871744. Throughput: 0: 1690.8, 1: 1741.9. Samples: 12968770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:10:51,054][59242] Avg episode reward: [(0, '25.420'), (1, '26.840')] +[2023-10-09 05:10:52,813][60143] Updated weights for policy 0, policy_version 25192 (0.0008) +[2023-10-09 05:10:53,185][60143] Updated weights for policy 0, policy_version 25202 (0.0010) +[2023-10-09 05:10:53,560][60143] Updated weights for policy 0, policy_version 25212 (0.0010) +[2023-10-09 05:10:54,540][60144] Updated weights for policy 1, policy_version 25482 (0.0009) +[2023-10-09 05:10:54,914][60144] Updated weights for policy 1, policy_version 25492 (0.0010) +[2023-10-09 05:10:55,273][60144] Updated weights for policy 1, policy_version 25502 (0.0007) +[2023-10-09 05:10:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 51937280. Throughput: 0: 1691.1, 1: 1745.9. Samples: 12989400. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:10:56,053][59242] Avg episode reward: [(0, '26.460'), (1, '26.930')] +[2023-10-09 05:10:57,341][60143] Updated weights for policy 0, policy_version 25222 (0.0009) +[2023-10-09 05:10:57,694][60143] Updated weights for policy 0, policy_version 25232 (0.0008) +[2023-10-09 05:10:58,058][60143] Updated weights for policy 0, policy_version 25242 (0.0010) +[2023-10-09 05:10:59,319][60144] Updated weights for policy 1, policy_version 25512 (0.0009) +[2023-10-09 05:10:59,690][60144] Updated weights for policy 1, policy_version 25522 (0.0008) +[2023-10-09 05:11:00,061][60144] Updated weights for policy 1, policy_version 25532 (0.0007) +[2023-10-09 05:11:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52002816. Throughput: 0: 1720.7, 1: 1723.2. Samples: 13009844. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:01,053][59242] Avg episode reward: [(0, '25.850'), (1, '28.550')] +[2023-10-09 05:11:02,082][60143] Updated weights for policy 0, policy_version 25252 (0.0011) +[2023-10-09 05:11:02,453][60143] Updated weights for policy 0, policy_version 25262 (0.0009) +[2023-10-09 05:11:02,818][60143] Updated weights for policy 0, policy_version 25272 (0.0008) +[2023-10-09 05:11:04,113][60144] Updated weights for policy 1, policy_version 25542 (0.0007) +[2023-10-09 05:11:04,471][60144] Updated weights for policy 1, policy_version 25552 (0.0007) +[2023-10-09 05:11:04,844][60144] Updated weights for policy 1, policy_version 25562 (0.0008) +[2023-10-09 05:11:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52068352. Throughput: 0: 1686.7, 1: 1753.3. Samples: 13020430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:06,053][59242] Avg episode reward: [(0, '25.190'), (1, '27.580')] +[2023-10-09 05:11:06,781][60143] Updated weights for policy 0, policy_version 25282 (0.0011) +[2023-10-09 05:11:07,160][60143] Updated weights for policy 0, policy_version 25292 (0.0009) +[2023-10-09 05:11:07,525][60143] Updated weights for policy 0, policy_version 25302 (0.0011) +[2023-10-09 05:11:07,904][60143] Updated weights for policy 0, policy_version 25312 (0.0009) +[2023-10-09 05:11:08,739][60144] Updated weights for policy 1, policy_version 25572 (0.0009) +[2023-10-09 05:11:09,104][60144] Updated weights for policy 1, policy_version 25582 (0.0007) +[2023-10-09 05:11:09,474][60144] Updated weights for policy 1, policy_version 25592 (0.0009) +[2023-10-09 05:11:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52133888. Throughput: 0: 1708.4, 1: 1733.1. Samples: 13040748. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 05:11:11,053][59242] Avg episode reward: [(0, '23.580'), (1, '26.670')] +[2023-10-09 05:11:12,013][60143] Updated weights for policy 0, policy_version 25322 (0.0010) +[2023-10-09 05:11:12,382][60143] Updated weights for policy 0, policy_version 25332 (0.0010) +[2023-10-09 05:11:12,763][60143] Updated weights for policy 0, policy_version 25342 (0.0007) +[2023-10-09 05:11:13,408][60144] Updated weights for policy 1, policy_version 25602 (0.0009) +[2023-10-09 05:11:13,787][60144] Updated weights for policy 1, policy_version 25612 (0.0009) +[2023-10-09 05:11:14,148][60144] Updated weights for policy 1, policy_version 25622 (0.0009) +[2023-10-09 05:11:14,517][60144] Updated weights for policy 1, policy_version 25632 (0.0011) +[2023-10-09 05:11:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52199424. Throughput: 0: 1713.1, 1: 1721.4. Samples: 13061484. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 05:11:16,053][59242] Avg episode reward: [(0, '24.760'), (1, '27.020')] +[2023-10-09 05:11:16,742][60143] Updated weights for policy 0, policy_version 25352 (0.0009) +[2023-10-09 05:11:17,107][60143] Updated weights for policy 0, policy_version 25362 (0.0008) +[2023-10-09 05:11:17,472][60143] Updated weights for policy 0, policy_version 25372 (0.0008) +[2023-10-09 05:11:18,431][60144] Updated weights for policy 1, policy_version 25642 (0.0009) +[2023-10-09 05:11:18,810][60144] Updated weights for policy 1, policy_version 25652 (0.0008) +[2023-10-09 05:11:19,177][60144] Updated weights for policy 1, policy_version 25662 (0.0008) +[2023-10-09 05:11:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52264960. Throughput: 0: 1690.1, 1: 1740.9. Samples: 13071578. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 05:11:21,053][59242] Avg episode reward: [(0, '24.970'), (1, '26.790')] +[2023-10-09 05:11:21,394][60143] Updated weights for policy 0, policy_version 25382 (0.0008) +[2023-10-09 05:11:21,772][60143] Updated weights for policy 0, policy_version 25392 (0.0010) +[2023-10-09 05:11:22,136][60143] Updated weights for policy 0, policy_version 25402 (0.0009) +[2023-10-09 05:11:23,088][60144] Updated weights for policy 1, policy_version 25672 (0.0009) +[2023-10-09 05:11:23,452][60144] Updated weights for policy 1, policy_version 25682 (0.0010) +[2023-10-09 05:11:23,823][60144] Updated weights for policy 1, policy_version 25692 (0.0009) +[2023-10-09 05:11:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52330496. Throughput: 0: 1712.9, 1: 1712.8. Samples: 13092162. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-09 05:11:26,053][59242] Avg episode reward: [(0, '24.930'), (1, '27.230')] +[2023-10-09 05:11:26,289][60143] Updated weights for policy 0, policy_version 25412 (0.0009) +[2023-10-09 05:11:26,659][60143] Updated weights for policy 0, policy_version 25422 (0.0010) +[2023-10-09 05:11:27,038][60143] Updated weights for policy 0, policy_version 25432 (0.0009) +[2023-10-09 05:11:27,879][60144] Updated weights for policy 1, policy_version 25702 (0.0008) +[2023-10-09 05:11:28,262][60144] Updated weights for policy 1, policy_version 25712 (0.0008) +[2023-10-09 05:11:28,634][60144] Updated weights for policy 1, policy_version 25722 (0.0007) +[2023-10-09 05:11:30,919][60143] Updated weights for policy 0, policy_version 25442 (0.0011) +[2023-10-09 05:11:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52396032. Throughput: 0: 1710.9, 1: 1728.8. Samples: 13113050. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:31,052][59242] Avg episode reward: [(0, '24.690'), (1, '26.780')] +[2023-10-09 05:11:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000025728_26345472.pth... +[2023-10-09 05:11:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000024128_24707072.pth +[2023-10-09 05:11:31,280][60143] Updated weights for policy 0, policy_version 25452 (0.0007) +[2023-10-09 05:11:31,650][60143] Updated weights for policy 0, policy_version 25462 (0.0008) +[2023-10-09 05:11:32,021][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000025472_26083328.pth... +[2023-10-09 05:11:32,022][60143] Updated weights for policy 0, policy_version 25472 (0.0009) +[2023-10-09 05:11:32,060][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000023872_24444928.pth +[2023-10-09 05:11:32,504][60144] Updated weights for policy 1, policy_version 25732 (0.0008) +[2023-10-09 05:11:32,872][60144] Updated weights for policy 1, policy_version 25742 (0.0007) +[2023-10-09 05:11:33,229][60144] Updated weights for policy 1, policy_version 25752 (0.0007) +[2023-10-09 05:11:36,040][60143] Updated weights for policy 0, policy_version 25482 (0.0010) +[2023-10-09 05:11:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52461568. Throughput: 0: 1706.5, 1: 1710.3. Samples: 13122526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:36,053][59242] Avg episode reward: [(0, '24.600'), (1, '26.910')] +[2023-10-09 05:11:36,408][60143] Updated weights for policy 0, policy_version 25492 (0.0007) +[2023-10-09 05:11:36,776][60143] Updated weights for policy 0, policy_version 25502 (0.0010) +[2023-10-09 05:11:37,301][60144] Updated weights for policy 1, policy_version 25762 (0.0008) +[2023-10-09 05:11:37,667][60144] Updated weights for policy 1, policy_version 25772 (0.0010) +[2023-10-09 05:11:38,035][60144] Updated weights for policy 1, policy_version 25782 (0.0010) +[2023-10-09 05:11:38,403][60144] Updated weights for policy 1, policy_version 25792 (0.0011) +[2023-10-09 05:11:40,804][60143] Updated weights for policy 0, policy_version 25512 (0.0009) +[2023-10-09 05:11:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 52527104. Throughput: 0: 1715.2, 1: 1710.7. Samples: 13143564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:41,053][59242] Avg episode reward: [(0, '25.080'), (1, '27.520')] +[2023-10-09 05:11:41,174][60143] Updated weights for policy 0, policy_version 25522 (0.0010) +[2023-10-09 05:11:41,540][60143] Updated weights for policy 0, policy_version 25532 (0.0009) +[2023-10-09 05:11:42,280][60144] Updated weights for policy 1, policy_version 25802 (0.0007) +[2023-10-09 05:11:42,638][60144] Updated weights for policy 1, policy_version 25812 (0.0007) +[2023-10-09 05:11:43,005][60144] Updated weights for policy 1, policy_version 25822 (0.0007) +[2023-10-09 05:11:45,537][60143] Updated weights for policy 0, policy_version 25542 (0.0009) +[2023-10-09 05:11:45,902][60143] Updated weights for policy 0, policy_version 25552 (0.0008) +[2023-10-09 05:11:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 52592640. Throughput: 0: 1709.0, 1: 1730.7. Samples: 13164628. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:11:46,053][59242] Avg episode reward: [(0, '25.770'), (1, '27.250')] +[2023-10-09 05:11:46,281][60143] Updated weights for policy 0, policy_version 25562 (0.0009) +[2023-10-09 05:11:46,916][60144] Updated weights for policy 1, policy_version 25832 (0.0007) +[2023-10-09 05:11:47,285][60144] Updated weights for policy 1, policy_version 25842 (0.0010) +[2023-10-09 05:11:47,661][60144] Updated weights for policy 1, policy_version 25852 (0.0008) +[2023-10-09 05:11:50,209][60143] Updated weights for policy 0, policy_version 25572 (0.0008) +[2023-10-09 05:11:50,584][60143] Updated weights for policy 0, policy_version 25582 (0.0010) +[2023-10-09 05:11:50,956][60143] Updated weights for policy 0, policy_version 25592 (0.0009) +[2023-10-09 05:11:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 52658176. Throughput: 0: 1716.3, 1: 1700.8. Samples: 13174196. Policy #0 lag: (min: 15.0, avg: 21.2, max: 47.0) +[2023-10-09 05:11:51,053][59242] Avg episode reward: [(0, '26.030'), (1, '27.350')] +[2023-10-09 05:11:51,632][60144] Updated weights for policy 1, policy_version 25862 (0.0009) +[2023-10-09 05:11:52,002][60144] Updated weights for policy 1, policy_version 25872 (0.0010) +[2023-10-09 05:11:52,377][60144] Updated weights for policy 1, policy_version 25882 (0.0010) +[2023-10-09 05:11:54,934][60143] Updated weights for policy 0, policy_version 25602 (0.0010) +[2023-10-09 05:11:55,309][60143] Updated weights for policy 0, policy_version 25612 (0.0008) +[2023-10-09 05:11:55,673][60143] Updated weights for policy 0, policy_version 25622 (0.0011) +[2023-10-09 05:11:56,051][60143] Updated weights for policy 0, policy_version 25632 (0.0011) +[2023-10-09 05:11:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 52756480. Throughput: 0: 1715.4, 1: 1718.1. Samples: 13195256. 
Policy #0 lag: (min: 15.0, avg: 21.2, max: 47.0) +[2023-10-09 05:11:56,053][59242] Avg episode reward: [(0, '25.380'), (1, '27.230')] +[2023-10-09 05:11:56,371][60144] Updated weights for policy 1, policy_version 25892 (0.0009) +[2023-10-09 05:11:56,748][60144] Updated weights for policy 1, policy_version 25902 (0.0009) +[2023-10-09 05:11:57,115][60144] Updated weights for policy 1, policy_version 25912 (0.0009) +[2023-10-09 05:12:00,087][60143] Updated weights for policy 0, policy_version 25642 (0.0009) +[2023-10-09 05:12:00,448][60143] Updated weights for policy 0, policy_version 25652 (0.0010) +[2023-10-09 05:12:00,822][60143] Updated weights for policy 0, policy_version 25662 (0.0009) +[2023-10-09 05:12:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 52822016. Throughput: 0: 1692.6, 1: 1722.1. Samples: 13215146. Policy #0 lag: (min: 15.0, avg: 21.2, max: 47.0) +[2023-10-09 05:12:01,053][59242] Avg episode reward: [(0, '25.620'), (1, '26.680')] +[2023-10-09 05:12:01,266][60144] Updated weights for policy 1, policy_version 25922 (0.0009) +[2023-10-09 05:12:01,631][60144] Updated weights for policy 1, policy_version 25932 (0.0008) +[2023-10-09 05:12:02,013][60144] Updated weights for policy 1, policy_version 25942 (0.0008) +[2023-10-09 05:12:02,382][60144] Updated weights for policy 1, policy_version 25952 (0.0009) +[2023-10-09 05:12:04,977][60143] Updated weights for policy 0, policy_version 25672 (0.0009) +[2023-10-09 05:12:05,354][60143] Updated weights for policy 0, policy_version 25682 (0.0008) +[2023-10-09 05:12:05,722][60143] Updated weights for policy 0, policy_version 25692 (0.0010) +[2023-10-09 05:12:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 52887552. Throughput: 0: 1711.4, 1: 1701.5. Samples: 13225158. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:12:06,053][59242] Avg episode reward: [(0, '25.530'), (1, '27.630')] +[2023-10-09 05:12:06,393][60144] Updated weights for policy 1, policy_version 25962 (0.0009) +[2023-10-09 05:12:06,764][60144] Updated weights for policy 1, policy_version 25972 (0.0007) +[2023-10-09 05:12:07,134][60144] Updated weights for policy 1, policy_version 25982 (0.0008) +[2023-10-09 05:12:09,508][60143] Updated weights for policy 0, policy_version 25702 (0.0009) +[2023-10-09 05:12:09,882][60143] Updated weights for policy 0, policy_version 25712 (0.0007) +[2023-10-09 05:12:10,258][60143] Updated weights for policy 0, policy_version 25722 (0.0008) +[2023-10-09 05:12:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 52953088. Throughput: 0: 1701.6, 1: 1720.4. Samples: 13246150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:12:11,052][60144] Updated weights for policy 1, policy_version 25992 (0.0009) +[2023-10-09 05:12:11,052][59242] Avg episode reward: [(0, '25.200'), (1, '28.060')] +[2023-10-09 05:12:11,424][60144] Updated weights for policy 1, policy_version 26002 (0.0008) +[2023-10-09 05:12:11,790][60144] Updated weights for policy 1, policy_version 26012 (0.0007) +[2023-10-09 05:12:14,245][60143] Updated weights for policy 0, policy_version 25732 (0.0007) +[2023-10-09 05:12:14,609][60143] Updated weights for policy 0, policy_version 25742 (0.0009) +[2023-10-09 05:12:14,985][60143] Updated weights for policy 0, policy_version 25752 (0.0008) +[2023-10-09 05:12:15,630][60144] Updated weights for policy 1, policy_version 26022 (0.0007) +[2023-10-09 05:12:16,010][60144] Updated weights for policy 1, policy_version 26032 (0.0007) +[2023-10-09 05:12:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 53018624. Throughput: 0: 1679.9, 1: 1723.4. Samples: 13266196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:12:16,053][59242] Avg episode reward: [(0, '26.890'), (1, '26.650')] +[2023-10-09 05:12:16,387][60144] Updated weights for policy 1, policy_version 26042 (0.0007) +[2023-10-09 05:12:18,987][60143] Updated weights for policy 0, policy_version 25762 (0.0010) +[2023-10-09 05:12:19,365][60143] Updated weights for policy 0, policy_version 25772 (0.0009) +[2023-10-09 05:12:19,739][60143] Updated weights for policy 0, policy_version 25782 (0.0009) +[2023-10-09 05:12:20,110][60143] Updated weights for policy 0, policy_version 25792 (0.0008) +[2023-10-09 05:12:20,226][60144] Updated weights for policy 1, policy_version 26052 (0.0008) +[2023-10-09 05:12:20,600][60144] Updated weights for policy 1, policy_version 26062 (0.0008) +[2023-10-09 05:12:20,961][60144] Updated weights for policy 1, policy_version 26072 (0.0009) +[2023-10-09 05:12:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 53084160. Throughput: 0: 1709.9, 1: 1726.2. Samples: 13277150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:12:21,053][59242] Avg episode reward: [(0, '27.860'), (1, '26.610')] +[2023-10-09 05:12:24,163][60143] Updated weights for policy 0, policy_version 25802 (0.0008) +[2023-10-09 05:12:24,542][60143] Updated weights for policy 0, policy_version 25812 (0.0011) +[2023-10-09 05:12:24,865][60144] Updated weights for policy 1, policy_version 26082 (0.0010) +[2023-10-09 05:12:24,908][60143] Updated weights for policy 0, policy_version 25822 (0.0009) +[2023-10-09 05:12:25,228][60144] Updated weights for policy 1, policy_version 26092 (0.0010) +[2023-10-09 05:12:25,596][60144] Updated weights for policy 1, policy_version 26102 (0.0011) +[2023-10-09 05:12:25,960][60144] Updated weights for policy 1, policy_version 26112 (0.0011) +[2023-10-09 05:12:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 53182464. 
Throughput: 0: 1693.8, 1: 1733.1. Samples: 13297774. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 05:12:26,053][59242] Avg episode reward: [(0, '27.740'), (1, '26.860')] +[2023-10-09 05:12:29,147][60143] Updated weights for policy 0, policy_version 25832 (0.0009) +[2023-10-09 05:12:29,520][60143] Updated weights for policy 0, policy_version 25842 (0.0009) +[2023-10-09 05:12:29,775][60144] Updated weights for policy 1, policy_version 26122 (0.0008) +[2023-10-09 05:12:29,897][60143] Updated weights for policy 0, policy_version 25852 (0.0008) +[2023-10-09 05:12:30,130][60144] Updated weights for policy 1, policy_version 26132 (0.0008) +[2023-10-09 05:12:30,493][60144] Updated weights for policy 1, policy_version 26142 (0.0008) +[2023-10-09 05:12:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 53248000. Throughput: 0: 1683.4, 1: 1710.1. Samples: 13317338. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 05:12:31,053][59242] Avg episode reward: [(0, '26.760'), (1, '27.930')] +[2023-10-09 05:12:33,848][60143] Updated weights for policy 0, policy_version 25862 (0.0009) +[2023-10-09 05:12:34,220][60143] Updated weights for policy 0, policy_version 25872 (0.0009) +[2023-10-09 05:12:34,430][60144] Updated weights for policy 1, policy_version 26152 (0.0009) +[2023-10-09 05:12:34,595][60143] Updated weights for policy 0, policy_version 25882 (0.0008) +[2023-10-09 05:12:34,803][60144] Updated weights for policy 1, policy_version 26162 (0.0007) +[2023-10-09 05:12:35,172][60144] Updated weights for policy 1, policy_version 26172 (0.0007) +[2023-10-09 05:12:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 53313536. Throughput: 0: 1707.9, 1: 1737.4. Samples: 13329232. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 05:12:36,053][59242] Avg episode reward: [(0, '25.980'), (1, '27.940')] +[2023-10-09 05:12:38,586][60143] Updated weights for policy 0, policy_version 25892 (0.0009) +[2023-10-09 05:12:38,962][60143] Updated weights for policy 0, policy_version 25902 (0.0010) +[2023-10-09 05:12:39,023][60144] Updated weights for policy 1, policy_version 26182 (0.0008) +[2023-10-09 05:12:39,332][60143] Updated weights for policy 0, policy_version 25912 (0.0007) +[2023-10-09 05:12:39,392][60144] Updated weights for policy 1, policy_version 26192 (0.0009) +[2023-10-09 05:12:39,756][60144] Updated weights for policy 1, policy_version 26202 (0.0008) +[2023-10-09 05:12:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 53379072. Throughput: 0: 1682.6, 1: 1724.1. Samples: 13348560. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-09 05:12:41,053][59242] Avg episode reward: [(0, '25.740'), (1, '28.140')] +[2023-10-09 05:12:43,446][60143] Updated weights for policy 0, policy_version 25922 (0.0008) +[2023-10-09 05:12:43,821][60143] Updated weights for policy 0, policy_version 25932 (0.0008) +[2023-10-09 05:12:43,867][60144] Updated weights for policy 1, policy_version 26212 (0.0009) +[2023-10-09 05:12:44,184][60143] Updated weights for policy 0, policy_version 25942 (0.0008) +[2023-10-09 05:12:44,239][60144] Updated weights for policy 1, policy_version 26222 (0.0010) +[2023-10-09 05:12:44,561][60143] Updated weights for policy 0, policy_version 25952 (0.0009) +[2023-10-09 05:12:44,605][60144] Updated weights for policy 1, policy_version 26232 (0.0009) +[2023-10-09 05:12:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 53444608. Throughput: 0: 1698.3, 1: 1717.1. Samples: 13368838. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:12:46,053][59242] Avg episode reward: [(0, '26.460'), (1, '28.310')] +[2023-10-09 05:12:48,504][60143] Updated weights for policy 0, policy_version 25962 (0.0010) +[2023-10-09 05:12:48,539][60144] Updated weights for policy 1, policy_version 26242 (0.0011) +[2023-10-09 05:12:48,874][60143] Updated weights for policy 0, policy_version 25972 (0.0008) +[2023-10-09 05:12:48,902][60144] Updated weights for policy 1, policy_version 26252 (0.0009) +[2023-10-09 05:12:49,247][60143] Updated weights for policy 0, policy_version 25982 (0.0008) +[2023-10-09 05:12:49,258][60144] Updated weights for policy 1, policy_version 26262 (0.0008) +[2023-10-09 05:12:49,622][60144] Updated weights for policy 1, policy_version 26272 (0.0008) +[2023-10-09 05:12:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 53510144. Throughput: 0: 1696.3, 1: 1744.1. Samples: 13379978. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:12:51,053][59242] Avg episode reward: [(0, '25.490'), (1, '26.690')] +[2023-10-09 05:12:53,242][60143] Updated weights for policy 0, policy_version 25992 (0.0010) +[2023-10-09 05:12:53,617][60143] Updated weights for policy 0, policy_version 26002 (0.0009) +[2023-10-09 05:12:53,710][60144] Updated weights for policy 1, policy_version 26282 (0.0007) +[2023-10-09 05:12:53,985][60143] Updated weights for policy 0, policy_version 26012 (0.0008) +[2023-10-09 05:12:54,083][60144] Updated weights for policy 1, policy_version 26292 (0.0008) +[2023-10-09 05:12:54,449][60144] Updated weights for policy 1, policy_version 26302 (0.0007) +[2023-10-09 05:12:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 53575680. Throughput: 0: 1680.1, 1: 1717.8. Samples: 13399058. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:12:56,053][59242] Avg episode reward: [(0, '24.900'), (1, '26.960')] +[2023-10-09 05:12:57,856][60143] Updated weights for policy 0, policy_version 26022 (0.0007) +[2023-10-09 05:12:58,212][60143] Updated weights for policy 0, policy_version 26032 (0.0009) +[2023-10-09 05:12:58,333][60144] Updated weights for policy 1, policy_version 26312 (0.0008) +[2023-10-09 05:12:58,580][60143] Updated weights for policy 0, policy_version 26042 (0.0008) +[2023-10-09 05:12:58,699][60144] Updated weights for policy 1, policy_version 26322 (0.0008) +[2023-10-09 05:12:59,060][60144] Updated weights for policy 1, policy_version 26332 (0.0010) +[2023-10-09 05:13:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 53641216. Throughput: 0: 1703.8, 1: 1722.9. Samples: 13420398. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:13:01,053][59242] Avg episode reward: [(0, '24.860'), (1, '26.910')] +[2023-10-09 05:13:02,678][60143] Updated weights for policy 0, policy_version 26052 (0.0008) +[2023-10-09 05:13:03,043][60143] Updated weights for policy 0, policy_version 26062 (0.0009) +[2023-10-09 05:13:03,080][60144] Updated weights for policy 1, policy_version 26342 (0.0009) +[2023-10-09 05:13:03,410][60143] Updated weights for policy 0, policy_version 26072 (0.0008) +[2023-10-09 05:13:03,442][60144] Updated weights for policy 1, policy_version 26352 (0.0008) +[2023-10-09 05:13:03,808][60144] Updated weights for policy 1, policy_version 26362 (0.0008) +[2023-10-09 05:13:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 53706752. Throughput: 0: 1674.2, 1: 1728.6. Samples: 13430276. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:06,052][59242] Avg episode reward: [(0, '24.950'), (1, '26.610')] +[2023-10-09 05:13:07,465][60143] Updated weights for policy 0, policy_version 26082 (0.0007) +[2023-10-09 05:13:07,523][60144] Updated weights for policy 1, policy_version 26372 (0.0007) +[2023-10-09 05:13:07,839][60143] Updated weights for policy 0, policy_version 26092 (0.0008) +[2023-10-09 05:13:07,896][60144] Updated weights for policy 1, policy_version 26382 (0.0008) +[2023-10-09 05:13:08,203][60143] Updated weights for policy 0, policy_version 26102 (0.0008) +[2023-10-09 05:13:08,249][60144] Updated weights for policy 1, policy_version 26392 (0.0008) +[2023-10-09 05:13:08,568][60143] Updated weights for policy 0, policy_version 26112 (0.0009) +[2023-10-09 05:13:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 53772288. Throughput: 0: 1685.4, 1: 1715.2. Samples: 13450800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:11,053][59242] Avg episode reward: [(0, '25.170'), (1, '26.300')] +[2023-10-09 05:13:12,237][60144] Updated weights for policy 1, policy_version 26402 (0.0007) +[2023-10-09 05:13:12,605][60144] Updated weights for policy 1, policy_version 26412 (0.0007) +[2023-10-09 05:13:12,679][60143] Updated weights for policy 0, policy_version 26122 (0.0007) +[2023-10-09 05:13:12,974][60144] Updated weights for policy 1, policy_version 26422 (0.0007) +[2023-10-09 05:13:13,042][60143] Updated weights for policy 0, policy_version 26132 (0.0009) +[2023-10-09 05:13:13,332][60144] Updated weights for policy 1, policy_version 26432 (0.0008) +[2023-10-09 05:13:13,407][60143] Updated weights for policy 0, policy_version 26142 (0.0008) +[2023-10-09 05:13:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 53837824. Throughput: 0: 1697.6, 1: 1741.2. Samples: 13472080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:16,053][59242] Avg episode reward: [(0, '25.680'), (1, '24.900')] +[2023-10-09 05:13:17,197][60144] Updated weights for policy 1, policy_version 26442 (0.0007) +[2023-10-09 05:13:17,528][60143] Updated weights for policy 0, policy_version 26152 (0.0008) +[2023-10-09 05:13:17,567][60144] Updated weights for policy 1, policy_version 26452 (0.0010) +[2023-10-09 05:13:17,897][60143] Updated weights for policy 0, policy_version 26162 (0.0007) +[2023-10-09 05:13:17,934][60144] Updated weights for policy 1, policy_version 26462 (0.0008) +[2023-10-09 05:13:18,268][60143] Updated weights for policy 0, policy_version 26172 (0.0010) +[2023-10-09 05:13:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 53903360. Throughput: 0: 1666.7, 1: 1712.0. Samples: 13481274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:21,052][59242] Avg episode reward: [(0, '25.610'), (1, '25.060')] +[2023-10-09 05:13:22,041][60144] Updated weights for policy 1, policy_version 26472 (0.0008) +[2023-10-09 05:13:22,240][60143] Updated weights for policy 0, policy_version 26182 (0.0008) +[2023-10-09 05:13:22,412][60144] Updated weights for policy 1, policy_version 26482 (0.0008) +[2023-10-09 05:13:22,606][60143] Updated weights for policy 0, policy_version 26192 (0.0008) +[2023-10-09 05:13:22,776][60144] Updated weights for policy 1, policy_version 26492 (0.0008) +[2023-10-09 05:13:22,972][60143] Updated weights for policy 0, policy_version 26202 (0.0009) +[2023-10-09 05:13:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 53968896. Throughput: 0: 1691.6, 1: 1728.8. Samples: 13502480. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:26,053][59242] Avg episode reward: [(0, '24.740'), (1, '24.940')] +[2023-10-09 05:13:26,634][60144] Updated weights for policy 1, policy_version 26502 (0.0007) +[2023-10-09 05:13:26,896][60143] Updated weights for policy 0, policy_version 26212 (0.0010) +[2023-10-09 05:13:26,992][60144] Updated weights for policy 1, policy_version 26512 (0.0007) +[2023-10-09 05:13:27,270][60143] Updated weights for policy 0, policy_version 26222 (0.0009) +[2023-10-09 05:13:27,355][60144] Updated weights for policy 1, policy_version 26522 (0.0008) +[2023-10-09 05:13:27,635][60143] Updated weights for policy 0, policy_version 26232 (0.0008) +[2023-10-09 05:13:31,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 54034432. Throughput: 0: 1699.5, 1: 1743.2. Samples: 13523760. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:31,053][59242] Avg episode reward: [(0, '25.170'), (1, '24.180')] +[2023-10-09 05:13:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000026240_26869760.pth... +[2023-10-09 05:13:31,067][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000026528_27164672.pth... 
+[2023-10-09 05:13:31,105][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000024928_25526272.pth +[2023-10-09 05:13:31,107][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000024672_25264128.pth +[2023-10-09 05:13:31,346][60144] Updated weights for policy 1, policy_version 26532 (0.0008) +[2023-10-09 05:13:31,662][60143] Updated weights for policy 0, policy_version 26242 (0.0007) +[2023-10-09 05:13:31,721][60144] Updated weights for policy 1, policy_version 26542 (0.0009) +[2023-10-09 05:13:32,029][60143] Updated weights for policy 0, policy_version 26252 (0.0008) +[2023-10-09 05:13:32,080][60144] Updated weights for policy 1, policy_version 26552 (0.0009) +[2023-10-09 05:13:32,398][60143] Updated weights for policy 0, policy_version 26262 (0.0008) +[2023-10-09 05:13:32,768][60143] Updated weights for policy 0, policy_version 26272 (0.0008) +[2023-10-09 05:13:36,013][60144] Updated weights for policy 1, policy_version 26562 (0.0007) +[2023-10-09 05:13:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 54099968. Throughput: 0: 1684.0, 1: 1717.2. Samples: 13533032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:36,053][59242] Avg episode reward: [(0, '24.880'), (1, '23.470')] +[2023-10-09 05:13:36,378][60144] Updated weights for policy 1, policy_version 26572 (0.0009) +[2023-10-09 05:13:36,652][60143] Updated weights for policy 0, policy_version 26282 (0.0009) +[2023-10-09 05:13:36,748][60144] Updated weights for policy 1, policy_version 26582 (0.0007) +[2023-10-09 05:13:37,016][60143] Updated weights for policy 0, policy_version 26292 (0.0008) +[2023-10-09 05:13:37,110][60144] Updated weights for policy 1, policy_version 26592 (0.0007) +[2023-10-09 05:13:37,398][60143] Updated weights for policy 0, policy_version 26302 (0.0007) +[2023-10-09 05:13:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). 
Total num frames: 54165504. Throughput: 0: 1705.8, 1: 1743.3. Samples: 13554266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:13:41,053][59242] Avg episode reward: [(0, '24.770'), (1, '23.170')] +[2023-10-09 05:13:41,141][60144] Updated weights for policy 1, policy_version 26602 (0.0010) +[2023-10-09 05:13:41,505][60144] Updated weights for policy 1, policy_version 26612 (0.0009) +[2023-10-09 05:13:41,712][60143] Updated weights for policy 0, policy_version 26312 (0.0009) +[2023-10-09 05:13:41,872][60144] Updated weights for policy 1, policy_version 26622 (0.0008) +[2023-10-09 05:13:42,085][60143] Updated weights for policy 0, policy_version 26322 (0.0008) +[2023-10-09 05:13:42,453][60143] Updated weights for policy 0, policy_version 26332 (0.0009) +[2023-10-09 05:13:45,807][60144] Updated weights for policy 1, policy_version 26632 (0.0008) +[2023-10-09 05:13:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 54231040. Throughput: 0: 1707.4, 1: 1739.6. Samples: 13575510. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 05:13:46,053][59242] Avg episode reward: [(0, '25.380'), (1, '23.270')] +[2023-10-09 05:13:46,167][60144] Updated weights for policy 1, policy_version 26642 (0.0009) +[2023-10-09 05:13:46,338][60143] Updated weights for policy 0, policy_version 26342 (0.0009) +[2023-10-09 05:13:46,528][60144] Updated weights for policy 1, policy_version 26652 (0.0007) +[2023-10-09 05:13:46,704][60143] Updated weights for policy 0, policy_version 26352 (0.0007) +[2023-10-09 05:13:47,080][60143] Updated weights for policy 0, policy_version 26362 (0.0009) +[2023-10-09 05:13:50,580][60144] Updated weights for policy 1, policy_version 26662 (0.0008) +[2023-10-09 05:13:50,954][60144] Updated weights for policy 1, policy_version 26672 (0.0007) +[2023-10-09 05:13:51,010][60143] Updated weights for policy 0, policy_version 26372 (0.0007) +[2023-10-09 05:13:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 54296576. Throughput: 0: 1704.3, 1: 1729.6. Samples: 13584800. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 05:13:51,052][59242] Avg episode reward: [(0, '25.270'), (1, '23.450')] +[2023-10-09 05:13:51,323][60144] Updated weights for policy 1, policy_version 26682 (0.0007) +[2023-10-09 05:13:51,375][60143] Updated weights for policy 0, policy_version 26382 (0.0008) +[2023-10-09 05:13:51,755][60143] Updated weights for policy 0, policy_version 26392 (0.0008) +[2023-10-09 05:13:55,112][60144] Updated weights for policy 1, policy_version 26692 (0.0007) +[2023-10-09 05:13:55,479][60144] Updated weights for policy 1, policy_version 26702 (0.0008) +[2023-10-09 05:13:55,742][60143] Updated weights for policy 0, policy_version 26402 (0.0009) +[2023-10-09 05:13:55,840][60144] Updated weights for policy 1, policy_version 26712 (0.0009) +[2023-10-09 05:13:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 54362112. 
Throughput: 0: 1709.5, 1: 1738.1. Samples: 13605944. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 05:13:56,053][59242] Avg episode reward: [(0, '26.270'), (1, '24.220')] +[2023-10-09 05:13:56,121][60143] Updated weights for policy 0, policy_version 26412 (0.0009) +[2023-10-09 05:13:56,490][60143] Updated weights for policy 0, policy_version 26422 (0.0010) +[2023-10-09 05:13:56,860][60143] Updated weights for policy 0, policy_version 26432 (0.0010) +[2023-10-09 05:13:59,797][60144] Updated weights for policy 1, policy_version 26722 (0.0008) +[2023-10-09 05:14:00,164][60144] Updated weights for policy 1, policy_version 26732 (0.0009) +[2023-10-09 05:14:00,525][60144] Updated weights for policy 1, policy_version 26742 (0.0009) +[2023-10-09 05:14:00,804][60143] Updated weights for policy 0, policy_version 26442 (0.0008) +[2023-10-09 05:14:00,890][60144] Updated weights for policy 1, policy_version 26752 (0.0009) +[2023-10-09 05:14:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 54460416. Throughput: 0: 1712.3, 1: 1716.1. Samples: 13626358. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-09 05:14:01,052][59242] Avg episode reward: [(0, '27.040'), (1, '24.870')] +[2023-10-09 05:14:01,172][60143] Updated weights for policy 0, policy_version 26452 (0.0008) +[2023-10-09 05:14:01,540][60143] Updated weights for policy 0, policy_version 26462 (0.0008) +[2023-10-09 05:14:04,941][60144] Updated weights for policy 1, policy_version 26762 (0.0008) +[2023-10-09 05:14:05,311][60144] Updated weights for policy 1, policy_version 26772 (0.0007) +[2023-10-09 05:14:05,519][60143] Updated weights for policy 0, policy_version 26472 (0.0008) +[2023-10-09 05:14:05,677][60144] Updated weights for policy 1, policy_version 26782 (0.0009) +[2023-10-09 05:14:05,886][60143] Updated weights for policy 0, policy_version 26482 (0.0009) +[2023-10-09 05:14:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 54525952. Throughput: 0: 1713.4, 1: 1734.0. Samples: 13636408. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:14:06,052][59242] Avg episode reward: [(0, '27.090'), (1, '25.460')] +[2023-10-09 05:14:06,255][60143] Updated weights for policy 0, policy_version 26492 (0.0008) +[2023-10-09 05:14:09,585][60144] Updated weights for policy 1, policy_version 26792 (0.0007) +[2023-10-09 05:14:09,958][60144] Updated weights for policy 1, policy_version 26802 (0.0007) +[2023-10-09 05:14:10,283][60143] Updated weights for policy 0, policy_version 26502 (0.0007) +[2023-10-09 05:14:10,317][60144] Updated weights for policy 1, policy_version 26812 (0.0009) +[2023-10-09 05:14:10,649][60143] Updated weights for policy 0, policy_version 26512 (0.0010) +[2023-10-09 05:14:11,018][60143] Updated weights for policy 0, policy_version 26522 (0.0009) +[2023-10-09 05:14:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 54591488. Throughput: 0: 1714.8, 1: 1724.1. Samples: 13657230. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:14:11,053][59242] Avg episode reward: [(0, '27.610'), (1, '26.520')] +[2023-10-09 05:14:14,316][60144] Updated weights for policy 1, policy_version 26822 (0.0011) +[2023-10-09 05:14:14,678][60144] Updated weights for policy 1, policy_version 26832 (0.0010) +[2023-10-09 05:14:14,874][60143] Updated weights for policy 0, policy_version 26532 (0.0008) +[2023-10-09 05:14:15,048][60144] Updated weights for policy 1, policy_version 26842 (0.0008) +[2023-10-09 05:14:15,239][60143] Updated weights for policy 0, policy_version 26542 (0.0008) +[2023-10-09 05:14:15,605][60143] Updated weights for policy 0, policy_version 26552 (0.0008) +[2023-10-09 05:14:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 54689792. Throughput: 0: 1698.7, 1: 1699.2. Samples: 13676664. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:14:16,053][59242] Avg episode reward: [(0, '29.300'), (1, '25.370')] +[2023-10-09 05:14:16,066][59934] Saving new best policy, reward=29.300! +[2023-10-09 05:14:19,057][60144] Updated weights for policy 1, policy_version 26852 (0.0009) +[2023-10-09 05:14:19,423][60144] Updated weights for policy 1, policy_version 26862 (0.0008) +[2023-10-09 05:14:19,575][60143] Updated weights for policy 0, policy_version 26562 (0.0007) +[2023-10-09 05:14:19,796][60144] Updated weights for policy 1, policy_version 26872 (0.0007) +[2023-10-09 05:14:19,955][60143] Updated weights for policy 0, policy_version 26572 (0.0009) +[2023-10-09 05:14:20,322][60143] Updated weights for policy 0, policy_version 26582 (0.0009) +[2023-10-09 05:14:20,700][60143] Updated weights for policy 0, policy_version 26592 (0.0008) +[2023-10-09 05:14:21,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 54755328. Throughput: 0: 1719.5, 1: 1726.3. Samples: 13688090. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:14:21,053][59242] Avg episode reward: [(0, '28.900'), (1, '26.120')] +[2023-10-09 05:14:23,871][60144] Updated weights for policy 1, policy_version 26882 (0.0007) +[2023-10-09 05:14:24,247][60144] Updated weights for policy 1, policy_version 26892 (0.0009) +[2023-10-09 05:14:24,613][60144] Updated weights for policy 1, policy_version 26902 (0.0007) +[2023-10-09 05:14:24,681][60143] Updated weights for policy 0, policy_version 26602 (0.0008) +[2023-10-09 05:14:24,980][60144] Updated weights for policy 1, policy_version 26912 (0.0007) +[2023-10-09 05:14:25,052][60143] Updated weights for policy 0, policy_version 26612 (0.0007) +[2023-10-09 05:14:25,418][60143] Updated weights for policy 0, policy_version 26622 (0.0008) +[2023-10-09 05:14:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 54820864. Throughput: 0: 1717.1, 1: 1704.7. Samples: 13708248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:14:26,053][59242] Avg episode reward: [(0, '28.240'), (1, '27.060')] +[2023-10-09 05:14:28,970][60144] Updated weights for policy 1, policy_version 26922 (0.0010) +[2023-10-09 05:14:29,337][60144] Updated weights for policy 1, policy_version 26932 (0.0007) +[2023-10-09 05:14:29,460][60143] Updated weights for policy 0, policy_version 26632 (0.0007) +[2023-10-09 05:14:29,711][60144] Updated weights for policy 1, policy_version 26942 (0.0007) +[2023-10-09 05:14:29,839][60143] Updated weights for policy 0, policy_version 26642 (0.0007) +[2023-10-09 05:14:30,214][60143] Updated weights for policy 0, policy_version 26652 (0.0007) +[2023-10-09 05:14:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 54886400. Throughput: 0: 1688.3, 1: 1691.6. Samples: 13727606. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:14:31,053][59242] Avg episode reward: [(0, '28.240'), (1, '26.270')] +[2023-10-09 05:14:33,700][60144] Updated weights for policy 1, policy_version 26952 (0.0009) +[2023-10-09 05:14:34,055][60144] Updated weights for policy 1, policy_version 26962 (0.0009) +[2023-10-09 05:14:34,124][60143] Updated weights for policy 0, policy_version 26662 (0.0009) +[2023-10-09 05:14:34,425][60144] Updated weights for policy 1, policy_version 26972 (0.0008) +[2023-10-09 05:14:34,493][60143] Updated weights for policy 0, policy_version 26672 (0.0009) +[2023-10-09 05:14:34,864][60143] Updated weights for policy 0, policy_version 26682 (0.0008) +[2023-10-09 05:14:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 54951936. Throughput: 0: 1718.0, 1: 1715.6. Samples: 13739316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:14:36,053][59242] Avg episode reward: [(0, '27.780'), (1, '26.380')] +[2023-10-09 05:14:38,328][60144] Updated weights for policy 1, policy_version 26982 (0.0008) +[2023-10-09 05:14:38,694][60144] Updated weights for policy 1, policy_version 26992 (0.0007) +[2023-10-09 05:14:39,035][60143] Updated weights for policy 0, policy_version 26692 (0.0008) +[2023-10-09 05:14:39,055][60144] Updated weights for policy 1, policy_version 27002 (0.0007) +[2023-10-09 05:14:39,397][60143] Updated weights for policy 0, policy_version 26702 (0.0009) +[2023-10-09 05:14:39,770][60143] Updated weights for policy 0, policy_version 26712 (0.0008) +[2023-10-09 05:14:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 55017472. Throughput: 0: 1703.6, 1: 1691.0. Samples: 13758702. 
Policy #0 lag: (min: 24.0, avg: 48.0, max: 56.0) +[2023-10-09 05:14:41,052][59242] Avg episode reward: [(0, '27.750'), (1, '27.050')] +[2023-10-09 05:14:43,054][60144] Updated weights for policy 1, policy_version 27012 (0.0007) +[2023-10-09 05:14:43,442][60144] Updated weights for policy 1, policy_version 27022 (0.0008) +[2023-10-09 05:14:43,757][60143] Updated weights for policy 0, policy_version 26722 (0.0008) +[2023-10-09 05:14:43,798][60144] Updated weights for policy 1, policy_version 27032 (0.0009) +[2023-10-09 05:14:44,125][60143] Updated weights for policy 0, policy_version 26732 (0.0010) +[2023-10-09 05:14:44,495][60143] Updated weights for policy 0, policy_version 26742 (0.0009) +[2023-10-09 05:14:44,875][60143] Updated weights for policy 0, policy_version 26752 (0.0008) +[2023-10-09 05:14:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 55083008. Throughput: 0: 1685.6, 1: 1711.2. Samples: 13779216. Policy #0 lag: (min: 24.0, avg: 48.0, max: 56.0) +[2023-10-09 05:14:46,053][59242] Avg episode reward: [(0, '28.610'), (1, '27.360')] +[2023-10-09 05:14:47,572][60144] Updated weights for policy 1, policy_version 27042 (0.0009) +[2023-10-09 05:14:47,942][60144] Updated weights for policy 1, policy_version 27052 (0.0010) +[2023-10-09 05:14:48,307][60144] Updated weights for policy 1, policy_version 27062 (0.0009) +[2023-10-09 05:14:48,680][60144] Updated weights for policy 1, policy_version 27072 (0.0008) +[2023-10-09 05:14:48,913][60143] Updated weights for policy 0, policy_version 26762 (0.0009) +[2023-10-09 05:14:49,294][60143] Updated weights for policy 0, policy_version 26772 (0.0009) +[2023-10-09 05:14:49,668][60143] Updated weights for policy 0, policy_version 26782 (0.0010) +[2023-10-09 05:14:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 55148544. Throughput: 0: 1716.0, 1: 1699.2. Samples: 13790092. 
Policy #0 lag: (min: 24.0, avg: 48.0, max: 56.0) +[2023-10-09 05:14:51,053][59242] Avg episode reward: [(0, '27.820'), (1, '27.010')] +[2023-10-09 05:14:52,626][60144] Updated weights for policy 1, policy_version 27082 (0.0008) +[2023-10-09 05:14:53,001][60144] Updated weights for policy 1, policy_version 27092 (0.0007) +[2023-10-09 05:14:53,372][60144] Updated weights for policy 1, policy_version 27102 (0.0008) +[2023-10-09 05:14:53,691][60143] Updated weights for policy 0, policy_version 26792 (0.0008) +[2023-10-09 05:14:54,061][60143] Updated weights for policy 0, policy_version 26802 (0.0008) +[2023-10-09 05:14:54,442][60143] Updated weights for policy 0, policy_version 26812 (0.0008) +[2023-10-09 05:14:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 55214080. Throughput: 0: 1690.3, 1: 1703.4. Samples: 13809946. Policy #0 lag: (min: 24.0, avg: 48.0, max: 56.0) +[2023-10-09 05:14:56,053][59242] Avg episode reward: [(0, '28.000'), (1, '27.740')] +[2023-10-09 05:14:57,395][60144] Updated weights for policy 1, policy_version 27112 (0.0007) +[2023-10-09 05:14:57,769][60144] Updated weights for policy 1, policy_version 27122 (0.0007) +[2023-10-09 05:14:58,139][60144] Updated weights for policy 1, policy_version 27132 (0.0010) +[2023-10-09 05:14:58,353][60143] Updated weights for policy 0, policy_version 26822 (0.0007) +[2023-10-09 05:14:58,726][60143] Updated weights for policy 0, policy_version 26832 (0.0008) +[2023-10-09 05:14:59,095][60143] Updated weights for policy 0, policy_version 26842 (0.0009) +[2023-10-09 05:15:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 55279616. Throughput: 0: 1700.3, 1: 1733.0. Samples: 13831160. 
Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:15:01,053][59242] Avg episode reward: [(0, '27.450'), (1, '26.980')] +[2023-10-09 05:15:02,106][60144] Updated weights for policy 1, policy_version 27142 (0.0008) +[2023-10-09 05:15:02,474][60144] Updated weights for policy 1, policy_version 27152 (0.0009) +[2023-10-09 05:15:02,841][60144] Updated weights for policy 1, policy_version 27162 (0.0010) +[2023-10-09 05:15:03,120][60143] Updated weights for policy 0, policy_version 26852 (0.0010) +[2023-10-09 05:15:03,501][60143] Updated weights for policy 0, policy_version 26862 (0.0011) +[2023-10-09 05:15:03,868][60143] Updated weights for policy 0, policy_version 26872 (0.0010) +[2023-10-09 05:15:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 55345152. Throughput: 0: 1698.4, 1: 1704.1. Samples: 13841204. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:15:06,053][59242] Avg episode reward: [(0, '27.840'), (1, '27.930')] +[2023-10-09 05:15:06,927][60144] Updated weights for policy 1, policy_version 27172 (0.0009) +[2023-10-09 05:15:07,294][60144] Updated weights for policy 1, policy_version 27182 (0.0009) +[2023-10-09 05:15:07,662][60144] Updated weights for policy 1, policy_version 27192 (0.0008) +[2023-10-09 05:15:07,907][60143] Updated weights for policy 0, policy_version 26882 (0.0010) +[2023-10-09 05:15:08,269][60143] Updated weights for policy 0, policy_version 26892 (0.0011) +[2023-10-09 05:15:08,644][60143] Updated weights for policy 0, policy_version 26902 (0.0008) +[2023-10-09 05:15:09,015][60143] Updated weights for policy 0, policy_version 26912 (0.0008) +[2023-10-09 05:15:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 55410688. Throughput: 0: 1684.8, 1: 1722.9. Samples: 13861594. 
Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:15:11,053][59242] Avg episode reward: [(0, '26.960'), (1, '27.940')] +[2023-10-09 05:15:11,689][60144] Updated weights for policy 1, policy_version 27202 (0.0009) +[2023-10-09 05:15:12,063][60144] Updated weights for policy 1, policy_version 27212 (0.0009) +[2023-10-09 05:15:12,430][60144] Updated weights for policy 1, policy_version 27222 (0.0008) +[2023-10-09 05:15:12,798][60144] Updated weights for policy 1, policy_version 27232 (0.0007) +[2023-10-09 05:15:12,986][60143] Updated weights for policy 0, policy_version 26922 (0.0009) +[2023-10-09 05:15:13,353][60143] Updated weights for policy 0, policy_version 26932 (0.0008) +[2023-10-09 05:15:13,733][60143] Updated weights for policy 0, policy_version 26942 (0.0007) +[2023-10-09 05:15:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 55476224. Throughput: 0: 1714.5, 1: 1730.7. Samples: 13882640. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:15:16,052][59242] Avg episode reward: [(0, '25.020'), (1, '25.840')] +[2023-10-09 05:15:16,790][60144] Updated weights for policy 1, policy_version 27242 (0.0010) +[2023-10-09 05:15:17,145][60144] Updated weights for policy 1, policy_version 27252 (0.0007) +[2023-10-09 05:15:17,504][60144] Updated weights for policy 1, policy_version 27262 (0.0007) +[2023-10-09 05:15:17,683][60143] Updated weights for policy 0, policy_version 26952 (0.0009) +[2023-10-09 05:15:18,062][60143] Updated weights for policy 0, policy_version 26962 (0.0007) +[2023-10-09 05:15:18,433][60143] Updated weights for policy 0, policy_version 26972 (0.0010) +[2023-10-09 05:15:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 55541760. Throughput: 0: 1686.4, 1: 1707.3. Samples: 13892032. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 05:15:21,053][59242] Avg episode reward: [(0, '25.170'), (1, '26.230')] +[2023-10-09 05:15:21,518][60144] Updated weights for policy 1, policy_version 27272 (0.0009) +[2023-10-09 05:15:21,877][60144] Updated weights for policy 1, policy_version 27282 (0.0009) +[2023-10-09 05:15:22,243][60144] Updated weights for policy 1, policy_version 27292 (0.0010) +[2023-10-09 05:15:22,346][60143] Updated weights for policy 0, policy_version 26982 (0.0008) +[2023-10-09 05:15:22,715][60143] Updated weights for policy 0, policy_version 26992 (0.0009) +[2023-10-09 05:15:23,095][60143] Updated weights for policy 0, policy_version 27002 (0.0007) +[2023-10-09 05:15:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 55607296. Throughput: 0: 1697.2, 1: 1734.9. Samples: 13913148. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 05:15:26,052][59242] Avg episode reward: [(0, '24.160'), (1, '26.020')] +[2023-10-09 05:15:26,164][60144] Updated weights for policy 1, policy_version 27302 (0.0010) +[2023-10-09 05:15:26,544][60144] Updated weights for policy 1, policy_version 27312 (0.0008) +[2023-10-09 05:15:26,907][60144] Updated weights for policy 1, policy_version 27322 (0.0007) +[2023-10-09 05:15:26,995][60143] Updated weights for policy 0, policy_version 27012 (0.0007) +[2023-10-09 05:15:27,378][60143] Updated weights for policy 0, policy_version 27022 (0.0007) +[2023-10-09 05:15:27,744][60143] Updated weights for policy 0, policy_version 27032 (0.0009) +[2023-10-09 05:15:30,885][60144] Updated weights for policy 1, policy_version 27332 (0.0009) +[2023-10-09 05:15:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 55672832. Throughput: 0: 1714.3, 1: 1731.0. Samples: 13934254. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 05:15:31,053][59242] Avg episode reward: [(0, '23.520'), (1, '24.970')] +[2023-10-09 05:15:31,059][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000027040_27688960.pth... +[2023-10-09 05:15:31,088][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000025472_26083328.pth +[2023-10-09 05:15:31,280][60144] Updated weights for policy 1, policy_version 27342 (0.0010) +[2023-10-09 05:15:31,646][60144] Updated weights for policy 1, policy_version 27352 (0.0009) +[2023-10-09 05:15:31,816][60143] Updated weights for policy 0, policy_version 27042 (0.0008) +[2023-10-09 05:15:31,935][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth... +[2023-10-09 05:15:31,965][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000025728_26345472.pth +[2023-10-09 05:15:32,185][60143] Updated weights for policy 0, policy_version 27052 (0.0008) +[2023-10-09 05:15:32,565][60143] Updated weights for policy 0, policy_version 27062 (0.0007) +[2023-10-09 05:15:32,934][60143] Updated weights for policy 0, policy_version 27072 (0.0008) +[2023-10-09 05:15:35,565][60144] Updated weights for policy 1, policy_version 27362 (0.0008) +[2023-10-09 05:15:35,935][60144] Updated weights for policy 1, policy_version 27372 (0.0008) +[2023-10-09 05:15:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 55738368. Throughput: 0: 1680.5, 1: 1723.9. Samples: 13943290. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 05:15:36,053][59242] Avg episode reward: [(0, '24.450'), (1, '27.270')] +[2023-10-09 05:15:36,298][60144] Updated weights for policy 1, policy_version 27382 (0.0010) +[2023-10-09 05:15:36,661][60144] Updated weights for policy 1, policy_version 27392 (0.0008) +[2023-10-09 05:15:37,005][60143] Updated weights for policy 0, policy_version 27082 (0.0007) +[2023-10-09 05:15:37,381][60143] Updated weights for policy 0, policy_version 27092 (0.0008) +[2023-10-09 05:15:37,745][60143] Updated weights for policy 0, policy_version 27102 (0.0009) +[2023-10-09 05:15:40,551][60144] Updated weights for policy 1, policy_version 27402 (0.0009) +[2023-10-09 05:15:40,928][60144] Updated weights for policy 1, policy_version 27412 (0.0010) +[2023-10-09 05:15:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 55803904. Throughput: 0: 1704.5, 1: 1733.7. Samples: 13964666. Policy #0 lag: (min: 8.0, avg: 16.8, max: 40.0) +[2023-10-09 05:15:41,052][59242] Avg episode reward: [(0, '25.110'), (1, '26.920')] +[2023-10-09 05:15:41,291][60144] Updated weights for policy 1, policy_version 27422 (0.0008) +[2023-10-09 05:15:41,900][60143] Updated weights for policy 0, policy_version 27112 (0.0008) +[2023-10-09 05:15:42,281][60143] Updated weights for policy 0, policy_version 27122 (0.0008) +[2023-10-09 05:15:42,647][60143] Updated weights for policy 0, policy_version 27132 (0.0009) +[2023-10-09 05:15:45,168][60144] Updated weights for policy 1, policy_version 27432 (0.0007) +[2023-10-09 05:15:45,533][60144] Updated weights for policy 1, policy_version 27442 (0.0011) +[2023-10-09 05:15:45,896][60144] Updated weights for policy 1, policy_version 27452 (0.0009) +[2023-10-09 05:15:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 55902208. Throughput: 0: 1711.5, 1: 1713.4. Samples: 13985280. 
Policy #0 lag: (min: 8.0, avg: 16.8, max: 40.0) +[2023-10-09 05:15:46,053][59242] Avg episode reward: [(0, '24.780'), (1, '26.260')] +[2023-10-09 05:15:46,541][60143] Updated weights for policy 0, policy_version 27142 (0.0008) +[2023-10-09 05:15:46,916][60143] Updated weights for policy 0, policy_version 27152 (0.0009) +[2023-10-09 05:15:47,281][60143] Updated weights for policy 0, policy_version 27162 (0.0009) +[2023-10-09 05:15:49,910][60144] Updated weights for policy 1, policy_version 27462 (0.0007) +[2023-10-09 05:15:50,281][60144] Updated weights for policy 1, policy_version 27472 (0.0011) +[2023-10-09 05:15:50,657][60144] Updated weights for policy 1, policy_version 27482 (0.0010) +[2023-10-09 05:15:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 55967744. Throughput: 0: 1695.2, 1: 1730.7. Samples: 13995370. Policy #0 lag: (min: 8.0, avg: 16.8, max: 40.0) +[2023-10-09 05:15:51,053][59242] Avg episode reward: [(0, '24.860'), (1, '26.720')] +[2023-10-09 05:15:51,242][60143] Updated weights for policy 0, policy_version 27172 (0.0007) +[2023-10-09 05:15:51,620][60143] Updated weights for policy 0, policy_version 27182 (0.0008) +[2023-10-09 05:15:51,982][60143] Updated weights for policy 0, policy_version 27192 (0.0009) +[2023-10-09 05:15:54,505][60144] Updated weights for policy 1, policy_version 27492 (0.0008) +[2023-10-09 05:15:54,871][60144] Updated weights for policy 1, policy_version 27502 (0.0007) +[2023-10-09 05:15:55,244][60144] Updated weights for policy 1, policy_version 27512 (0.0009) +[2023-10-09 05:15:55,906][60143] Updated weights for policy 0, policy_version 27202 (0.0010) +[2023-10-09 05:15:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 56033280. Throughput: 0: 1709.5, 1: 1727.7. Samples: 14016268. 
Policy #0 lag: (min: 8.0, avg: 16.8, max: 40.0) +[2023-10-09 05:15:56,052][59242] Avg episode reward: [(0, '25.170'), (1, '26.730')] +[2023-10-09 05:15:56,281][60143] Updated weights for policy 0, policy_version 27212 (0.0009) +[2023-10-09 05:15:56,655][60143] Updated weights for policy 0, policy_version 27222 (0.0007) +[2023-10-09 05:15:57,022][60143] Updated weights for policy 0, policy_version 27232 (0.0009) +[2023-10-09 05:15:59,272][60144] Updated weights for policy 1, policy_version 27522 (0.0008) +[2023-10-09 05:15:59,643][60144] Updated weights for policy 1, policy_version 27532 (0.0007) +[2023-10-09 05:16:00,013][60144] Updated weights for policy 1, policy_version 27542 (0.0007) +[2023-10-09 05:16:00,380][60144] Updated weights for policy 1, policy_version 27552 (0.0009) +[2023-10-09 05:16:00,899][60143] Updated weights for policy 0, policy_version 27242 (0.0008) +[2023-10-09 05:16:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 56098816. Throughput: 0: 1709.9, 1: 1708.2. Samples: 14036454. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 05:16:01,053][59242] Avg episode reward: [(0, '24.500'), (1, '27.220')] +[2023-10-09 05:16:01,267][60143] Updated weights for policy 0, policy_version 27252 (0.0007) +[2023-10-09 05:16:01,638][60143] Updated weights for policy 0, policy_version 27262 (0.0009) +[2023-10-09 05:16:04,234][60144] Updated weights for policy 1, policy_version 27562 (0.0009) +[2023-10-09 05:16:04,606][60144] Updated weights for policy 1, policy_version 27572 (0.0010) +[2023-10-09 05:16:04,964][60144] Updated weights for policy 1, policy_version 27582 (0.0011) +[2023-10-09 05:16:05,740][60143] Updated weights for policy 0, policy_version 27272 (0.0009) +[2023-10-09 05:16:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 56164352. Throughput: 0: 1707.2, 1: 1739.2. Samples: 14047122. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 05:16:06,053][59242] Avg episode reward: [(0, '24.900'), (1, '27.980')] +[2023-10-09 05:16:06,116][60143] Updated weights for policy 0, policy_version 27282 (0.0010) +[2023-10-09 05:16:06,493][60143] Updated weights for policy 0, policy_version 27292 (0.0009) +[2023-10-09 05:16:08,911][60144] Updated weights for policy 1, policy_version 27592 (0.0009) +[2023-10-09 05:16:09,282][60144] Updated weights for policy 1, policy_version 27602 (0.0010) +[2023-10-09 05:16:09,644][60144] Updated weights for policy 1, policy_version 27612 (0.0009) +[2023-10-09 05:16:10,446][60143] Updated weights for policy 0, policy_version 27302 (0.0008) +[2023-10-09 05:16:10,817][60143] Updated weights for policy 0, policy_version 27312 (0.0010) +[2023-10-09 05:16:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 56229888. Throughput: 0: 1714.1, 1: 1712.4. Samples: 14067342. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 05:16:11,053][59242] Avg episode reward: [(0, '26.350'), (1, '28.660')] +[2023-10-09 05:16:11,196][60143] Updated weights for policy 0, policy_version 27322 (0.0008) +[2023-10-09 05:16:13,473][60144] Updated weights for policy 1, policy_version 27622 (0.0010) +[2023-10-09 05:16:13,835][60144] Updated weights for policy 1, policy_version 27632 (0.0009) +[2023-10-09 05:16:14,198][60144] Updated weights for policy 1, policy_version 27642 (0.0010) +[2023-10-09 05:16:15,384][60143] Updated weights for policy 0, policy_version 27332 (0.0010) +[2023-10-09 05:16:15,758][60143] Updated weights for policy 0, policy_version 27342 (0.0009) +[2023-10-09 05:16:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 56295424. Throughput: 0: 1703.6, 1: 1709.8. Samples: 14087854. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-09 05:16:16,053][59242] Avg episode reward: [(0, '26.180'), (1, '28.870')] +[2023-10-09 05:16:16,130][60143] Updated weights for policy 0, policy_version 27352 (0.0007) +[2023-10-09 05:16:18,137][60144] Updated weights for policy 1, policy_version 27652 (0.0008) +[2023-10-09 05:16:18,529][60144] Updated weights for policy 1, policy_version 27662 (0.0009) +[2023-10-09 05:16:18,896][60144] Updated weights for policy 1, policy_version 27672 (0.0010) +[2023-10-09 05:16:20,056][60143] Updated weights for policy 0, policy_version 27362 (0.0007) +[2023-10-09 05:16:20,423][60143] Updated weights for policy 0, policy_version 27372 (0.0008) +[2023-10-09 05:16:20,790][60143] Updated weights for policy 0, policy_version 27382 (0.0007) +[2023-10-09 05:16:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 56360960. Throughput: 0: 1715.5, 1: 1731.1. Samples: 14098386. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-09 05:16:21,052][59242] Avg episode reward: [(0, '25.510'), (1, '28.220')] +[2023-10-09 05:16:21,158][60143] Updated weights for policy 0, policy_version 27392 (0.0008) +[2023-10-09 05:16:22,827][60144] Updated weights for policy 1, policy_version 27682 (0.0008) +[2023-10-09 05:16:23,199][60144] Updated weights for policy 1, policy_version 27692 (0.0007) +[2023-10-09 05:16:23,562][60144] Updated weights for policy 1, policy_version 27702 (0.0007) +[2023-10-09 05:16:23,931][60144] Updated weights for policy 1, policy_version 27712 (0.0009) +[2023-10-09 05:16:24,950][60143] Updated weights for policy 0, policy_version 27402 (0.0007) +[2023-10-09 05:16:25,308][60143] Updated weights for policy 0, policy_version 27412 (0.0008) +[2023-10-09 05:16:25,684][60143] Updated weights for policy 0, policy_version 27422 (0.0007) +[2023-10-09 05:16:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 56459264. 
Throughput: 0: 1722.0, 1: 1711.4. Samples: 14119170. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-09 05:16:26,053][59242] Avg episode reward: [(0, '26.290'), (1, '27.060')] +[2023-10-09 05:16:27,779][60144] Updated weights for policy 1, policy_version 27722 (0.0010) +[2023-10-09 05:16:28,143][60144] Updated weights for policy 1, policy_version 27732 (0.0010) +[2023-10-09 05:16:28,499][60144] Updated weights for policy 1, policy_version 27742 (0.0007) +[2023-10-09 05:16:29,402][60143] Updated weights for policy 0, policy_version 27432 (0.0008) +[2023-10-09 05:16:29,773][60143] Updated weights for policy 0, policy_version 27442 (0.0008) +[2023-10-09 05:16:30,153][60143] Updated weights for policy 0, policy_version 27452 (0.0009) +[2023-10-09 05:16:31,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 56524800. Throughput: 0: 1696.7, 1: 1729.0. Samples: 14139438. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-09 05:16:31,053][59242] Avg episode reward: [(0, '26.500'), (1, '27.340')] +[2023-10-09 05:16:32,403][60144] Updated weights for policy 1, policy_version 27752 (0.0007) +[2023-10-09 05:16:32,770][60144] Updated weights for policy 1, policy_version 27762 (0.0008) +[2023-10-09 05:16:33,150][60144] Updated weights for policy 1, policy_version 27772 (0.0008) +[2023-10-09 05:16:33,907][60143] Updated weights for policy 0, policy_version 27462 (0.0007) +[2023-10-09 05:16:34,273][60143] Updated weights for policy 0, policy_version 27472 (0.0008) +[2023-10-09 05:16:34,636][60143] Updated weights for policy 0, policy_version 27482 (0.0007) +[2023-10-09 05:16:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 56590336. Throughput: 0: 1731.2, 1: 1713.8. Samples: 14150392. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-09 05:16:36,052][59242] Avg episode reward: [(0, '26.470'), (1, '27.320')] +[2023-10-09 05:16:37,163][60144] Updated weights for policy 1, policy_version 27782 (0.0008) +[2023-10-09 05:16:37,527][60144] Updated weights for policy 1, policy_version 27792 (0.0007) +[2023-10-09 05:16:37,898][60144] Updated weights for policy 1, policy_version 27802 (0.0007) +[2023-10-09 05:16:38,695][60143] Updated weights for policy 0, policy_version 27492 (0.0007) +[2023-10-09 05:16:39,077][60143] Updated weights for policy 0, policy_version 27502 (0.0007) +[2023-10-09 05:16:39,452][60143] Updated weights for policy 0, policy_version 27512 (0.0007) +[2023-10-09 05:16:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 56655872. Throughput: 0: 1712.0, 1: 1722.0. Samples: 14170802. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-09 05:16:41,053][59242] Avg episode reward: [(0, '26.520'), (1, '27.640')] +[2023-10-09 05:16:41,754][60144] Updated weights for policy 1, policy_version 27812 (0.0010) +[2023-10-09 05:16:42,117][60144] Updated weights for policy 1, policy_version 27822 (0.0008) +[2023-10-09 05:16:42,489][60144] Updated weights for policy 1, policy_version 27832 (0.0007) +[2023-10-09 05:16:43,404][60143] Updated weights for policy 0, policy_version 27522 (0.0008) +[2023-10-09 05:16:43,776][60143] Updated weights for policy 0, policy_version 27532 (0.0008) +[2023-10-09 05:16:44,142][60143] Updated weights for policy 0, policy_version 27542 (0.0007) +[2023-10-09 05:16:44,515][60143] Updated weights for policy 0, policy_version 27552 (0.0008) +[2023-10-09 05:16:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 56721408. Throughput: 0: 1705.0, 1: 1744.8. Samples: 14191696. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-09 05:16:46,053][59242] Avg episode reward: [(0, '26.480'), (1, '26.970')] +[2023-10-09 05:16:46,485][60144] Updated weights for policy 1, policy_version 27842 (0.0007) +[2023-10-09 05:16:46,845][60144] Updated weights for policy 1, policy_version 27852 (0.0008) +[2023-10-09 05:16:47,211][60144] Updated weights for policy 1, policy_version 27862 (0.0007) +[2023-10-09 05:16:47,578][60144] Updated weights for policy 1, policy_version 27872 (0.0007) +[2023-10-09 05:16:48,477][60143] Updated weights for policy 0, policy_version 27562 (0.0009) +[2023-10-09 05:16:48,852][60143] Updated weights for policy 0, policy_version 27572 (0.0008) +[2023-10-09 05:16:49,218][60143] Updated weights for policy 0, policy_version 27582 (0.0009) +[2023-10-09 05:16:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 56786944. Throughput: 0: 1728.8, 1: 1709.4. Samples: 14201844. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-09 05:16:51,053][59242] Avg episode reward: [(0, '27.660'), (1, '28.500')] +[2023-10-09 05:16:51,510][60144] Updated weights for policy 1, policy_version 27882 (0.0009) +[2023-10-09 05:16:51,870][60144] Updated weights for policy 1, policy_version 27892 (0.0009) +[2023-10-09 05:16:52,240][60144] Updated weights for policy 1, policy_version 27902 (0.0007) +[2023-10-09 05:16:53,272][60143] Updated weights for policy 0, policy_version 27592 (0.0007) +[2023-10-09 05:16:53,652][60143] Updated weights for policy 0, policy_version 27602 (0.0007) +[2023-10-09 05:16:54,032][60143] Updated weights for policy 0, policy_version 27612 (0.0010) +[2023-10-09 05:16:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 56852480. Throughput: 0: 1707.8, 1: 1738.0. Samples: 14222400. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-09 05:16:56,053][59242] Avg episode reward: [(0, '27.240'), (1, '27.350')] +[2023-10-09 05:16:56,158][60144] Updated weights for policy 1, policy_version 27912 (0.0008) +[2023-10-09 05:16:56,527][60144] Updated weights for policy 1, policy_version 27922 (0.0007) +[2023-10-09 05:16:56,891][60144] Updated weights for policy 1, policy_version 27932 (0.0010) +[2023-10-09 05:16:58,036][60143] Updated weights for policy 0, policy_version 27622 (0.0010) +[2023-10-09 05:16:58,412][60143] Updated weights for policy 0, policy_version 27632 (0.0011) +[2023-10-09 05:16:58,786][60143] Updated weights for policy 0, policy_version 27642 (0.0007) +[2023-10-09 05:17:00,813][60144] Updated weights for policy 1, policy_version 27942 (0.0008) +[2023-10-09 05:17:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 56918016. Throughput: 0: 1716.0, 1: 1745.2. Samples: 14243612. Policy #0 lag: (min: 18.0, avg: 18.3, max: 31.0) +[2023-10-09 05:17:01,053][59242] Avg episode reward: [(0, '27.740'), (1, '27.730')] +[2023-10-09 05:17:01,189][60144] Updated weights for policy 1, policy_version 27952 (0.0009) +[2023-10-09 05:17:01,563][60144] Updated weights for policy 1, policy_version 27962 (0.0008) +[2023-10-09 05:17:02,833][60143] Updated weights for policy 0, policy_version 27652 (0.0010) +[2023-10-09 05:17:03,203][60143] Updated weights for policy 0, policy_version 27662 (0.0007) +[2023-10-09 05:17:03,579][60143] Updated weights for policy 0, policy_version 27672 (0.0008) +[2023-10-09 05:17:05,598][60144] Updated weights for policy 1, policy_version 27972 (0.0008) +[2023-10-09 05:17:06,004][60144] Updated weights for policy 1, policy_version 27982 (0.0007) +[2023-10-09 05:17:06,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 56983552. Throughput: 0: 1715.8, 1: 1727.1. Samples: 14253316. 
Policy #0 lag: (min: 18.0, avg: 18.3, max: 31.0) +[2023-10-09 05:17:06,053][59242] Avg episode reward: [(0, '27.040'), (1, '28.550')] +[2023-10-09 05:17:06,370][60144] Updated weights for policy 1, policy_version 27992 (0.0007) +[2023-10-09 05:17:07,600][60143] Updated weights for policy 0, policy_version 27682 (0.0009) +[2023-10-09 05:17:07,974][60143] Updated weights for policy 0, policy_version 27692 (0.0008) +[2023-10-09 05:17:08,342][60143] Updated weights for policy 0, policy_version 27702 (0.0010) +[2023-10-09 05:17:08,715][60143] Updated weights for policy 0, policy_version 27712 (0.0008) +[2023-10-09 05:17:10,332][60144] Updated weights for policy 1, policy_version 28002 (0.0008) +[2023-10-09 05:17:10,695][60144] Updated weights for policy 1, policy_version 28012 (0.0008) +[2023-10-09 05:17:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57049088. Throughput: 0: 1699.6, 1: 1739.3. Samples: 14273922. Policy #0 lag: (min: 18.0, avg: 18.3, max: 31.0) +[2023-10-09 05:17:11,053][59242] Avg episode reward: [(0, '28.450'), (1, '28.020')] +[2023-10-09 05:17:11,062][60144] Updated weights for policy 1, policy_version 28022 (0.0007) +[2023-10-09 05:17:11,429][60144] Updated weights for policy 1, policy_version 28032 (0.0008) +[2023-10-09 05:17:12,797][60143] Updated weights for policy 0, policy_version 27722 (0.0008) +[2023-10-09 05:17:13,175][60143] Updated weights for policy 0, policy_version 27732 (0.0010) +[2023-10-09 05:17:13,535][60143] Updated weights for policy 0, policy_version 27742 (0.0008) +[2023-10-09 05:17:15,168][60144] Updated weights for policy 1, policy_version 28042 (0.0009) +[2023-10-09 05:17:15,533][60144] Updated weights for policy 1, policy_version 28052 (0.0011) +[2023-10-09 05:17:15,900][60144] Updated weights for policy 1, policy_version 28062 (0.0009) +[2023-10-09 05:17:16,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 57147392. 
Throughput: 0: 1719.3, 1: 1727.2. Samples: 14294530. Policy #0 lag: (min: 18.0, avg: 18.3, max: 31.0) +[2023-10-09 05:17:16,053][59242] Avg episode reward: [(0, '27.740'), (1, '28.030')] +[2023-10-09 05:17:17,485][60143] Updated weights for policy 0, policy_version 27752 (0.0010) +[2023-10-09 05:17:17,856][60143] Updated weights for policy 0, policy_version 27762 (0.0008) +[2023-10-09 05:17:18,221][60143] Updated weights for policy 0, policy_version 27772 (0.0009) +[2023-10-09 05:17:19,759][60144] Updated weights for policy 1, policy_version 28072 (0.0009) +[2023-10-09 05:17:20,121][60144] Updated weights for policy 1, policy_version 28082 (0.0011) +[2023-10-09 05:17:20,489][60144] Updated weights for policy 1, policy_version 28092 (0.0008) +[2023-10-09 05:17:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 57212928. Throughput: 0: 1682.0, 1: 1748.2. Samples: 14304752. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-09 05:17:21,052][59242] Avg episode reward: [(0, '27.810'), (1, '26.780')] +[2023-10-09 05:17:22,226][60143] Updated weights for policy 0, policy_version 27782 (0.0008) +[2023-10-09 05:17:22,593][60143] Updated weights for policy 0, policy_version 27792 (0.0007) +[2023-10-09 05:17:22,972][60143] Updated weights for policy 0, policy_version 27802 (0.0007) +[2023-10-09 05:17:24,319][60144] Updated weights for policy 1, policy_version 28102 (0.0009) +[2023-10-09 05:17:24,697][60144] Updated weights for policy 1, policy_version 28112 (0.0010) +[2023-10-09 05:17:25,065][60144] Updated weights for policy 1, policy_version 28122 (0.0008) +[2023-10-09 05:17:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57278464. Throughput: 0: 1704.4, 1: 1742.6. Samples: 14325918. 
Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-09 05:17:26,053][59242] Avg episode reward: [(0, '28.700'), (1, '26.160')] +[2023-10-09 05:17:26,853][60143] Updated weights for policy 0, policy_version 27812 (0.0008) +[2023-10-09 05:17:27,227][60143] Updated weights for policy 0, policy_version 27822 (0.0009) +[2023-10-09 05:17:27,600][60143] Updated weights for policy 0, policy_version 27832 (0.0009) +[2023-10-09 05:17:29,022][60144] Updated weights for policy 1, policy_version 28132 (0.0008) +[2023-10-09 05:17:29,387][60144] Updated weights for policy 1, policy_version 28142 (0.0008) +[2023-10-09 05:17:29,752][60144] Updated weights for policy 1, policy_version 28152 (0.0010) +[2023-10-09 05:17:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 57344000. Throughput: 0: 1712.1, 1: 1723.6. Samples: 14346304. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-09 05:17:31,053][59242] Avg episode reward: [(0, '27.440'), (1, '27.300')] +[2023-10-09 05:17:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000028160_28835840.pth... +[2023-10-09 05:17:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000027840_28508160.pth... 
+[2023-10-09 05:17:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000026528_27164672.pth +[2023-10-09 05:17:31,101][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000026240_26869760.pth +[2023-10-09 05:17:31,605][60143] Updated weights for policy 0, policy_version 27842 (0.0008) +[2023-10-09 05:17:31,979][60143] Updated weights for policy 0, policy_version 27852 (0.0007) +[2023-10-09 05:17:32,347][60143] Updated weights for policy 0, policy_version 27862 (0.0007) +[2023-10-09 05:17:32,719][60143] Updated weights for policy 0, policy_version 27872 (0.0009) +[2023-10-09 05:17:33,622][60144] Updated weights for policy 1, policy_version 28162 (0.0011) +[2023-10-09 05:17:33,983][60144] Updated weights for policy 1, policy_version 28172 (0.0009) +[2023-10-09 05:17:34,353][60144] Updated weights for policy 1, policy_version 28182 (0.0009) +[2023-10-09 05:17:34,713][60144] Updated weights for policy 1, policy_version 28192 (0.0009) +[2023-10-09 05:17:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57409536. Throughput: 0: 1688.0, 1: 1754.6. Samples: 14356762. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-09 05:17:36,053][59242] Avg episode reward: [(0, '27.470'), (1, '27.500')] +[2023-10-09 05:17:36,678][60143] Updated weights for policy 0, policy_version 27882 (0.0008) +[2023-10-09 05:17:37,045][60143] Updated weights for policy 0, policy_version 27892 (0.0008) +[2023-10-09 05:17:37,413][60143] Updated weights for policy 0, policy_version 27902 (0.0007) +[2023-10-09 05:17:38,658][60144] Updated weights for policy 1, policy_version 28202 (0.0009) +[2023-10-09 05:17:39,022][60144] Updated weights for policy 1, policy_version 28212 (0.0009) +[2023-10-09 05:17:39,395][60144] Updated weights for policy 1, policy_version 28222 (0.0009) +[2023-10-09 05:17:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). 
Total num frames: 57475072. Throughput: 0: 1710.8, 1: 1717.9. Samples: 14376690. Policy #0 lag: (min: 25.0, avg: 28.4, max: 57.0) +[2023-10-09 05:17:41,053][59242] Avg episode reward: [(0, '28.450'), (1, '28.170')] +[2023-10-09 05:17:41,512][60143] Updated weights for policy 0, policy_version 27912 (0.0007) +[2023-10-09 05:17:41,892][60143] Updated weights for policy 0, policy_version 27922 (0.0009) +[2023-10-09 05:17:42,257][60143] Updated weights for policy 0, policy_version 27932 (0.0008) +[2023-10-09 05:17:43,266][60144] Updated weights for policy 1, policy_version 28232 (0.0008) +[2023-10-09 05:17:43,633][60144] Updated weights for policy 1, policy_version 28242 (0.0011) +[2023-10-09 05:17:44,004][60144] Updated weights for policy 1, policy_version 28252 (0.0007) +[2023-10-09 05:17:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57540608. Throughput: 0: 1708.0, 1: 1722.4. Samples: 14397980. Policy #0 lag: (min: 25.0, avg: 28.4, max: 57.0) +[2023-10-09 05:17:46,053][59242] Avg episode reward: [(0, '28.240'), (1, '28.380')] +[2023-10-09 05:17:46,134][60143] Updated weights for policy 0, policy_version 27942 (0.0008) +[2023-10-09 05:17:46,507][60143] Updated weights for policy 0, policy_version 27952 (0.0007) +[2023-10-09 05:17:46,871][60143] Updated weights for policy 0, policy_version 27962 (0.0007) +[2023-10-09 05:17:47,868][60144] Updated weights for policy 1, policy_version 28262 (0.0007) +[2023-10-09 05:17:48,232][60144] Updated weights for policy 1, policy_version 28272 (0.0008) +[2023-10-09 05:17:48,608][60144] Updated weights for policy 1, policy_version 28282 (0.0007) +[2023-10-09 05:17:50,909][60143] Updated weights for policy 0, policy_version 27972 (0.0007) +[2023-10-09 05:17:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57606144. Throughput: 0: 1697.6, 1: 1735.1. Samples: 14407788. 
Policy #0 lag: (min: 25.0, avg: 28.4, max: 57.0) +[2023-10-09 05:17:51,053][59242] Avg episode reward: [(0, '27.850'), (1, '28.030')] +[2023-10-09 05:17:51,274][60143] Updated weights for policy 0, policy_version 27982 (0.0009) +[2023-10-09 05:17:51,649][60143] Updated weights for policy 0, policy_version 27992 (0.0009) +[2023-10-09 05:17:52,645][60144] Updated weights for policy 1, policy_version 28292 (0.0007) +[2023-10-09 05:17:53,058][60144] Updated weights for policy 1, policy_version 28302 (0.0007) +[2023-10-09 05:17:53,427][60144] Updated weights for policy 1, policy_version 28312 (0.0007) +[2023-10-09 05:17:55,645][60143] Updated weights for policy 0, policy_version 28002 (0.0008) +[2023-10-09 05:17:56,019][60143] Updated weights for policy 0, policy_version 28012 (0.0008) +[2023-10-09 05:17:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57671680. Throughput: 0: 1709.2, 1: 1729.0. Samples: 14428640. Policy #0 lag: (min: 25.0, avg: 28.4, max: 57.0) +[2023-10-09 05:17:56,053][59242] Avg episode reward: [(0, '28.360'), (1, '27.840')] +[2023-10-09 05:17:56,385][60143] Updated weights for policy 0, policy_version 28022 (0.0008) +[2023-10-09 05:17:56,755][60143] Updated weights for policy 0, policy_version 28032 (0.0007) +[2023-10-09 05:17:57,222][60144] Updated weights for policy 1, policy_version 28322 (0.0009) +[2023-10-09 05:17:57,591][60144] Updated weights for policy 1, policy_version 28332 (0.0007) +[2023-10-09 05:17:57,955][60144] Updated weights for policy 1, policy_version 28342 (0.0007) +[2023-10-09 05:17:58,315][60144] Updated weights for policy 1, policy_version 28352 (0.0010) +[2023-10-09 05:18:00,756][60143] Updated weights for policy 0, policy_version 28042 (0.0007) +[2023-10-09 05:18:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57737216. Throughput: 0: 1713.6, 1: 1743.2. Samples: 14450088. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:18:01,053][59242] Avg episode reward: [(0, '28.630'), (1, '27.760')] +[2023-10-09 05:18:01,119][60143] Updated weights for policy 0, policy_version 28052 (0.0009) +[2023-10-09 05:18:01,497][60143] Updated weights for policy 0, policy_version 28062 (0.0008) +[2023-10-09 05:18:02,028][60144] Updated weights for policy 1, policy_version 28362 (0.0010) +[2023-10-09 05:18:02,403][60144] Updated weights for policy 1, policy_version 28372 (0.0010) +[2023-10-09 05:18:02,771][60144] Updated weights for policy 1, policy_version 28382 (0.0009) +[2023-10-09 05:18:05,507][60143] Updated weights for policy 0, policy_version 28072 (0.0008) +[2023-10-09 05:18:05,887][60143] Updated weights for policy 0, policy_version 28082 (0.0008) +[2023-10-09 05:18:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57802752. Throughput: 0: 1721.6, 1: 1724.7. Samples: 14459838. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:18:06,053][59242] Avg episode reward: [(0, '28.700'), (1, '27.410')] +[2023-10-09 05:18:06,269][60143] Updated weights for policy 0, policy_version 28092 (0.0010) +[2023-10-09 05:18:06,955][60144] Updated weights for policy 1, policy_version 28392 (0.0009) +[2023-10-09 05:18:07,318][60144] Updated weights for policy 1, policy_version 28402 (0.0009) +[2023-10-09 05:18:07,673][60144] Updated weights for policy 1, policy_version 28412 (0.0009) +[2023-10-09 05:18:10,148][60143] Updated weights for policy 0, policy_version 28102 (0.0009) +[2023-10-09 05:18:10,517][60143] Updated weights for policy 0, policy_version 28112 (0.0007) +[2023-10-09 05:18:10,889][60143] Updated weights for policy 0, policy_version 28122 (0.0009) +[2023-10-09 05:18:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 57868288. Throughput: 0: 1719.5, 1: 1726.9. Samples: 14481004. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:18:11,053][59242] Avg episode reward: [(0, '27.200'), (1, '26.590')] +[2023-10-09 05:18:11,769][60144] Updated weights for policy 1, policy_version 28422 (0.0007) +[2023-10-09 05:18:12,126][60144] Updated weights for policy 1, policy_version 28432 (0.0009) +[2023-10-09 05:18:12,494][60144] Updated weights for policy 1, policy_version 28442 (0.0008) +[2023-10-09 05:18:14,846][60143] Updated weights for policy 0, policy_version 28132 (0.0009) +[2023-10-09 05:18:15,223][60143] Updated weights for policy 0, policy_version 28142 (0.0010) +[2023-10-09 05:18:15,584][60143] Updated weights for policy 0, policy_version 28152 (0.0010) +[2023-10-09 05:18:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 57966592. Throughput: 0: 1698.0, 1: 1751.1. Samples: 14501516. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:18:16,053][59242] Avg episode reward: [(0, '27.680'), (1, '26.760')] +[2023-10-09 05:18:16,303][60144] Updated weights for policy 1, policy_version 28452 (0.0007) +[2023-10-09 05:18:16,665][60144] Updated weights for policy 1, policy_version 28462 (0.0009) +[2023-10-09 05:18:17,032][60144] Updated weights for policy 1, policy_version 28472 (0.0009) +[2023-10-09 05:18:19,763][60143] Updated weights for policy 0, policy_version 28162 (0.0009) +[2023-10-09 05:18:20,140][60143] Updated weights for policy 0, policy_version 28172 (0.0007) +[2023-10-09 05:18:20,500][60143] Updated weights for policy 0, policy_version 28182 (0.0008) +[2023-10-09 05:18:20,868][60143] Updated weights for policy 0, policy_version 28192 (0.0008) +[2023-10-09 05:18:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 58032128. Throughput: 0: 1719.7, 1: 1720.5. Samples: 14511574. 
Policy #0 lag: (min: 2.0, avg: 2.2, max: 9.0) +[2023-10-09 05:18:21,053][59242] Avg episode reward: [(0, '27.730'), (1, '26.070')] +[2023-10-09 05:18:21,128][60144] Updated weights for policy 1, policy_version 28482 (0.0009) +[2023-10-09 05:18:21,494][60144] Updated weights for policy 1, policy_version 28492 (0.0009) +[2023-10-09 05:18:21,876][60144] Updated weights for policy 1, policy_version 28502 (0.0010) +[2023-10-09 05:18:22,242][60144] Updated weights for policy 1, policy_version 28512 (0.0010) +[2023-10-09 05:18:24,847][60143] Updated weights for policy 0, policy_version 28202 (0.0010) +[2023-10-09 05:18:25,210][60143] Updated weights for policy 0, policy_version 28212 (0.0009) +[2023-10-09 05:18:25,578][60143] Updated weights for policy 0, policy_version 28222 (0.0009) +[2023-10-09 05:18:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 58097664. Throughput: 0: 1711.4, 1: 1750.8. Samples: 14532488. Policy #0 lag: (min: 2.0, avg: 2.2, max: 9.0) +[2023-10-09 05:18:26,053][59242] Avg episode reward: [(0, '27.730'), (1, '26.190')] +[2023-10-09 05:18:26,081][60144] Updated weights for policy 1, policy_version 28522 (0.0008) +[2023-10-09 05:18:26,457][60144] Updated weights for policy 1, policy_version 28532 (0.0009) +[2023-10-09 05:18:26,835][60144] Updated weights for policy 1, policy_version 28542 (0.0007) +[2023-10-09 05:18:29,702][60143] Updated weights for policy 0, policy_version 28232 (0.0009) +[2023-10-09 05:18:30,089][60143] Updated weights for policy 0, policy_version 28242 (0.0009) +[2023-10-09 05:18:30,459][60143] Updated weights for policy 0, policy_version 28252 (0.0009) +[2023-10-09 05:18:30,635][60144] Updated weights for policy 1, policy_version 28552 (0.0009) +[2023-10-09 05:18:31,006][60144] Updated weights for policy 1, policy_version 28562 (0.0010) +[2023-10-09 05:18:31,053][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 58163200. 
Throughput: 0: 1691.1, 1: 1741.3. Samples: 14552440. Policy #0 lag: (min: 2.0, avg: 2.2, max: 9.0) +[2023-10-09 05:18:31,054][59242] Avg episode reward: [(0, '27.180'), (1, '26.570')] +[2023-10-09 05:18:31,375][60144] Updated weights for policy 1, policy_version 28572 (0.0009) +[2023-10-09 05:18:34,487][60143] Updated weights for policy 0, policy_version 28262 (0.0007) +[2023-10-09 05:18:34,862][60143] Updated weights for policy 0, policy_version 28272 (0.0009) +[2023-10-09 05:18:35,224][60143] Updated weights for policy 0, policy_version 28282 (0.0008) +[2023-10-09 05:18:35,261][60144] Updated weights for policy 1, policy_version 28582 (0.0009) +[2023-10-09 05:18:35,637][60144] Updated weights for policy 1, policy_version 28592 (0.0009) +[2023-10-09 05:18:36,003][60144] Updated weights for policy 1, policy_version 28602 (0.0009) +[2023-10-09 05:18:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 58228736. Throughput: 0: 1715.2, 1: 1737.0. Samples: 14563138. Policy #0 lag: (min: 2.0, avg: 2.2, max: 9.0) +[2023-10-09 05:18:36,053][59242] Avg episode reward: [(0, '26.780'), (1, '25.370')] +[2023-10-09 05:18:39,240][60143] Updated weights for policy 0, policy_version 28292 (0.0009) +[2023-10-09 05:18:39,606][60143] Updated weights for policy 0, policy_version 28302 (0.0011) +[2023-10-09 05:18:39,858][60144] Updated weights for policy 1, policy_version 28612 (0.0007) +[2023-10-09 05:18:39,975][60143] Updated weights for policy 0, policy_version 28312 (0.0007) +[2023-10-09 05:18:40,248][60144] Updated weights for policy 1, policy_version 28622 (0.0009) +[2023-10-09 05:18:40,607][60144] Updated weights for policy 1, policy_version 28632 (0.0011) +[2023-10-09 05:18:41,052][59242] Fps is (10 sec: 16384.7, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 58327040. Throughput: 0: 1705.6, 1: 1748.0. Samples: 14584054. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:18:41,053][59242] Avg episode reward: [(0, '26.620'), (1, '24.280')] +[2023-10-09 05:18:44,153][60143] Updated weights for policy 0, policy_version 28322 (0.0008) +[2023-10-09 05:18:44,535][60143] Updated weights for policy 0, policy_version 28332 (0.0009) +[2023-10-09 05:18:44,658][60144] Updated weights for policy 1, policy_version 28642 (0.0009) +[2023-10-09 05:18:44,908][60143] Updated weights for policy 0, policy_version 28342 (0.0008) +[2023-10-09 05:18:45,029][60144] Updated weights for policy 1, policy_version 28652 (0.0009) +[2023-10-09 05:18:45,285][60143] Updated weights for policy 0, policy_version 28352 (0.0008) +[2023-10-09 05:18:45,405][60144] Updated weights for policy 1, policy_version 28662 (0.0009) +[2023-10-09 05:18:45,767][60144] Updated weights for policy 1, policy_version 28672 (0.0008) +[2023-10-09 05:18:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 58392576. Throughput: 0: 1679.7, 1: 1715.6. Samples: 14602876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:18:46,053][59242] Avg episode reward: [(0, '27.710'), (1, '24.270')] +[2023-10-09 05:18:49,266][60143] Updated weights for policy 0, policy_version 28362 (0.0010) +[2023-10-09 05:18:49,635][60143] Updated weights for policy 0, policy_version 28372 (0.0007) +[2023-10-09 05:18:49,636][60144] Updated weights for policy 1, policy_version 28682 (0.0007) +[2023-10-09 05:18:50,004][60144] Updated weights for policy 1, policy_version 28692 (0.0008) +[2023-10-09 05:18:50,005][60143] Updated weights for policy 0, policy_version 28382 (0.0009) +[2023-10-09 05:18:50,366][60144] Updated weights for policy 1, policy_version 28702 (0.0010) +[2023-10-09 05:18:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 58458112. Throughput: 0: 1699.8, 1: 1735.0. Samples: 14614406. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:18:51,053][59242] Avg episode reward: [(0, '27.130'), (1, '23.820')] +[2023-10-09 05:18:53,998][60143] Updated weights for policy 0, policy_version 28392 (0.0010) +[2023-10-09 05:18:54,325][60144] Updated weights for policy 1, policy_version 28712 (0.0008) +[2023-10-09 05:18:54,366][60143] Updated weights for policy 0, policy_version 28402 (0.0007) +[2023-10-09 05:18:54,701][60144] Updated weights for policy 1, policy_version 28722 (0.0009) +[2023-10-09 05:18:54,735][60143] Updated weights for policy 0, policy_version 28412 (0.0008) +[2023-10-09 05:18:55,062][60144] Updated weights for policy 1, policy_version 28732 (0.0009) +[2023-10-09 05:18:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 58523648. Throughput: 0: 1679.2, 1: 1726.2. Samples: 14634250. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 05:18:56,053][59242] Avg episode reward: [(0, '27.010'), (1, '23.260')] +[2023-10-09 05:18:58,473][60143] Updated weights for policy 0, policy_version 28422 (0.0008) +[2023-10-09 05:18:58,852][60143] Updated weights for policy 0, policy_version 28432 (0.0008) +[2023-10-09 05:18:59,007][60144] Updated weights for policy 1, policy_version 28742 (0.0009) +[2023-10-09 05:18:59,219][60143] Updated weights for policy 0, policy_version 28442 (0.0009) +[2023-10-09 05:18:59,377][60144] Updated weights for policy 1, policy_version 28752 (0.0010) +[2023-10-09 05:18:59,739][60144] Updated weights for policy 1, policy_version 28762 (0.0008) +[2023-10-09 05:19:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.6). Total num frames: 58589184. Throughput: 0: 1692.6, 1: 1705.5. Samples: 14654432. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:19:01,053][59242] Avg episode reward: [(0, '26.890'), (1, '24.150')] +[2023-10-09 05:19:03,100][60143] Updated weights for policy 0, policy_version 28452 (0.0008) +[2023-10-09 05:19:03,465][60143] Updated weights for policy 0, policy_version 28462 (0.0007) +[2023-10-09 05:19:03,843][60143] Updated weights for policy 0, policy_version 28472 (0.0008) +[2023-10-09 05:19:03,882][60144] Updated weights for policy 1, policy_version 28772 (0.0008) +[2023-10-09 05:19:04,247][60144] Updated weights for policy 1, policy_version 28782 (0.0009) +[2023-10-09 05:19:04,605][60144] Updated weights for policy 1, policy_version 28792 (0.0008) +[2023-10-09 05:19:06,052][59242] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 58654720. Throughput: 0: 1687.7, 1: 1735.1. Samples: 14665600. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:19:06,052][59242] Avg episode reward: [(0, '25.600'), (1, '26.210')] +[2023-10-09 05:19:07,957][60143] Updated weights for policy 0, policy_version 28482 (0.0010) +[2023-10-09 05:19:08,335][60143] Updated weights for policy 0, policy_version 28492 (0.0008) +[2023-10-09 05:19:08,638][60144] Updated weights for policy 1, policy_version 28802 (0.0008) +[2023-10-09 05:19:08,707][60143] Updated weights for policy 0, policy_version 28502 (0.0007) +[2023-10-09 05:19:09,014][60144] Updated weights for policy 1, policy_version 28812 (0.0010) +[2023-10-09 05:19:09,073][60143] Updated weights for policy 0, policy_version 28512 (0.0008) +[2023-10-09 05:19:09,387][60144] Updated weights for policy 1, policy_version 28822 (0.0007) +[2023-10-09 05:19:09,756][60144] Updated weights for policy 1, policy_version 28832 (0.0008) +[2023-10-09 05:19:11,052][59242] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 58720256. Throughput: 0: 1674.4, 1: 1708.0. Samples: 14684696. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:19:11,053][59242] Avg episode reward: [(0, '26.890'), (1, '26.390')] +[2023-10-09 05:19:13,127][60143] Updated weights for policy 0, policy_version 28522 (0.0008) +[2023-10-09 05:19:13,494][60143] Updated weights for policy 0, policy_version 28532 (0.0008) +[2023-10-09 05:19:13,730][60144] Updated weights for policy 1, policy_version 28842 (0.0008) +[2023-10-09 05:19:13,869][60143] Updated weights for policy 0, policy_version 28542 (0.0008) +[2023-10-09 05:19:14,090][60144] Updated weights for policy 1, policy_version 28852 (0.0009) +[2023-10-09 05:19:14,457][60144] Updated weights for policy 1, policy_version 28862 (0.0007) +[2023-10-09 05:19:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 58785792. Throughput: 0: 1698.4, 1: 1705.2. Samples: 14705602. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:19:16,053][59242] Avg episode reward: [(0, '25.900'), (1, '25.550')] +[2023-10-09 05:19:18,185][60143] Updated weights for policy 0, policy_version 28552 (0.0009) +[2023-10-09 05:19:18,244][60144] Updated weights for policy 1, policy_version 28872 (0.0008) +[2023-10-09 05:19:18,553][60143] Updated weights for policy 0, policy_version 28562 (0.0008) +[2023-10-09 05:19:18,610][60144] Updated weights for policy 1, policy_version 28882 (0.0008) +[2023-10-09 05:19:18,925][60143] Updated weights for policy 0, policy_version 28572 (0.0008) +[2023-10-09 05:19:18,969][60144] Updated weights for policy 1, policy_version 28892 (0.0009) +[2023-10-09 05:19:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 58851328. Throughput: 0: 1687.5, 1: 1713.0. Samples: 14716160. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-09 05:19:21,052][59242] Avg episode reward: [(0, '26.300'), (1, '25.500')] +[2023-10-09 05:19:22,982][60144] Updated weights for policy 1, policy_version 28902 (0.0007) +[2023-10-09 05:19:22,996][60143] Updated weights for policy 0, policy_version 28582 (0.0008) +[2023-10-09 05:19:23,349][60144] Updated weights for policy 1, policy_version 28912 (0.0007) +[2023-10-09 05:19:23,369][60143] Updated weights for policy 0, policy_version 28592 (0.0008) +[2023-10-09 05:19:23,724][60144] Updated weights for policy 1, policy_version 28922 (0.0007) +[2023-10-09 05:19:23,750][60143] Updated weights for policy 0, policy_version 28602 (0.0008) +[2023-10-09 05:19:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 58916864. Throughput: 0: 1675.2, 1: 1695.1. Samples: 14735720. Policy #0 lag: (min: 17.0, avg: 28.4, max: 49.0) +[2023-10-09 05:19:26,052][59242] Avg episode reward: [(0, '28.660'), (1, '24.810')] +[2023-10-09 05:19:27,783][60144] Updated weights for policy 1, policy_version 28932 (0.0008) +[2023-10-09 05:19:27,813][60143] Updated weights for policy 0, policy_version 28612 (0.0007) +[2023-10-09 05:19:28,176][60144] Updated weights for policy 1, policy_version 28942 (0.0007) +[2023-10-09 05:19:28,192][60143] Updated weights for policy 0, policy_version 28622 (0.0008) +[2023-10-09 05:19:28,541][60144] Updated weights for policy 1, policy_version 28952 (0.0008) +[2023-10-09 05:19:28,559][60143] Updated weights for policy 0, policy_version 28632 (0.0009) +[2023-10-09 05:19:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 58982400. Throughput: 0: 1697.8, 1: 1719.0. Samples: 14756634. 
Policy #0 lag: (min: 17.0, avg: 28.4, max: 49.0) +[2023-10-09 05:19:31,053][59242] Avg episode reward: [(0, '28.480'), (1, '23.970')] +[2023-10-09 05:19:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000028960_29655040.pth... +[2023-10-09 05:19:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000028640_29327360.pth... +[2023-10-09 05:19:31,091][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000027360_28016640.pth +[2023-10-09 05:19:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000027040_27688960.pth +[2023-10-09 05:19:32,548][60144] Updated weights for policy 1, policy_version 28962 (0.0008) +[2023-10-09 05:19:32,605][60143] Updated weights for policy 0, policy_version 28642 (0.0009) +[2023-10-09 05:19:32,917][60144] Updated weights for policy 1, policy_version 28972 (0.0007) +[2023-10-09 05:19:32,974][60143] Updated weights for policy 0, policy_version 28652 (0.0007) +[2023-10-09 05:19:33,275][60144] Updated weights for policy 1, policy_version 28982 (0.0008) +[2023-10-09 05:19:33,344][60143] Updated weights for policy 0, policy_version 28662 (0.0009) +[2023-10-09 05:19:33,643][60144] Updated weights for policy 1, policy_version 28992 (0.0008) +[2023-10-09 05:19:33,711][60143] Updated weights for policy 0, policy_version 28672 (0.0010) +[2023-10-09 05:19:36,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 59047936. Throughput: 0: 1679.7, 1: 1699.7. Samples: 14766480. 
Policy #0 lag: (min: 17.0, avg: 28.4, max: 49.0) +[2023-10-09 05:19:36,053][59242] Avg episode reward: [(0, '28.410'), (1, '24.130')] +[2023-10-09 05:19:37,576][60143] Updated weights for policy 0, policy_version 28682 (0.0008) +[2023-10-09 05:19:37,661][60144] Updated weights for policy 1, policy_version 29002 (0.0007) +[2023-10-09 05:19:37,954][60143] Updated weights for policy 0, policy_version 28692 (0.0007) +[2023-10-09 05:19:38,026][60144] Updated weights for policy 1, policy_version 29012 (0.0008) +[2023-10-09 05:19:38,318][60143] Updated weights for policy 0, policy_version 28702 (0.0007) +[2023-10-09 05:19:38,391][60144] Updated weights for policy 1, policy_version 29022 (0.0010) +[2023-10-09 05:19:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59113472. Throughput: 0: 1698.8, 1: 1706.1. Samples: 14787470. Policy #0 lag: (min: 17.0, avg: 28.4, max: 49.0) +[2023-10-09 05:19:41,053][59242] Avg episode reward: [(0, '28.300'), (1, '25.110')] +[2023-10-09 05:19:42,345][60143] Updated weights for policy 0, policy_version 28712 (0.0007) +[2023-10-09 05:19:42,471][60144] Updated weights for policy 1, policy_version 29032 (0.0007) +[2023-10-09 05:19:42,716][60143] Updated weights for policy 0, policy_version 28722 (0.0007) +[2023-10-09 05:19:42,841][60144] Updated weights for policy 1, policy_version 29042 (0.0007) +[2023-10-09 05:19:43,078][60143] Updated weights for policy 0, policy_version 28732 (0.0009) +[2023-10-09 05:19:43,202][60144] Updated weights for policy 1, policy_version 29052 (0.0007) +[2023-10-09 05:19:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59179008. Throughput: 0: 1701.2, 1: 1718.1. Samples: 14808298. 
Policy #0 lag: (min: 17.0, avg: 28.4, max: 49.0) +[2023-10-09 05:19:46,053][59242] Avg episode reward: [(0, '28.290'), (1, '24.940')] +[2023-10-09 05:19:47,114][60144] Updated weights for policy 1, policy_version 29062 (0.0008) +[2023-10-09 05:19:47,174][60143] Updated weights for policy 0, policy_version 28742 (0.0009) +[2023-10-09 05:19:47,496][60144] Updated weights for policy 1, policy_version 29072 (0.0008) +[2023-10-09 05:19:47,536][60143] Updated weights for policy 0, policy_version 28752 (0.0008) +[2023-10-09 05:19:47,856][60144] Updated weights for policy 1, policy_version 29082 (0.0008) +[2023-10-09 05:19:47,909][60143] Updated weights for policy 0, policy_version 28762 (0.0007) +[2023-10-09 05:19:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59244544. Throughput: 0: 1686.5, 1: 1689.2. Samples: 14817510. Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) +[2023-10-09 05:19:51,053][59242] Avg episode reward: [(0, '28.910'), (1, '25.530')] +[2023-10-09 05:19:51,724][60144] Updated weights for policy 1, policy_version 29092 (0.0010) +[2023-10-09 05:19:51,967][60143] Updated weights for policy 0, policy_version 28772 (0.0009) +[2023-10-09 05:19:52,091][60144] Updated weights for policy 1, policy_version 29102 (0.0008) +[2023-10-09 05:19:52,346][60143] Updated weights for policy 0, policy_version 28782 (0.0009) +[2023-10-09 05:19:52,457][60144] Updated weights for policy 1, policy_version 29112 (0.0007) +[2023-10-09 05:19:52,716][60143] Updated weights for policy 0, policy_version 28792 (0.0009) +[2023-10-09 05:19:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59310080. Throughput: 0: 1702.3, 1: 1724.1. Samples: 14838882. Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) +[2023-10-09 05:19:56,053][59242] Avg episode reward: [(0, '29.480'), (1, '25.790')] +[2023-10-09 05:19:56,054][59934] Saving new best policy, reward=29.480! 
+[2023-10-09 05:19:56,478][60144] Updated weights for policy 1, policy_version 29122 (0.0008) +[2023-10-09 05:19:56,529][60143] Updated weights for policy 0, policy_version 28802 (0.0009) +[2023-10-09 05:19:56,843][60144] Updated weights for policy 1, policy_version 29132 (0.0007) +[2023-10-09 05:19:56,892][60143] Updated weights for policy 0, policy_version 28812 (0.0009) +[2023-10-09 05:19:57,211][60144] Updated weights for policy 1, policy_version 29142 (0.0007) +[2023-10-09 05:19:57,267][60143] Updated weights for policy 0, policy_version 28822 (0.0008) +[2023-10-09 05:19:57,571][60144] Updated weights for policy 1, policy_version 29152 (0.0007) +[2023-10-09 05:19:57,636][60143] Updated weights for policy 0, policy_version 28832 (0.0009) +[2023-10-09 05:20:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 59375616. Throughput: 0: 1700.3, 1: 1730.8. Samples: 14859998. Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) +[2023-10-09 05:20:01,052][59242] Avg episode reward: [(0, '28.730'), (1, '26.310')] +[2023-10-09 05:20:01,382][60144] Updated weights for policy 1, policy_version 29162 (0.0008) +[2023-10-09 05:20:01,669][60143] Updated weights for policy 0, policy_version 28842 (0.0008) +[2023-10-09 05:20:01,742][60144] Updated weights for policy 1, policy_version 29172 (0.0007) +[2023-10-09 05:20:02,035][60143] Updated weights for policy 0, policy_version 28852 (0.0010) +[2023-10-09 05:20:02,108][60144] Updated weights for policy 1, policy_version 29182 (0.0009) +[2023-10-09 05:20:02,402][60143] Updated weights for policy 0, policy_version 28862 (0.0009) +[2023-10-09 05:20:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59441152. Throughput: 0: 1686.4, 1: 1716.2. Samples: 14869274. 
Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) +[2023-10-09 05:20:06,052][59242] Avg episode reward: [(0, '27.660'), (1, '26.840')] +[2023-10-09 05:20:06,137][60144] Updated weights for policy 1, policy_version 29192 (0.0008) +[2023-10-09 05:20:06,466][60143] Updated weights for policy 0, policy_version 28872 (0.0007) +[2023-10-09 05:20:06,505][60144] Updated weights for policy 1, policy_version 29202 (0.0008) +[2023-10-09 05:20:06,829][60143] Updated weights for policy 0, policy_version 28882 (0.0007) +[2023-10-09 05:20:06,867][60144] Updated weights for policy 1, policy_version 29212 (0.0008) +[2023-10-09 05:20:07,201][60143] Updated weights for policy 0, policy_version 28892 (0.0007) +[2023-10-09 05:20:10,749][60144] Updated weights for policy 1, policy_version 29222 (0.0008) +[2023-10-09 05:20:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59506688. Throughput: 0: 1707.9, 1: 1727.0. Samples: 14890290. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:20:11,052][59242] Avg episode reward: [(0, '26.630'), (1, '26.930')] +[2023-10-09 05:20:11,093][60143] Updated weights for policy 0, policy_version 28902 (0.0010) +[2023-10-09 05:20:11,111][60144] Updated weights for policy 1, policy_version 29232 (0.0007) +[2023-10-09 05:20:11,469][60143] Updated weights for policy 0, policy_version 28912 (0.0009) +[2023-10-09 05:20:11,481][60144] Updated weights for policy 1, policy_version 29242 (0.0007) +[2023-10-09 05:20:11,827][60143] Updated weights for policy 0, policy_version 28922 (0.0009) +[2023-10-09 05:20:15,611][60144] Updated weights for policy 1, policy_version 29252 (0.0009) +[2023-10-09 05:20:15,887][60143] Updated weights for policy 0, policy_version 28932 (0.0008) +[2023-10-09 05:20:16,009][60144] Updated weights for policy 1, policy_version 29262 (0.0009) +[2023-10-09 05:20:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59572224. 
Throughput: 0: 1713.8, 1: 1722.7. Samples: 14911276. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:20:16,053][59242] Avg episode reward: [(0, '26.880'), (1, '26.440')] +[2023-10-09 05:20:16,242][60143] Updated weights for policy 0, policy_version 28942 (0.0007) +[2023-10-09 05:20:16,370][60144] Updated weights for policy 1, policy_version 29272 (0.0007) +[2023-10-09 05:20:16,618][60143] Updated weights for policy 0, policy_version 28952 (0.0007) +[2023-10-09 05:20:20,364][60144] Updated weights for policy 1, policy_version 29282 (0.0007) +[2023-10-09 05:20:20,491][60143] Updated weights for policy 0, policy_version 28962 (0.0008) +[2023-10-09 05:20:20,734][60144] Updated weights for policy 1, policy_version 29292 (0.0010) +[2023-10-09 05:20:20,860][60143] Updated weights for policy 0, policy_version 28972 (0.0007) +[2023-10-09 05:20:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59637760. Throughput: 0: 1706.2, 1: 1720.7. Samples: 14920690. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:20:21,052][59242] Avg episode reward: [(0, '27.410'), (1, '26.130')] +[2023-10-09 05:20:21,110][60144] Updated weights for policy 1, policy_version 29302 (0.0009) +[2023-10-09 05:20:21,227][60143] Updated weights for policy 0, policy_version 28982 (0.0009) +[2023-10-09 05:20:21,481][60144] Updated weights for policy 1, policy_version 29312 (0.0007) +[2023-10-09 05:20:21,608][60143] Updated weights for policy 0, policy_version 28992 (0.0008) +[2023-10-09 05:20:25,303][60144] Updated weights for policy 1, policy_version 29322 (0.0007) +[2023-10-09 05:20:25,660][60144] Updated weights for policy 1, policy_version 29332 (0.0008) +[2023-10-09 05:20:25,717][60143] Updated weights for policy 0, policy_version 29002 (0.0009) +[2023-10-09 05:20:26,030][60144] Updated weights for policy 1, policy_version 29342 (0.0007) +[2023-10-09 05:20:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 59703296. Throughput: 0: 1705.7, 1: 1724.2. Samples: 14941816. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:20:26,053][59242] Avg episode reward: [(0, '28.020'), (1, '26.250')] +[2023-10-09 05:20:26,076][60143] Updated weights for policy 0, policy_version 29012 (0.0007) +[2023-10-09 05:20:26,453][60143] Updated weights for policy 0, policy_version 29022 (0.0009) +[2023-10-09 05:20:30,003][60144] Updated weights for policy 1, policy_version 29352 (0.0007) +[2023-10-09 05:20:30,374][60144] Updated weights for policy 1, policy_version 29362 (0.0007) +[2023-10-09 05:20:30,538][60143] Updated weights for policy 0, policy_version 29032 (0.0007) +[2023-10-09 05:20:30,734][60144] Updated weights for policy 1, policy_version 29372 (0.0007) +[2023-10-09 05:20:30,903][60143] Updated weights for policy 0, policy_version 29042 (0.0008) +[2023-10-09 05:20:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 59801600. 
Throughput: 0: 1703.6, 1: 1708.5. Samples: 14961844. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:20:31,052][59242] Avg episode reward: [(0, '28.340'), (1, '25.840')] +[2023-10-09 05:20:31,273][60143] Updated weights for policy 0, policy_version 29052 (0.0009) +[2023-10-09 05:20:34,735][60144] Updated weights for policy 1, policy_version 29382 (0.0007) +[2023-10-09 05:20:35,094][60144] Updated weights for policy 1, policy_version 29392 (0.0007) +[2023-10-09 05:20:35,181][60143] Updated weights for policy 0, policy_version 29062 (0.0008) +[2023-10-09 05:20:35,461][60144] Updated weights for policy 1, policy_version 29402 (0.0010) +[2023-10-09 05:20:35,545][60143] Updated weights for policy 0, policy_version 29072 (0.0008) +[2023-10-09 05:20:35,919][60143] Updated weights for policy 0, policy_version 29082 (0.0008) +[2023-10-09 05:20:36,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 59867136. Throughput: 0: 1707.6, 1: 1730.2. Samples: 14972212. Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-09 05:20:36,052][59242] Avg episode reward: [(0, '27.780'), (1, '26.650')] +[2023-10-09 05:20:39,407][60144] Updated weights for policy 1, policy_version 29412 (0.0010) +[2023-10-09 05:20:39,775][60144] Updated weights for policy 1, policy_version 29422 (0.0008) +[2023-10-09 05:20:39,779][60143] Updated weights for policy 0, policy_version 29092 (0.0008) +[2023-10-09 05:20:40,136][60144] Updated weights for policy 1, policy_version 29432 (0.0007) +[2023-10-09 05:20:40,147][60143] Updated weights for policy 0, policy_version 29102 (0.0007) +[2023-10-09 05:20:40,511][60143] Updated weights for policy 0, policy_version 29112 (0.0008) +[2023-10-09 05:20:41,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 59965440. Throughput: 0: 1715.8, 1: 1713.2. Samples: 14993190. 
Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-09 05:20:41,053][59242] Avg episode reward: [(0, '28.830'), (1, '27.500')] +[2023-10-09 05:20:44,104][60144] Updated weights for policy 1, policy_version 29442 (0.0008) +[2023-10-09 05:20:44,478][60144] Updated weights for policy 1, policy_version 29452 (0.0008) +[2023-10-09 05:20:44,571][60143] Updated weights for policy 0, policy_version 29122 (0.0011) +[2023-10-09 05:20:44,842][60144] Updated weights for policy 1, policy_version 29462 (0.0008) +[2023-10-09 05:20:44,942][60143] Updated weights for policy 0, policy_version 29132 (0.0008) +[2023-10-09 05:20:45,207][60144] Updated weights for policy 1, policy_version 29472 (0.0008) +[2023-10-09 05:20:45,312][60143] Updated weights for policy 0, policy_version 29142 (0.0009) +[2023-10-09 05:20:45,680][60143] Updated weights for policy 0, policy_version 29152 (0.0011) +[2023-10-09 05:20:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 60030976. Throughput: 0: 1690.3, 1: 1694.0. Samples: 15012294. Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-09 05:20:46,053][59242] Avg episode reward: [(0, '26.260'), (1, '27.630')] +[2023-10-09 05:20:49,173][60144] Updated weights for policy 1, policy_version 29482 (0.0009) +[2023-10-09 05:20:49,542][60144] Updated weights for policy 1, policy_version 29492 (0.0007) +[2023-10-09 05:20:49,683][60143] Updated weights for policy 0, policy_version 29162 (0.0007) +[2023-10-09 05:20:49,910][60144] Updated weights for policy 1, policy_version 29502 (0.0008) +[2023-10-09 05:20:50,054][60143] Updated weights for policy 0, policy_version 29172 (0.0007) +[2023-10-09 05:20:50,426][60143] Updated weights for policy 0, policy_version 29182 (0.0007) +[2023-10-09 05:20:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 60096512. Throughput: 0: 1717.5, 1: 1724.8. Samples: 15024180. 
Policy #0 lag: (min: 21.0, avg: 29.0, max: 53.0) +[2023-10-09 05:20:51,053][59242] Avg episode reward: [(0, '26.190'), (1, '27.670')] +[2023-10-09 05:20:53,791][60144] Updated weights for policy 1, policy_version 29512 (0.0008) +[2023-10-09 05:20:54,160][60144] Updated weights for policy 1, policy_version 29522 (0.0009) +[2023-10-09 05:20:54,369][60143] Updated weights for policy 0, policy_version 29192 (0.0008) +[2023-10-09 05:20:54,528][60144] Updated weights for policy 1, policy_version 29532 (0.0008) +[2023-10-09 05:20:54,744][60143] Updated weights for policy 0, policy_version 29202 (0.0008) +[2023-10-09 05:20:55,115][60143] Updated weights for policy 0, policy_version 29212 (0.0008) +[2023-10-09 05:20:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 60162048. Throughput: 0: 1708.5, 1: 1706.0. Samples: 15043942. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:20:56,053][59242] Avg episode reward: [(0, '26.980'), (1, '27.720')] +[2023-10-09 05:20:58,297][60144] Updated weights for policy 1, policy_version 29542 (0.0008) +[2023-10-09 05:20:58,672][60144] Updated weights for policy 1, policy_version 29552 (0.0007) +[2023-10-09 05:20:58,931][60143] Updated weights for policy 0, policy_version 29222 (0.0008) +[2023-10-09 05:20:59,039][60144] Updated weights for policy 1, policy_version 29562 (0.0008) +[2023-10-09 05:20:59,291][60143] Updated weights for policy 0, policy_version 29232 (0.0008) +[2023-10-09 05:20:59,664][60143] Updated weights for policy 0, policy_version 29242 (0.0011) +[2023-10-09 05:21:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 60227584. Throughput: 0: 1687.0, 1: 1717.8. Samples: 15064492. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:21:01,052][59242] Avg episode reward: [(0, '27.720'), (1, '29.710')] +[2023-10-09 05:21:01,062][60003] Saving new best policy, reward=29.710! 
+[2023-10-09 05:21:03,192][60144] Updated weights for policy 1, policy_version 29572 (0.0007) +[2023-10-09 05:21:03,588][60144] Updated weights for policy 1, policy_version 29582 (0.0007) +[2023-10-09 05:21:03,876][60143] Updated weights for policy 0, policy_version 29252 (0.0009) +[2023-10-09 05:21:03,961][60144] Updated weights for policy 1, policy_version 29592 (0.0008) +[2023-10-09 05:21:04,245][60143] Updated weights for policy 0, policy_version 29262 (0.0010) +[2023-10-09 05:21:04,615][60143] Updated weights for policy 0, policy_version 29272 (0.0008) +[2023-10-09 05:21:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 60293120. Throughput: 0: 1713.9, 1: 1728.4. Samples: 15075596. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:21:06,053][59242] Avg episode reward: [(0, '26.250'), (1, '27.510')] +[2023-10-09 05:21:07,941][60144] Updated weights for policy 1, policy_version 29602 (0.0007) +[2023-10-09 05:21:08,305][60144] Updated weights for policy 1, policy_version 29612 (0.0007) +[2023-10-09 05:21:08,632][60143] Updated weights for policy 0, policy_version 29282 (0.0007) +[2023-10-09 05:21:08,687][60144] Updated weights for policy 1, policy_version 29622 (0.0007) +[2023-10-09 05:21:08,990][60143] Updated weights for policy 0, policy_version 29292 (0.0008) +[2023-10-09 05:21:09,052][60144] Updated weights for policy 1, policy_version 29632 (0.0009) +[2023-10-09 05:21:09,364][60143] Updated weights for policy 0, policy_version 29302 (0.0009) +[2023-10-09 05:21:09,736][60143] Updated weights for policy 0, policy_version 29312 (0.0008) +[2023-10-09 05:21:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 60358656. Throughput: 0: 1694.6, 1: 1709.8. Samples: 15095016. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:21:11,053][59242] Avg episode reward: [(0, '25.340'), (1, '28.110')] +[2023-10-09 05:21:13,011][60144] Updated weights for policy 1, policy_version 29642 (0.0010) +[2023-10-09 05:21:13,379][60144] Updated weights for policy 1, policy_version 29652 (0.0010) +[2023-10-09 05:21:13,724][60143] Updated weights for policy 0, policy_version 29322 (0.0007) +[2023-10-09 05:21:13,748][60144] Updated weights for policy 1, policy_version 29662 (0.0007) +[2023-10-09 05:21:14,100][60143] Updated weights for policy 0, policy_version 29332 (0.0010) +[2023-10-09 05:21:14,472][60143] Updated weights for policy 0, policy_version 29342 (0.0010) +[2023-10-09 05:21:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 60424192. Throughput: 0: 1692.8, 1: 1728.3. Samples: 15115794. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 05:21:16,052][59242] Avg episode reward: [(0, '26.190'), (1, '27.070')] +[2023-10-09 05:21:17,740][60144] Updated weights for policy 1, policy_version 29672 (0.0008) +[2023-10-09 05:21:18,109][60144] Updated weights for policy 1, policy_version 29682 (0.0009) +[2023-10-09 05:21:18,480][60144] Updated weights for policy 1, policy_version 29692 (0.0008) +[2023-10-09 05:21:18,561][60143] Updated weights for policy 0, policy_version 29352 (0.0008) +[2023-10-09 05:21:18,936][60143] Updated weights for policy 0, policy_version 29362 (0.0008) +[2023-10-09 05:21:19,309][60143] Updated weights for policy 0, policy_version 29372 (0.0009) +[2023-10-09 05:21:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 60489728. Throughput: 0: 1717.5, 1: 1709.2. Samples: 15126416. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 05:21:21,053][59242] Avg episode reward: [(0, '26.500'), (1, '26.890')] +[2023-10-09 05:21:22,379][60144] Updated weights for policy 1, policy_version 29702 (0.0011) +[2023-10-09 05:21:22,749][60144] Updated weights for policy 1, policy_version 29712 (0.0011) +[2023-10-09 05:21:23,118][60144] Updated weights for policy 1, policy_version 29722 (0.0009) +[2023-10-09 05:21:23,322][60143] Updated weights for policy 0, policy_version 29382 (0.0009) +[2023-10-09 05:21:23,695][60143] Updated weights for policy 0, policy_version 29392 (0.0008) +[2023-10-09 05:21:24,075][60143] Updated weights for policy 0, policy_version 29402 (0.0009) +[2023-10-09 05:21:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 60555264. Throughput: 0: 1687.4, 1: 1722.9. Samples: 15146654. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 05:21:26,053][59242] Avg episode reward: [(0, '26.410'), (1, '25.570')] +[2023-10-09 05:21:26,898][60144] Updated weights for policy 1, policy_version 29732 (0.0008) +[2023-10-09 05:21:27,267][60144] Updated weights for policy 1, policy_version 29742 (0.0007) +[2023-10-09 05:21:27,635][60144] Updated weights for policy 1, policy_version 29752 (0.0008) +[2023-10-09 05:21:27,986][60143] Updated weights for policy 0, policy_version 29412 (0.0008) +[2023-10-09 05:21:28,349][60143] Updated weights for policy 0, policy_version 29422 (0.0007) +[2023-10-09 05:21:28,721][60143] Updated weights for policy 0, policy_version 29432 (0.0008) +[2023-10-09 05:21:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 60620800. Throughput: 0: 1714.7, 1: 1745.2. Samples: 15167988. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 05:21:31,053][59242] Avg episode reward: [(0, '28.340'), (1, '24.580')] +[2023-10-09 05:21:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000029760_30474240.pth... +[2023-10-09 05:21:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000029440_30146560.pth... +[2023-10-09 05:21:31,109][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000027840_28508160.pth +[2023-10-09 05:21:31,110][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000028160_28835840.pth +[2023-10-09 05:21:31,506][60144] Updated weights for policy 1, policy_version 29762 (0.0008) +[2023-10-09 05:21:31,868][60144] Updated weights for policy 1, policy_version 29772 (0.0007) +[2023-10-09 05:21:32,234][60144] Updated weights for policy 1, policy_version 29782 (0.0007) +[2023-10-09 05:21:32,600][60144] Updated weights for policy 1, policy_version 29792 (0.0007) +[2023-10-09 05:21:32,673][60143] Updated weights for policy 0, policy_version 29442 (0.0008) +[2023-10-09 05:21:33,036][60143] Updated weights for policy 0, policy_version 29452 (0.0007) +[2023-10-09 05:21:33,403][60143] Updated weights for policy 0, policy_version 29462 (0.0008) +[2023-10-09 05:21:33,769][60143] Updated weights for policy 0, policy_version 29472 (0.0010) +[2023-10-09 05:21:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 60686336. Throughput: 0: 1698.7, 1: 1714.2. Samples: 15177760. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 05:21:36,053][59242] Avg episode reward: [(0, '27.960'), (1, '23.540')] +[2023-10-09 05:21:36,636][60144] Updated weights for policy 1, policy_version 29802 (0.0009) +[2023-10-09 05:21:37,006][60144] Updated weights for policy 1, policy_version 29812 (0.0007) +[2023-10-09 05:21:37,379][60144] Updated weights for policy 1, policy_version 29822 (0.0010) +[2023-10-09 05:21:37,736][60143] Updated weights for policy 0, policy_version 29482 (0.0007) +[2023-10-09 05:21:38,113][60143] Updated weights for policy 0, policy_version 29492 (0.0007) +[2023-10-09 05:21:38,479][60143] Updated weights for policy 0, policy_version 29502 (0.0007) +[2023-10-09 05:21:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 60751872. Throughput: 0: 1698.5, 1: 1736.0. Samples: 15198492. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-09 05:21:41,053][59242] Avg episode reward: [(0, '27.130'), (1, '23.150')] +[2023-10-09 05:21:41,250][60144] Updated weights for policy 1, policy_version 29832 (0.0007) +[2023-10-09 05:21:41,610][60144] Updated weights for policy 1, policy_version 29842 (0.0007) +[2023-10-09 05:21:41,979][60144] Updated weights for policy 1, policy_version 29852 (0.0007) +[2023-10-09 05:21:42,446][60143] Updated weights for policy 0, policy_version 29512 (0.0010) +[2023-10-09 05:21:42,823][60143] Updated weights for policy 0, policy_version 29522 (0.0010) +[2023-10-09 05:21:43,189][60143] Updated weights for policy 0, policy_version 29532 (0.0010) +[2023-10-09 05:21:45,808][60144] Updated weights for policy 1, policy_version 29862 (0.0008) +[2023-10-09 05:21:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 60817408. Throughput: 0: 1716.2, 1: 1737.6. Samples: 15219912. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:21:46,053][59242] Avg episode reward: [(0, '26.860'), (1, '23.600')] +[2023-10-09 05:21:46,175][60144] Updated weights for policy 1, policy_version 29872 (0.0009) +[2023-10-09 05:21:46,540][60144] Updated weights for policy 1, policy_version 29882 (0.0009) +[2023-10-09 05:21:46,922][60143] Updated weights for policy 0, policy_version 29542 (0.0009) +[2023-10-09 05:21:47,284][60143] Updated weights for policy 0, policy_version 29552 (0.0010) +[2023-10-09 05:21:47,653][60143] Updated weights for policy 0, policy_version 29562 (0.0008) +[2023-10-09 05:21:50,461][60144] Updated weights for policy 1, policy_version 29892 (0.0008) +[2023-10-09 05:21:50,846][60144] Updated weights for policy 1, policy_version 29902 (0.0007) +[2023-10-09 05:21:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 60882944. Throughput: 0: 1686.8, 1: 1726.0. Samples: 15229170. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:21:51,053][59242] Avg episode reward: [(0, '26.400'), (1, '23.090')] +[2023-10-09 05:21:51,214][60144] Updated weights for policy 1, policy_version 29912 (0.0007) +[2023-10-09 05:21:51,665][60143] Updated weights for policy 0, policy_version 29572 (0.0010) +[2023-10-09 05:21:52,044][60143] Updated weights for policy 0, policy_version 29582 (0.0008) +[2023-10-09 05:21:52,410][60143] Updated weights for policy 0, policy_version 29592 (0.0008) +[2023-10-09 05:21:55,191][60144] Updated weights for policy 1, policy_version 29922 (0.0009) +[2023-10-09 05:21:55,562][60144] Updated weights for policy 1, policy_version 29932 (0.0009) +[2023-10-09 05:21:55,922][60144] Updated weights for policy 1, policy_version 29942 (0.0010) +[2023-10-09 05:21:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 60948480. Throughput: 0: 1708.4, 1: 1744.8. Samples: 15250412. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:21:56,053][59242] Avg episode reward: [(0, '26.280'), (1, '22.560')] +[2023-10-09 05:21:56,295][60144] Updated weights for policy 1, policy_version 29952 (0.0008) +[2023-10-09 05:21:56,504][60143] Updated weights for policy 0, policy_version 29602 (0.0007) +[2023-10-09 05:21:56,888][60143] Updated weights for policy 0, policy_version 29612 (0.0007) +[2023-10-09 05:21:57,259][60143] Updated weights for policy 0, policy_version 29622 (0.0011) +[2023-10-09 05:21:57,643][60143] Updated weights for policy 0, policy_version 29632 (0.0009) +[2023-10-09 05:22:00,334][60144] Updated weights for policy 1, policy_version 29962 (0.0007) +[2023-10-09 05:22:00,706][60144] Updated weights for policy 1, policy_version 29972 (0.0008) +[2023-10-09 05:22:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 61014016. Throughput: 0: 1717.8, 1: 1730.9. Samples: 15270986. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:22:01,053][59242] Avg episode reward: [(0, '26.720'), (1, '21.670')] +[2023-10-09 05:22:01,071][60144] Updated weights for policy 1, policy_version 29982 (0.0007) +[2023-10-09 05:22:01,710][60143] Updated weights for policy 0, policy_version 29642 (0.0008) +[2023-10-09 05:22:02,082][60143] Updated weights for policy 0, policy_version 29652 (0.0007) +[2023-10-09 05:22:02,451][60143] Updated weights for policy 0, policy_version 29662 (0.0009) +[2023-10-09 05:22:04,823][60144] Updated weights for policy 1, policy_version 29992 (0.0007) +[2023-10-09 05:22:05,184][60144] Updated weights for policy 1, policy_version 30002 (0.0007) +[2023-10-09 05:22:05,560][60144] Updated weights for policy 1, policy_version 30012 (0.0010) +[2023-10-09 05:22:06,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 61112320. Throughput: 0: 1687.9, 1: 1751.9. Samples: 15281204. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:22:06,053][59242] Avg episode reward: [(0, '26.440'), (1, '22.110')] +[2023-10-09 05:22:06,357][60143] Updated weights for policy 0, policy_version 29672 (0.0008) +[2023-10-09 05:22:06,734][60143] Updated weights for policy 0, policy_version 29682 (0.0007) +[2023-10-09 05:22:07,104][60143] Updated weights for policy 0, policy_version 29692 (0.0007) +[2023-10-09 05:22:09,331][60144] Updated weights for policy 1, policy_version 30022 (0.0008) +[2023-10-09 05:22:09,693][60144] Updated weights for policy 1, policy_version 30032 (0.0010) +[2023-10-09 05:22:10,074][60144] Updated weights for policy 1, policy_version 30042 (0.0008) +[2023-10-09 05:22:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 61177856. Throughput: 0: 1713.6, 1: 1742.7. Samples: 15302184. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 05:22:11,052][59242] Avg episode reward: [(0, '26.960'), (1, '22.040')] +[2023-10-09 05:22:11,310][60143] Updated weights for policy 0, policy_version 29702 (0.0010) +[2023-10-09 05:22:11,677][60143] Updated weights for policy 0, policy_version 29712 (0.0009) +[2023-10-09 05:22:12,050][60143] Updated weights for policy 0, policy_version 29722 (0.0007) +[2023-10-09 05:22:13,982][60144] Updated weights for policy 1, policy_version 30052 (0.0010) +[2023-10-09 05:22:14,351][60144] Updated weights for policy 1, policy_version 30062 (0.0009) +[2023-10-09 05:22:14,724][60144] Updated weights for policy 1, policy_version 30072 (0.0009) +[2023-10-09 05:22:15,935][60143] Updated weights for policy 0, policy_version 29732 (0.0008) +[2023-10-09 05:22:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 61243392. Throughput: 0: 1720.5, 1: 1723.5. Samples: 15322966. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 05:22:16,053][59242] Avg episode reward: [(0, '25.890'), (1, '21.210')] +[2023-10-09 05:22:16,307][60143] Updated weights for policy 0, policy_version 29742 (0.0008) +[2023-10-09 05:22:16,676][60143] Updated weights for policy 0, policy_version 29752 (0.0010) +[2023-10-09 05:22:18,644][60144] Updated weights for policy 1, policy_version 30082 (0.0008) +[2023-10-09 05:22:19,009][60144] Updated weights for policy 1, policy_version 30092 (0.0008) +[2023-10-09 05:22:19,371][60144] Updated weights for policy 1, policy_version 30102 (0.0010) +[2023-10-09 05:22:19,748][60144] Updated weights for policy 1, policy_version 30112 (0.0010) +[2023-10-09 05:22:20,516][60143] Updated weights for policy 0, policy_version 29762 (0.0008) +[2023-10-09 05:22:20,886][60143] Updated weights for policy 0, policy_version 29772 (0.0007) +[2023-10-09 05:22:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 61308928. Throughput: 0: 1710.9, 1: 1752.5. Samples: 15333614. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 05:22:21,053][59242] Avg episode reward: [(0, '26.550'), (1, '22.240')] +[2023-10-09 05:22:21,263][60143] Updated weights for policy 0, policy_version 29782 (0.0008) +[2023-10-09 05:22:21,629][60143] Updated weights for policy 0, policy_version 29792 (0.0009) +[2023-10-09 05:22:23,636][60144] Updated weights for policy 1, policy_version 30122 (0.0007) +[2023-10-09 05:22:24,008][60144] Updated weights for policy 1, policy_version 30132 (0.0007) +[2023-10-09 05:22:24,380][60144] Updated weights for policy 1, policy_version 30142 (0.0008) +[2023-10-09 05:22:25,657][60143] Updated weights for policy 0, policy_version 29802 (0.0009) +[2023-10-09 05:22:26,028][60143] Updated weights for policy 0, policy_version 29812 (0.0007) +[2023-10-09 05:22:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 61374464. 
Throughput: 0: 1721.6, 1: 1727.7. Samples: 15353710. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 05:22:26,052][59242] Avg episode reward: [(0, '27.570'), (1, '21.600')] +[2023-10-09 05:22:26,396][60143] Updated weights for policy 0, policy_version 29822 (0.0009) +[2023-10-09 05:22:28,331][60144] Updated weights for policy 1, policy_version 30152 (0.0011) +[2023-10-09 05:22:28,705][60144] Updated weights for policy 1, policy_version 30162 (0.0008) +[2023-10-09 05:22:29,073][60144] Updated weights for policy 1, policy_version 30172 (0.0009) +[2023-10-09 05:22:30,438][60143] Updated weights for policy 0, policy_version 29832 (0.0007) +[2023-10-09 05:22:30,819][60143] Updated weights for policy 0, policy_version 29842 (0.0008) +[2023-10-09 05:22:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 61440000. Throughput: 0: 1710.0, 1: 1722.4. Samples: 15374374. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 05:22:31,053][59242] Avg episode reward: [(0, '27.680'), (1, '21.910')] +[2023-10-09 05:22:31,194][60143] Updated weights for policy 0, policy_version 29852 (0.0011) +[2023-10-09 05:22:33,144][60144] Updated weights for policy 1, policy_version 30182 (0.0009) +[2023-10-09 05:22:33,519][60144] Updated weights for policy 1, policy_version 30192 (0.0009) +[2023-10-09 05:22:33,877][60144] Updated weights for policy 1, policy_version 30202 (0.0008) +[2023-10-09 05:22:35,135][60143] Updated weights for policy 0, policy_version 29862 (0.0009) +[2023-10-09 05:22:35,501][60143] Updated weights for policy 0, policy_version 29872 (0.0010) +[2023-10-09 05:22:35,869][60143] Updated weights for policy 0, policy_version 29882 (0.0008) +[2023-10-09 05:22:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 61505536. Throughput: 0: 1720.0, 1: 1733.5. Samples: 15384578. 
Policy #0 lag: (min: 6.0, avg: 17.1, max: 38.0) +[2023-10-09 05:22:36,053][59242] Avg episode reward: [(0, '27.660'), (1, '22.010')] +[2023-10-09 05:22:37,732][60144] Updated weights for policy 1, policy_version 30212 (0.0008) +[2023-10-09 05:22:38,103][60144] Updated weights for policy 1, policy_version 30222 (0.0009) +[2023-10-09 05:22:38,462][60144] Updated weights for policy 1, policy_version 30232 (0.0009) +[2023-10-09 05:22:39,745][60143] Updated weights for policy 0, policy_version 29892 (0.0009) +[2023-10-09 05:22:40,112][60143] Updated weights for policy 0, policy_version 29902 (0.0008) +[2023-10-09 05:22:40,483][60143] Updated weights for policy 0, policy_version 29912 (0.0008) +[2023-10-09 05:22:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 61603840. Throughput: 0: 1723.9, 1: 1724.4. Samples: 15405588. Policy #0 lag: (min: 6.0, avg: 17.1, max: 38.0) +[2023-10-09 05:22:41,053][59242] Avg episode reward: [(0, '26.610'), (1, '23.010')] +[2023-10-09 05:22:42,431][60144] Updated weights for policy 1, policy_version 30242 (0.0011) +[2023-10-09 05:22:42,832][60144] Updated weights for policy 1, policy_version 30252 (0.0010) +[2023-10-09 05:22:43,196][60144] Updated weights for policy 1, policy_version 30262 (0.0010) +[2023-10-09 05:22:43,578][60144] Updated weights for policy 1, policy_version 30272 (0.0010) +[2023-10-09 05:22:44,415][60143] Updated weights for policy 0, policy_version 29922 (0.0007) +[2023-10-09 05:22:44,785][60143] Updated weights for policy 0, policy_version 29932 (0.0009) +[2023-10-09 05:22:45,156][60143] Updated weights for policy 0, policy_version 29942 (0.0008) +[2023-10-09 05:22:45,525][60143] Updated weights for policy 0, policy_version 29952 (0.0009) +[2023-10-09 05:22:46,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 61669376. Throughput: 0: 1698.9, 1: 1737.5. Samples: 15425626. 
Policy #0 lag: (min: 6.0, avg: 17.1, max: 38.0) +[2023-10-09 05:22:46,053][59242] Avg episode reward: [(0, '26.270'), (1, '23.040')] +[2023-10-09 05:22:47,535][60144] Updated weights for policy 1, policy_version 30282 (0.0008) +[2023-10-09 05:22:47,892][60144] Updated weights for policy 1, policy_version 30292 (0.0009) +[2023-10-09 05:22:48,263][60144] Updated weights for policy 1, policy_version 30302 (0.0008) +[2023-10-09 05:22:49,572][60143] Updated weights for policy 0, policy_version 29962 (0.0010) +[2023-10-09 05:22:49,938][60143] Updated weights for policy 0, policy_version 29972 (0.0007) +[2023-10-09 05:22:50,314][60143] Updated weights for policy 0, policy_version 29982 (0.0008) +[2023-10-09 05:22:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 61734912. Throughput: 0: 1730.4, 1: 1716.2. Samples: 15436302. Policy #0 lag: (min: 6.0, avg: 17.1, max: 38.0) +[2023-10-09 05:22:51,052][59242] Avg episode reward: [(0, '26.330'), (1, '24.440')] +[2023-10-09 05:22:52,110][60144] Updated weights for policy 1, policy_version 30312 (0.0011) +[2023-10-09 05:22:52,479][60144] Updated weights for policy 1, policy_version 30322 (0.0010) +[2023-10-09 05:22:52,846][60144] Updated weights for policy 1, policy_version 30332 (0.0007) +[2023-10-09 05:22:54,262][60143] Updated weights for policy 0, policy_version 29992 (0.0008) +[2023-10-09 05:22:54,627][60143] Updated weights for policy 0, policy_version 30002 (0.0008) +[2023-10-09 05:22:54,999][60143] Updated weights for policy 0, policy_version 30012 (0.0008) +[2023-10-09 05:22:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 61800448. Throughput: 0: 1712.8, 1: 1723.9. Samples: 15456834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:22:56,053][59242] Avg episode reward: [(0, '27.020'), (1, '25.230')] +[2023-10-09 05:22:56,910][60144] Updated weights for policy 1, policy_version 30342 (0.0008) +[2023-10-09 05:22:57,282][60144] Updated weights for policy 1, policy_version 30352 (0.0008) +[2023-10-09 05:22:57,655][60144] Updated weights for policy 1, policy_version 30362 (0.0008) +[2023-10-09 05:22:58,953][60143] Updated weights for policy 0, policy_version 30022 (0.0010) +[2023-10-09 05:22:59,329][60143] Updated weights for policy 0, policy_version 30032 (0.0010) +[2023-10-09 05:22:59,700][60143] Updated weights for policy 0, policy_version 30042 (0.0007) +[2023-10-09 05:23:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 61865984. Throughput: 0: 1693.7, 1: 1739.6. Samples: 15477468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:23:01,052][59242] Avg episode reward: [(0, '26.080'), (1, '25.300')] +[2023-10-09 05:23:01,480][60144] Updated weights for policy 1, policy_version 30372 (0.0008) +[2023-10-09 05:23:01,843][60144] Updated weights for policy 1, policy_version 30382 (0.0009) +[2023-10-09 05:23:02,216][60144] Updated weights for policy 1, policy_version 30392 (0.0007) +[2023-10-09 05:23:03,799][60143] Updated weights for policy 0, policy_version 30052 (0.0007) +[2023-10-09 05:23:04,169][60143] Updated weights for policy 0, policy_version 30062 (0.0009) +[2023-10-09 05:23:04,542][60143] Updated weights for policy 0, policy_version 30072 (0.0008) +[2023-10-09 05:23:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 61931520. Throughput: 0: 1723.0, 1: 1711.4. Samples: 15488162. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:23:06,052][59242] Avg episode reward: [(0, '25.740'), (1, '24.260')] +[2023-10-09 05:23:06,127][60144] Updated weights for policy 1, policy_version 30402 (0.0008) +[2023-10-09 05:23:06,500][60144] Updated weights for policy 1, policy_version 30412 (0.0007) +[2023-10-09 05:23:06,866][60144] Updated weights for policy 1, policy_version 30422 (0.0008) +[2023-10-09 05:23:07,233][60144] Updated weights for policy 1, policy_version 30432 (0.0007) +[2023-10-09 05:23:08,339][60143] Updated weights for policy 0, policy_version 30082 (0.0007) +[2023-10-09 05:23:08,714][60143] Updated weights for policy 0, policy_version 30092 (0.0007) +[2023-10-09 05:23:09,075][60143] Updated weights for policy 0, policy_version 30102 (0.0009) +[2023-10-09 05:23:09,459][60143] Updated weights for policy 0, policy_version 30112 (0.0011) +[2023-10-09 05:23:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 61997056. Throughput: 0: 1693.6, 1: 1739.3. Samples: 15508192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:23:11,052][59242] Avg episode reward: [(0, '26.620'), (1, '24.620')] +[2023-10-09 05:23:11,178][60144] Updated weights for policy 1, policy_version 30442 (0.0009) +[2023-10-09 05:23:11,555][60144] Updated weights for policy 1, policy_version 30452 (0.0007) +[2023-10-09 05:23:11,922][60144] Updated weights for policy 1, policy_version 30462 (0.0008) +[2023-10-09 05:23:13,503][60143] Updated weights for policy 0, policy_version 30122 (0.0008) +[2023-10-09 05:23:13,866][60143] Updated weights for policy 0, policy_version 30132 (0.0008) +[2023-10-09 05:23:14,236][60143] Updated weights for policy 0, policy_version 30142 (0.0009) +[2023-10-09 05:23:15,793][60144] Updated weights for policy 1, policy_version 30472 (0.0009) +[2023-10-09 05:23:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62062592. 
Throughput: 0: 1702.1, 1: 1738.6. Samples: 15529206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:23:16,053][59242] Avg episode reward: [(0, '26.540'), (1, '24.590')] +[2023-10-09 05:23:16,155][60144] Updated weights for policy 1, policy_version 30482 (0.0008) +[2023-10-09 05:23:16,523][60144] Updated weights for policy 1, policy_version 30492 (0.0009) +[2023-10-09 05:23:18,367][60143] Updated weights for policy 0, policy_version 30152 (0.0010) +[2023-10-09 05:23:18,743][60143] Updated weights for policy 0, policy_version 30162 (0.0009) +[2023-10-09 05:23:19,114][60143] Updated weights for policy 0, policy_version 30172 (0.0007) +[2023-10-09 05:23:20,463][60144] Updated weights for policy 1, policy_version 30502 (0.0008) +[2023-10-09 05:23:20,827][60144] Updated weights for policy 1, policy_version 30512 (0.0007) +[2023-10-09 05:23:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62128128. Throughput: 0: 1710.5, 1: 1728.8. Samples: 15539346. Policy #0 lag: (min: 5.0, avg: 7.0, max: 36.0) +[2023-10-09 05:23:21,052][59242] Avg episode reward: [(0, '27.790'), (1, '25.150')] +[2023-10-09 05:23:21,193][60144] Updated weights for policy 1, policy_version 30522 (0.0007) +[2023-10-09 05:23:23,097][60143] Updated weights for policy 0, policy_version 30182 (0.0007) +[2023-10-09 05:23:23,467][60143] Updated weights for policy 0, policy_version 30192 (0.0007) +[2023-10-09 05:23:23,838][60143] Updated weights for policy 0, policy_version 30202 (0.0008) +[2023-10-09 05:23:25,164][60144] Updated weights for policy 1, policy_version 30532 (0.0007) +[2023-10-09 05:23:25,521][60144] Updated weights for policy 1, policy_version 30542 (0.0008) +[2023-10-09 05:23:25,887][60144] Updated weights for policy 1, policy_version 30552 (0.0009) +[2023-10-09 05:23:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62193664. Throughput: 0: 1687.4, 1: 1740.9. Samples: 15559860. 
Policy #0 lag: (min: 5.0, avg: 7.0, max: 36.0) +[2023-10-09 05:23:26,052][59242] Avg episode reward: [(0, '28.030'), (1, '25.420')] +[2023-10-09 05:23:27,790][60143] Updated weights for policy 0, policy_version 30212 (0.0010) +[2023-10-09 05:23:28,160][60143] Updated weights for policy 0, policy_version 30222 (0.0008) +[2023-10-09 05:23:28,531][60143] Updated weights for policy 0, policy_version 30232 (0.0009) +[2023-10-09 05:23:29,825][60144] Updated weights for policy 1, policy_version 30562 (0.0009) +[2023-10-09 05:23:30,199][60144] Updated weights for policy 1, policy_version 30572 (0.0010) +[2023-10-09 05:23:30,566][60144] Updated weights for policy 1, policy_version 30582 (0.0008) +[2023-10-09 05:23:30,932][60144] Updated weights for policy 1, policy_version 30592 (0.0008) +[2023-10-09 05:23:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 62291968. Throughput: 0: 1714.4, 1: 1724.7. Samples: 15580384. Policy #0 lag: (min: 5.0, avg: 7.0, max: 36.0) +[2023-10-09 05:23:31,053][59242] Avg episode reward: [(0, '28.500'), (1, '25.040')] +[2023-10-09 05:23:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000030240_30965760.pth... +[2023-10-09 05:23:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000030592_31326208.pth... 
+[2023-10-09 05:23:31,093][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000028640_29327360.pth +[2023-10-09 05:23:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000028960_29655040.pth +[2023-10-09 05:23:32,490][60143] Updated weights for policy 0, policy_version 30242 (0.0009) +[2023-10-09 05:23:32,867][60143] Updated weights for policy 0, policy_version 30252 (0.0009) +[2023-10-09 05:23:33,231][60143] Updated weights for policy 0, policy_version 30262 (0.0010) +[2023-10-09 05:23:33,600][60143] Updated weights for policy 0, policy_version 30272 (0.0010) +[2023-10-09 05:23:34,715][60144] Updated weights for policy 1, policy_version 30602 (0.0007) +[2023-10-09 05:23:35,096][60144] Updated weights for policy 1, policy_version 30612 (0.0008) +[2023-10-09 05:23:35,456][60144] Updated weights for policy 1, policy_version 30622 (0.0010) +[2023-10-09 05:23:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 62357504. Throughput: 0: 1688.6, 1: 1745.1. Samples: 15590818. Policy #0 lag: (min: 5.0, avg: 7.0, max: 36.0) +[2023-10-09 05:23:36,053][59242] Avg episode reward: [(0, '28.720'), (1, '25.430')] +[2023-10-09 05:23:37,590][60143] Updated weights for policy 0, policy_version 30282 (0.0007) +[2023-10-09 05:23:37,955][60143] Updated weights for policy 0, policy_version 30292 (0.0008) +[2023-10-09 05:23:38,323][60143] Updated weights for policy 0, policy_version 30302 (0.0009) +[2023-10-09 05:23:39,390][60144] Updated weights for policy 1, policy_version 30632 (0.0010) +[2023-10-09 05:23:39,763][60144] Updated weights for policy 1, policy_version 30642 (0.0009) +[2023-10-09 05:23:40,118][60144] Updated weights for policy 1, policy_version 30652 (0.0010) +[2023-10-09 05:23:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 62423040. Throughput: 0: 1701.0, 1: 1735.1. Samples: 15611456. 
Policy #0 lag: (min: 5.0, avg: 7.0, max: 36.0) +[2023-10-09 05:23:41,053][59242] Avg episode reward: [(0, '27.440'), (1, '26.040')] +[2023-10-09 05:23:42,366][60143] Updated weights for policy 0, policy_version 30312 (0.0009) +[2023-10-09 05:23:42,745][60143] Updated weights for policy 0, policy_version 30322 (0.0008) +[2023-10-09 05:23:43,118][60143] Updated weights for policy 0, policy_version 30332 (0.0009) +[2023-10-09 05:23:43,843][60144] Updated weights for policy 1, policy_version 30662 (0.0008) +[2023-10-09 05:23:44,215][60144] Updated weights for policy 1, policy_version 30672 (0.0008) +[2023-10-09 05:23:44,583][60144] Updated weights for policy 1, policy_version 30682 (0.0008) +[2023-10-09 05:23:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 62488576. Throughput: 0: 1714.7, 1: 1721.1. Samples: 15632078. Policy #0 lag: (min: 5.0, avg: 12.6, max: 37.0) +[2023-10-09 05:23:46,053][59242] Avg episode reward: [(0, '26.530'), (1, '26.280')] +[2023-10-09 05:23:47,223][60143] Updated weights for policy 0, policy_version 30342 (0.0008) +[2023-10-09 05:23:47,593][60143] Updated weights for policy 0, policy_version 30352 (0.0010) +[2023-10-09 05:23:47,972][60143] Updated weights for policy 0, policy_version 30362 (0.0009) +[2023-10-09 05:23:48,641][60144] Updated weights for policy 1, policy_version 30692 (0.0008) +[2023-10-09 05:23:49,004][60144] Updated weights for policy 1, policy_version 30702 (0.0009) +[2023-10-09 05:23:49,367][60144] Updated weights for policy 1, policy_version 30712 (0.0009) +[2023-10-09 05:23:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62554112. Throughput: 0: 1681.3, 1: 1744.3. Samples: 15642314. 
Policy #0 lag: (min: 5.0, avg: 12.6, max: 37.0) +[2023-10-09 05:23:51,053][59242] Avg episode reward: [(0, '26.350'), (1, '28.400')] +[2023-10-09 05:23:52,014][60143] Updated weights for policy 0, policy_version 30372 (0.0008) +[2023-10-09 05:23:52,390][60143] Updated weights for policy 0, policy_version 30382 (0.0009) +[2023-10-09 05:23:52,760][60143] Updated weights for policy 0, policy_version 30392 (0.0009) +[2023-10-09 05:23:53,109][60144] Updated weights for policy 1, policy_version 30722 (0.0009) +[2023-10-09 05:23:53,474][60144] Updated weights for policy 1, policy_version 30732 (0.0010) +[2023-10-09 05:23:53,842][60144] Updated weights for policy 1, policy_version 30742 (0.0009) +[2023-10-09 05:23:54,209][60144] Updated weights for policy 1, policy_version 30752 (0.0008) +[2023-10-09 05:23:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 62619648. Throughput: 0: 1712.4, 1: 1720.6. Samples: 15662674. Policy #0 lag: (min: 5.0, avg: 12.6, max: 37.0) +[2023-10-09 05:23:56,053][59242] Avg episode reward: [(0, '27.590'), (1, '28.770')] +[2023-10-09 05:23:56,694][60143] Updated weights for policy 0, policy_version 30402 (0.0009) +[2023-10-09 05:23:57,066][60143] Updated weights for policy 0, policy_version 30412 (0.0008) +[2023-10-09 05:23:57,432][60143] Updated weights for policy 0, policy_version 30422 (0.0007) +[2023-10-09 05:23:57,801][60143] Updated weights for policy 0, policy_version 30432 (0.0007) +[2023-10-09 05:23:58,059][60144] Updated weights for policy 1, policy_version 30762 (0.0007) +[2023-10-09 05:23:58,423][60144] Updated weights for policy 1, policy_version 30772 (0.0008) +[2023-10-09 05:23:58,787][60144] Updated weights for policy 1, policy_version 30782 (0.0009) +[2023-10-09 05:24:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62685184. Throughput: 0: 1718.4, 1: 1726.1. Samples: 15684212. 
Policy #0 lag: (min: 5.0, avg: 12.6, max: 37.0) +[2023-10-09 05:24:01,053][59242] Avg episode reward: [(0, '27.510'), (1, '27.680')] +[2023-10-09 05:24:01,510][60143] Updated weights for policy 0, policy_version 30442 (0.0009) +[2023-10-09 05:24:01,883][60143] Updated weights for policy 0, policy_version 30452 (0.0008) +[2023-10-09 05:24:02,249][60143] Updated weights for policy 0, policy_version 30462 (0.0008) +[2023-10-09 05:24:02,853][60144] Updated weights for policy 1, policy_version 30792 (0.0010) +[2023-10-09 05:24:03,225][60144] Updated weights for policy 1, policy_version 30802 (0.0009) +[2023-10-09 05:24:03,600][60144] Updated weights for policy 1, policy_version 30812 (0.0009) +[2023-10-09 05:24:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62750720. Throughput: 0: 1702.0, 1: 1729.9. Samples: 15693780. Policy #0 lag: (min: 5.0, avg: 12.6, max: 37.0) +[2023-10-09 05:24:06,053][59242] Avg episode reward: [(0, '27.580'), (1, '27.630')] +[2023-10-09 05:24:06,381][60143] Updated weights for policy 0, policy_version 30472 (0.0008) +[2023-10-09 05:24:06,750][60143] Updated weights for policy 0, policy_version 30482 (0.0007) +[2023-10-09 05:24:07,119][60143] Updated weights for policy 0, policy_version 30492 (0.0008) +[2023-10-09 05:24:07,758][60144] Updated weights for policy 1, policy_version 30822 (0.0011) +[2023-10-09 05:24:08,134][60144] Updated weights for policy 1, policy_version 30832 (0.0010) +[2023-10-09 05:24:08,508][60144] Updated weights for policy 1, policy_version 30842 (0.0008) +[2023-10-09 05:24:11,004][60143] Updated weights for policy 0, policy_version 30502 (0.0007) +[2023-10-09 05:24:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62816256. Throughput: 0: 1719.0, 1: 1716.8. Samples: 15714472. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:24:11,053][59242] Avg episode reward: [(0, '27.440'), (1, '27.940')] +[2023-10-09 05:24:11,386][60143] Updated weights for policy 0, policy_version 30512 (0.0007) +[2023-10-09 05:24:11,745][60143] Updated weights for policy 0, policy_version 30522 (0.0008) +[2023-10-09 05:24:12,449][60144] Updated weights for policy 1, policy_version 30852 (0.0008) +[2023-10-09 05:24:12,822][60144] Updated weights for policy 1, policy_version 30862 (0.0008) +[2023-10-09 05:24:13,181][60144] Updated weights for policy 1, policy_version 30872 (0.0008) +[2023-10-09 05:24:15,576][60143] Updated weights for policy 0, policy_version 30532 (0.0009) +[2023-10-09 05:24:15,946][60143] Updated weights for policy 0, policy_version 30542 (0.0009) +[2023-10-09 05:24:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62881792. Throughput: 0: 1714.7, 1: 1736.6. Samples: 15735694. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:24:16,053][59242] Avg episode reward: [(0, '27.300'), (1, '27.780')] +[2023-10-09 05:24:16,317][60143] Updated weights for policy 0, policy_version 30552 (0.0008) +[2023-10-09 05:24:17,154][60144] Updated weights for policy 1, policy_version 30882 (0.0007) +[2023-10-09 05:24:17,565][60144] Updated weights for policy 1, policy_version 30892 (0.0008) +[2023-10-09 05:24:17,932][60144] Updated weights for policy 1, policy_version 30902 (0.0008) +[2023-10-09 05:24:18,301][60144] Updated weights for policy 1, policy_version 30912 (0.0009) +[2023-10-09 05:24:20,339][60143] Updated weights for policy 0, policy_version 30562 (0.0011) +[2023-10-09 05:24:20,709][60143] Updated weights for policy 0, policy_version 30572 (0.0008) +[2023-10-09 05:24:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 62947328. Throughput: 0: 1712.4, 1: 1709.5. Samples: 15744800. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:24:21,052][59242] Avg episode reward: [(0, '28.500'), (1, '27.770')] +[2023-10-09 05:24:21,082][60143] Updated weights for policy 0, policy_version 30582 (0.0009) +[2023-10-09 05:24:21,451][60143] Updated weights for policy 0, policy_version 30592 (0.0008) +[2023-10-09 05:24:22,325][60144] Updated weights for policy 1, policy_version 30922 (0.0007) +[2023-10-09 05:24:22,692][60144] Updated weights for policy 1, policy_version 30932 (0.0007) +[2023-10-09 05:24:23,057][60144] Updated weights for policy 1, policy_version 30942 (0.0007) +[2023-10-09 05:24:25,509][60143] Updated weights for policy 0, policy_version 30602 (0.0010) +[2023-10-09 05:24:25,878][60143] Updated weights for policy 0, policy_version 30612 (0.0008) +[2023-10-09 05:24:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 63012864. Throughput: 0: 1718.5, 1: 1723.7. Samples: 15766354. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:24:26,053][59242] Avg episode reward: [(0, '29.250'), (1, '29.030')] +[2023-10-09 05:24:26,255][60143] Updated weights for policy 0, policy_version 30622 (0.0008) +[2023-10-09 05:24:26,869][60144] Updated weights for policy 1, policy_version 30952 (0.0007) +[2023-10-09 05:24:27,233][60144] Updated weights for policy 1, policy_version 30962 (0.0009) +[2023-10-09 05:24:27,607][60144] Updated weights for policy 1, policy_version 30972 (0.0008) +[2023-10-09 05:24:30,300][60143] Updated weights for policy 0, policy_version 30632 (0.0008) +[2023-10-09 05:24:30,671][60143] Updated weights for policy 0, policy_version 30642 (0.0009) +[2023-10-09 05:24:31,038][60143] Updated weights for policy 0, policy_version 30652 (0.0008) +[2023-10-09 05:24:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 63078400. Throughput: 0: 1708.4, 1: 1735.9. Samples: 15787072. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 05:24:31,053][59242] Avg episode reward: [(0, '26.840'), (1, '30.050')] +[2023-10-09 05:24:31,065][60003] Saving new best policy, reward=30.050! +[2023-10-09 05:24:31,625][60144] Updated weights for policy 1, policy_version 30982 (0.0009) +[2023-10-09 05:24:31,992][60144] Updated weights for policy 1, policy_version 30992 (0.0009) +[2023-10-09 05:24:32,354][60144] Updated weights for policy 1, policy_version 31002 (0.0009) +[2023-10-09 05:24:35,118][60143] Updated weights for policy 0, policy_version 30662 (0.0011) +[2023-10-09 05:24:35,483][60143] Updated weights for policy 0, policy_version 30672 (0.0009) +[2023-10-09 05:24:35,861][60143] Updated weights for policy 0, policy_version 30682 (0.0009) +[2023-10-09 05:24:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 63143936. Throughput: 0: 1724.3, 1: 1709.9. Samples: 15796852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:24:36,053][59242] Avg episode reward: [(0, '26.920'), (1, '29.990')] +[2023-10-09 05:24:36,311][60144] Updated weights for policy 1, policy_version 31012 (0.0007) +[2023-10-09 05:24:36,682][60144] Updated weights for policy 1, policy_version 31022 (0.0008) +[2023-10-09 05:24:37,057][60144] Updated weights for policy 1, policy_version 31032 (0.0007) +[2023-10-09 05:24:39,951][60143] Updated weights for policy 0, policy_version 30692 (0.0008) +[2023-10-09 05:24:40,328][60143] Updated weights for policy 0, policy_version 30702 (0.0009) +[2023-10-09 05:24:40,688][60143] Updated weights for policy 0, policy_version 30712 (0.0009) +[2023-10-09 05:24:40,837][60144] Updated weights for policy 1, policy_version 31042 (0.0010) +[2023-10-09 05:24:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 63242240. Throughput: 0: 1716.6, 1: 1737.1. Samples: 15818090. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:24:41,053][59242] Avg episode reward: [(0, '25.990'), (1, '30.320')] +[2023-10-09 05:24:41,201][60144] Updated weights for policy 1, policy_version 31052 (0.0008) +[2023-10-09 05:24:41,568][60144] Updated weights for policy 1, policy_version 31062 (0.0009) +[2023-10-09 05:24:41,934][60003] Saving new best policy, reward=30.320! +[2023-10-09 05:24:41,938][60144] Updated weights for policy 1, policy_version 31072 (0.0008) +[2023-10-09 05:24:44,542][60143] Updated weights for policy 0, policy_version 30722 (0.0009) +[2023-10-09 05:24:44,906][60143] Updated weights for policy 0, policy_version 30732 (0.0011) +[2023-10-09 05:24:45,274][60143] Updated weights for policy 0, policy_version 30742 (0.0008) +[2023-10-09 05:24:45,647][60143] Updated weights for policy 0, policy_version 30752 (0.0008) +[2023-10-09 05:24:45,845][60144] Updated weights for policy 1, policy_version 31082 (0.0007) +[2023-10-09 05:24:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 63307776. Throughput: 0: 1690.2, 1: 1733.4. Samples: 15838274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:24:46,053][59242] Avg episode reward: [(0, '25.730'), (1, '30.830')] +[2023-10-09 05:24:46,211][60144] Updated weights for policy 1, policy_version 31092 (0.0008) +[2023-10-09 05:24:46,572][60144] Updated weights for policy 1, policy_version 31102 (0.0007) +[2023-10-09 05:24:46,649][60003] Saving new best policy, reward=30.830! 
+[2023-10-09 05:24:49,467][60143] Updated weights for policy 0, policy_version 30762 (0.0008) +[2023-10-09 05:24:49,839][60143] Updated weights for policy 0, policy_version 30772 (0.0008) +[2023-10-09 05:24:50,201][60143] Updated weights for policy 0, policy_version 30782 (0.0009) +[2023-10-09 05:24:50,519][60144] Updated weights for policy 1, policy_version 31112 (0.0009) +[2023-10-09 05:24:50,886][60144] Updated weights for policy 1, policy_version 31122 (0.0010) +[2023-10-09 05:24:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 63373312. Throughput: 0: 1717.9, 1: 1729.8. Samples: 15848924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:24:51,052][59242] Avg episode reward: [(0, '26.880'), (1, '30.570')] +[2023-10-09 05:24:51,248][60144] Updated weights for policy 1, policy_version 31132 (0.0011) +[2023-10-09 05:24:54,246][60143] Updated weights for policy 0, policy_version 30792 (0.0009) +[2023-10-09 05:24:54,615][60143] Updated weights for policy 0, policy_version 30802 (0.0011) +[2023-10-09 05:24:54,982][60143] Updated weights for policy 0, policy_version 30812 (0.0009) +[2023-10-09 05:24:55,356][60144] Updated weights for policy 1, policy_version 31142 (0.0009) +[2023-10-09 05:24:55,715][60144] Updated weights for policy 1, policy_version 31152 (0.0008) +[2023-10-09 05:24:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 63438848. Throughput: 0: 1705.4, 1: 1737.1. Samples: 15869384. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 23.0) +[2023-10-09 05:24:56,053][59242] Avg episode reward: [(0, '26.120'), (1, '30.780')] +[2023-10-09 05:24:56,092][60144] Updated weights for policy 1, policy_version 31162 (0.0010) +[2023-10-09 05:24:58,884][60143] Updated weights for policy 0, policy_version 30822 (0.0010) +[2023-10-09 05:24:59,247][60143] Updated weights for policy 0, policy_version 30832 (0.0009) +[2023-10-09 05:24:59,630][60143] Updated weights for policy 0, policy_version 30842 (0.0010) +[2023-10-09 05:24:59,974][60144] Updated weights for policy 1, policy_version 31172 (0.0009) +[2023-10-09 05:25:00,335][60144] Updated weights for policy 1, policy_version 31182 (0.0007) +[2023-10-09 05:25:00,704][60144] Updated weights for policy 1, policy_version 31192 (0.0007) +[2023-10-09 05:25:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 63537152. Throughput: 0: 1688.8, 1: 1724.8. Samples: 15889306. Policy #0 lag: (min: 5.0, avg: 5.7, max: 23.0) +[2023-10-09 05:25:01,053][59242] Avg episode reward: [(0, '26.480'), (1, '31.360')] +[2023-10-09 05:25:01,063][60003] Saving new best policy, reward=31.360! +[2023-10-09 05:25:03,791][60143] Updated weights for policy 0, policy_version 30852 (0.0009) +[2023-10-09 05:25:04,159][60143] Updated weights for policy 0, policy_version 30862 (0.0010) +[2023-10-09 05:25:04,531][60143] Updated weights for policy 0, policy_version 30872 (0.0009) +[2023-10-09 05:25:04,705][60144] Updated weights for policy 1, policy_version 31202 (0.0008) +[2023-10-09 05:25:05,124][60144] Updated weights for policy 1, policy_version 31212 (0.0008) +[2023-10-09 05:25:05,480][60144] Updated weights for policy 1, policy_version 31222 (0.0008) +[2023-10-09 05:25:05,856][60144] Updated weights for policy 1, policy_version 31232 (0.0008) +[2023-10-09 05:25:06,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 63602688. Throughput: 0: 1720.6, 1: 1748.7. 
Samples: 15900918. Policy #0 lag: (min: 5.0, avg: 5.7, max: 23.0) +[2023-10-09 05:25:06,053][59242] Avg episode reward: [(0, '26.070'), (1, '29.920')] +[2023-10-09 05:25:08,456][60143] Updated weights for policy 0, policy_version 30882 (0.0008) +[2023-10-09 05:25:08,823][60143] Updated weights for policy 0, policy_version 30892 (0.0008) +[2023-10-09 05:25:09,197][60143] Updated weights for policy 0, policy_version 30902 (0.0007) +[2023-10-09 05:25:09,563][60143] Updated weights for policy 0, policy_version 30912 (0.0007) +[2023-10-09 05:25:09,675][60144] Updated weights for policy 1, policy_version 31242 (0.0007) +[2023-10-09 05:25:10,042][60144] Updated weights for policy 1, policy_version 31252 (0.0007) +[2023-10-09 05:25:10,406][60144] Updated weights for policy 1, policy_version 31262 (0.0008) +[2023-10-09 05:25:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 63668224. Throughput: 0: 1692.3, 1: 1734.5. Samples: 15920558. Policy #0 lag: (min: 5.0, avg: 5.7, max: 23.0) +[2023-10-09 05:25:11,053][59242] Avg episode reward: [(0, '27.740'), (1, '30.300')] +[2023-10-09 05:25:13,438][60143] Updated weights for policy 0, policy_version 30922 (0.0011) +[2023-10-09 05:25:13,816][60143] Updated weights for policy 0, policy_version 30932 (0.0010) +[2023-10-09 05:25:14,182][60143] Updated weights for policy 0, policy_version 30942 (0.0009) +[2023-10-09 05:25:14,400][60144] Updated weights for policy 1, policy_version 31272 (0.0008) +[2023-10-09 05:25:14,767][60144] Updated weights for policy 1, policy_version 31282 (0.0007) +[2023-10-09 05:25:15,140][60144] Updated weights for policy 1, policy_version 31292 (0.0008) +[2023-10-09 05:25:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 63733760. Throughput: 0: 1699.5, 1: 1712.7. Samples: 15940622. 
Policy #0 lag: (min: 5.0, avg: 5.7, max: 23.0) +[2023-10-09 05:25:16,053][59242] Avg episode reward: [(0, '27.840'), (1, '29.920')] +[2023-10-09 05:25:18,262][60143] Updated weights for policy 0, policy_version 30952 (0.0008) +[2023-10-09 05:25:18,634][60143] Updated weights for policy 0, policy_version 30962 (0.0008) +[2023-10-09 05:25:18,993][60143] Updated weights for policy 0, policy_version 30972 (0.0008) +[2023-10-09 05:25:19,150][60144] Updated weights for policy 1, policy_version 31302 (0.0008) +[2023-10-09 05:25:19,511][60144] Updated weights for policy 1, policy_version 31312 (0.0008) +[2023-10-09 05:25:19,877][60144] Updated weights for policy 1, policy_version 31322 (0.0008) +[2023-10-09 05:25:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 63799296. Throughput: 0: 1703.9, 1: 1742.6. Samples: 15951948. Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-09 05:25:21,053][59242] Avg episode reward: [(0, '26.930'), (1, '29.370')] +[2023-10-09 05:25:22,935][60143] Updated weights for policy 0, policy_version 30982 (0.0009) +[2023-10-09 05:25:23,312][60143] Updated weights for policy 0, policy_version 30992 (0.0009) +[2023-10-09 05:25:23,671][60143] Updated weights for policy 0, policy_version 31002 (0.0008) +[2023-10-09 05:25:23,737][60144] Updated weights for policy 1, policy_version 31332 (0.0008) +[2023-10-09 05:25:24,105][60144] Updated weights for policy 1, policy_version 31342 (0.0009) +[2023-10-09 05:25:24,471][60144] Updated weights for policy 1, policy_version 31352 (0.0010) +[2023-10-09 05:25:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 63864832. Throughput: 0: 1691.9, 1: 1715.7. Samples: 15971434. 
Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-09 05:25:26,053][59242] Avg episode reward: [(0, '27.580'), (1, '29.380')] +[2023-10-09 05:25:27,533][60143] Updated weights for policy 0, policy_version 31012 (0.0008) +[2023-10-09 05:25:27,910][60143] Updated weights for policy 0, policy_version 31022 (0.0007) +[2023-10-09 05:25:28,288][60143] Updated weights for policy 0, policy_version 31032 (0.0009) +[2023-10-09 05:25:28,293][60144] Updated weights for policy 1, policy_version 31362 (0.0010) +[2023-10-09 05:25:28,659][60144] Updated weights for policy 1, policy_version 31372 (0.0010) +[2023-10-09 05:25:29,030][60144] Updated weights for policy 1, policy_version 31382 (0.0009) +[2023-10-09 05:25:29,394][60144] Updated weights for policy 1, policy_version 31392 (0.0010) +[2023-10-09 05:25:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 63930368. Throughput: 0: 1714.7, 1: 1711.9. Samples: 15992470. Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-09 05:25:31,053][59242] Avg episode reward: [(0, '27.580'), (1, '29.080')] +[2023-10-09 05:25:31,058][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000031392_32145408.pth... +[2023-10-09 05:25:31,059][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000031040_31784960.pth... 
+[2023-10-09 05:25:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000029760_30474240.pth +[2023-10-09 05:25:31,105][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000029440_30146560.pth +[2023-10-09 05:25:32,358][60143] Updated weights for policy 0, policy_version 31042 (0.0008) +[2023-10-09 05:25:32,731][60143] Updated weights for policy 0, policy_version 31052 (0.0009) +[2023-10-09 05:25:33,100][60143] Updated weights for policy 0, policy_version 31062 (0.0008) +[2023-10-09 05:25:33,235][60144] Updated weights for policy 1, policy_version 31402 (0.0008) +[2023-10-09 05:25:33,462][60143] Updated weights for policy 0, policy_version 31072 (0.0008) +[2023-10-09 05:25:33,606][60144] Updated weights for policy 1, policy_version 31412 (0.0008) +[2023-10-09 05:25:33,968][60144] Updated weights for policy 1, policy_version 31422 (0.0009) +[2023-10-09 05:25:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 63995904. Throughput: 0: 1683.6, 1: 1724.6. Samples: 16002294. Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-09 05:25:36,052][59242] Avg episode reward: [(0, '27.600'), (1, '28.680')] +[2023-10-09 05:25:37,484][60143] Updated weights for policy 0, policy_version 31082 (0.0008) +[2023-10-09 05:25:37,859][60143] Updated weights for policy 0, policy_version 31092 (0.0009) +[2023-10-09 05:25:37,994][60144] Updated weights for policy 1, policy_version 31432 (0.0008) +[2023-10-09 05:25:38,229][60143] Updated weights for policy 0, policy_version 31102 (0.0007) +[2023-10-09 05:25:38,359][60144] Updated weights for policy 1, policy_version 31442 (0.0007) +[2023-10-09 05:25:38,732][60144] Updated weights for policy 1, policy_version 31452 (0.0007) +[2023-10-09 05:25:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 64061440. Throughput: 0: 1701.7, 1: 1715.8. Samples: 16023170. 
Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-09 05:25:41,053][59242] Avg episode reward: [(0, '27.170'), (1, '28.770')] +[2023-10-09 05:25:42,336][60143] Updated weights for policy 0, policy_version 31112 (0.0010) +[2023-10-09 05:25:42,621][60144] Updated weights for policy 1, policy_version 31462 (0.0008) +[2023-10-09 05:25:42,705][60143] Updated weights for policy 0, policy_version 31122 (0.0010) +[2023-10-09 05:25:42,978][60144] Updated weights for policy 1, policy_version 31472 (0.0007) +[2023-10-09 05:25:43,070][60143] Updated weights for policy 0, policy_version 31132 (0.0008) +[2023-10-09 05:25:43,347][60144] Updated weights for policy 1, policy_version 31482 (0.0008) +[2023-10-09 05:25:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 64126976. Throughput: 0: 1711.1, 1: 1728.0. Samples: 16044064. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:25:46,053][59242] Avg episode reward: [(0, '28.690'), (1, '28.510')] +[2023-10-09 05:25:46,992][60143] Updated weights for policy 0, policy_version 31142 (0.0008) +[2023-10-09 05:25:47,256][60144] Updated weights for policy 1, policy_version 31492 (0.0010) +[2023-10-09 05:25:47,361][60143] Updated weights for policy 0, policy_version 31152 (0.0007) +[2023-10-09 05:25:47,621][60144] Updated weights for policy 1, policy_version 31502 (0.0008) +[2023-10-09 05:25:47,728][60143] Updated weights for policy 0, policy_version 31162 (0.0008) +[2023-10-09 05:25:47,989][60144] Updated weights for policy 1, policy_version 31512 (0.0009) +[2023-10-09 05:25:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 64192512. Throughput: 0: 1676.8, 1: 1711.6. Samples: 16053400. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:25:51,053][59242] Avg episode reward: [(0, '28.350'), (1, '28.220')] +[2023-10-09 05:25:51,935][60143] Updated weights for policy 0, policy_version 31172 (0.0009) +[2023-10-09 05:25:52,184][60144] Updated weights for policy 1, policy_version 31522 (0.0008) +[2023-10-09 05:25:52,305][60143] Updated weights for policy 0, policy_version 31182 (0.0009) +[2023-10-09 05:25:52,601][60144] Updated weights for policy 1, policy_version 31532 (0.0007) +[2023-10-09 05:25:52,681][60143] Updated weights for policy 0, policy_version 31192 (0.0009) +[2023-10-09 05:25:52,966][60144] Updated weights for policy 1, policy_version 31542 (0.0007) +[2023-10-09 05:25:53,334][60144] Updated weights for policy 1, policy_version 31552 (0.0008) +[2023-10-09 05:25:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 64258048. Throughput: 0: 1700.4, 1: 1719.1. Samples: 16074434. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:25:56,053][59242] Avg episode reward: [(0, '28.830'), (1, '29.490')] +[2023-10-09 05:25:56,535][60143] Updated weights for policy 0, policy_version 31202 (0.0007) +[2023-10-09 05:25:56,892][60143] Updated weights for policy 0, policy_version 31212 (0.0009) +[2023-10-09 05:25:56,990][60144] Updated weights for policy 1, policy_version 31562 (0.0007) +[2023-10-09 05:25:57,263][60143] Updated weights for policy 0, policy_version 31222 (0.0008) +[2023-10-09 05:25:57,358][60144] Updated weights for policy 1, policy_version 31572 (0.0008) +[2023-10-09 05:25:57,623][60143] Updated weights for policy 0, policy_version 31232 (0.0009) +[2023-10-09 05:25:57,724][60144] Updated weights for policy 1, policy_version 31582 (0.0007) +[2023-10-09 05:26:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 64323584. Throughput: 0: 1699.6, 1: 1746.7. Samples: 16095704. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:26:01,052][59242] Avg episode reward: [(0, '28.920'), (1, '29.270')] +[2023-10-09 05:26:01,620][60144] Updated weights for policy 1, policy_version 31592 (0.0007) +[2023-10-09 05:26:01,785][60143] Updated weights for policy 0, policy_version 31242 (0.0007) +[2023-10-09 05:26:01,991][60144] Updated weights for policy 1, policy_version 31602 (0.0007) +[2023-10-09 05:26:02,145][60143] Updated weights for policy 0, policy_version 31252 (0.0007) +[2023-10-09 05:26:02,348][60144] Updated weights for policy 1, policy_version 31612 (0.0007) +[2023-10-09 05:26:02,528][60143] Updated weights for policy 0, policy_version 31262 (0.0008) +[2023-10-09 05:26:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 64389120. Throughput: 0: 1684.1, 1: 1718.2. Samples: 16105052. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:26:06,053][59242] Avg episode reward: [(0, '29.590'), (1, '28.830')] +[2023-10-09 05:26:06,054][59934] Saving new best policy, reward=29.590! +[2023-10-09 05:26:06,251][60144] Updated weights for policy 1, policy_version 31622 (0.0008) +[2023-10-09 05:26:06,534][60143] Updated weights for policy 0, policy_version 31272 (0.0007) +[2023-10-09 05:26:06,613][60144] Updated weights for policy 1, policy_version 31632 (0.0007) +[2023-10-09 05:26:06,897][60143] Updated weights for policy 0, policy_version 31282 (0.0007) +[2023-10-09 05:26:06,986][60144] Updated weights for policy 1, policy_version 31642 (0.0008) +[2023-10-09 05:26:07,258][60143] Updated weights for policy 0, policy_version 31292 (0.0009) +[2023-10-09 05:26:10,895][60144] Updated weights for policy 1, policy_version 31652 (0.0008) +[2023-10-09 05:26:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 64454656. Throughput: 0: 1697.8, 1: 1740.8. Samples: 16126172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:26:11,052][59242] Avg episode reward: [(0, '29.640'), (1, '28.410')] +[2023-10-09 05:26:11,267][60144] Updated weights for policy 1, policy_version 31662 (0.0009) +[2023-10-09 05:26:11,326][60143] Updated weights for policy 0, policy_version 31302 (0.0010) +[2023-10-09 05:26:11,625][60144] Updated weights for policy 1, policy_version 31672 (0.0009) +[2023-10-09 05:26:11,691][60143] Updated weights for policy 0, policy_version 31312 (0.0008) +[2023-10-09 05:26:12,064][60143] Updated weights for policy 0, policy_version 31322 (0.0010) +[2023-10-09 05:26:12,280][59934] Saving new best policy, reward=29.640! +[2023-10-09 05:26:15,842][60144] Updated weights for policy 1, policy_version 31682 (0.0007) +[2023-10-09 05:26:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 64520192. Throughput: 0: 1704.0, 1: 1738.8. Samples: 16147396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:26:16,053][59242] Avg episode reward: [(0, '28.850'), (1, '26.900')] +[2023-10-09 05:26:16,099][60143] Updated weights for policy 0, policy_version 31332 (0.0010) +[2023-10-09 05:26:16,209][60144] Updated weights for policy 1, policy_version 31692 (0.0007) +[2023-10-09 05:26:16,465][60143] Updated weights for policy 0, policy_version 31342 (0.0009) +[2023-10-09 05:26:16,572][60144] Updated weights for policy 1, policy_version 31702 (0.0007) +[2023-10-09 05:26:16,832][60143] Updated weights for policy 0, policy_version 31352 (0.0009) +[2023-10-09 05:26:16,950][60144] Updated weights for policy 1, policy_version 31712 (0.0008) +[2023-10-09 05:26:20,880][60143] Updated weights for policy 0, policy_version 31362 (0.0008) +[2023-10-09 05:26:20,889][60144] Updated weights for policy 1, policy_version 31722 (0.0007) +[2023-10-09 05:26:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 64585728. Throughput: 0: 1701.8, 1: 1724.3. 
Samples: 16156470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:26:21,053][59242] Avg episode reward: [(0, '28.730'), (1, '26.880')] +[2023-10-09 05:26:21,241][60143] Updated weights for policy 0, policy_version 31372 (0.0009) +[2023-10-09 05:26:21,257][60144] Updated weights for policy 1, policy_version 31732 (0.0007) +[2023-10-09 05:26:21,619][60144] Updated weights for policy 1, policy_version 31742 (0.0008) +[2023-10-09 05:26:21,620][60143] Updated weights for policy 0, policy_version 31382 (0.0009) +[2023-10-09 05:26:21,988][60143] Updated weights for policy 0, policy_version 31392 (0.0008) +[2023-10-09 05:26:25,585][60144] Updated weights for policy 1, policy_version 31752 (0.0008) +[2023-10-09 05:26:25,950][60144] Updated weights for policy 1, policy_version 31762 (0.0007) +[2023-10-09 05:26:25,959][60143] Updated weights for policy 0, policy_version 31402 (0.0009) +[2023-10-09 05:26:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 64651264. Throughput: 0: 1697.7, 1: 1737.5. Samples: 16177754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:26:26,053][59242] Avg episode reward: [(0, '28.650'), (1, '26.610')] +[2023-10-09 05:26:26,320][60144] Updated weights for policy 1, policy_version 31772 (0.0007) +[2023-10-09 05:26:26,335][60143] Updated weights for policy 0, policy_version 31412 (0.0007) +[2023-10-09 05:26:26,694][60143] Updated weights for policy 0, policy_version 31422 (0.0008) +[2023-10-09 05:26:30,000][60144] Updated weights for policy 1, policy_version 31782 (0.0009) +[2023-10-09 05:26:30,365][60144] Updated weights for policy 1, policy_version 31792 (0.0010) +[2023-10-09 05:26:30,699][60143] Updated weights for policy 0, policy_version 31432 (0.0009) +[2023-10-09 05:26:30,732][60144] Updated weights for policy 1, policy_version 31802 (0.0008) +[2023-10-09 05:26:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 64749568. 
Throughput: 0: 1698.4, 1: 1720.4. Samples: 16197910. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:26:31,053][59242] Avg episode reward: [(0, '29.010'), (1, '26.100')] +[2023-10-09 05:26:31,079][60143] Updated weights for policy 0, policy_version 31442 (0.0008) +[2023-10-09 05:26:31,448][60143] Updated weights for policy 0, policy_version 31452 (0.0007) +[2023-10-09 05:26:34,627][60144] Updated weights for policy 1, policy_version 31812 (0.0008) +[2023-10-09 05:26:34,999][60144] Updated weights for policy 1, policy_version 31822 (0.0008) +[2023-10-09 05:26:35,311][60143] Updated weights for policy 0, policy_version 31462 (0.0008) +[2023-10-09 05:26:35,366][60144] Updated weights for policy 1, policy_version 31832 (0.0008) +[2023-10-09 05:26:35,676][60143] Updated weights for policy 0, policy_version 31472 (0.0008) +[2023-10-09 05:26:36,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 64815104. Throughput: 0: 1705.9, 1: 1742.1. Samples: 16208560. 
Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) +[2023-10-09 05:26:36,053][59242] Avg episode reward: [(0, '28.180'), (1, '26.860')] +[2023-10-09 05:26:36,056][60143] Updated weights for policy 0, policy_version 31482 (0.0007) +[2023-10-09 05:26:39,585][60144] Updated weights for policy 1, policy_version 31842 (0.0009) +[2023-10-09 05:26:40,010][60144] Updated weights for policy 1, policy_version 31852 (0.0008) +[2023-10-09 05:26:40,176][60143] Updated weights for policy 0, policy_version 31492 (0.0008) +[2023-10-09 05:26:40,369][60144] Updated weights for policy 1, policy_version 31862 (0.0008) +[2023-10-09 05:26:40,547][60143] Updated weights for policy 0, policy_version 31502 (0.0009) +[2023-10-09 05:26:40,729][60144] Updated weights for policy 1, policy_version 31872 (0.0007) +[2023-10-09 05:26:40,914][60143] Updated weights for policy 0, policy_version 31512 (0.0007) +[2023-10-09 05:26:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 64880640. Throughput: 0: 1705.7, 1: 1734.8. Samples: 16229256. Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) +[2023-10-09 05:26:41,053][59242] Avg episode reward: [(0, '27.450'), (1, '27.120')] +[2023-10-09 05:26:44,657][60144] Updated weights for policy 1, policy_version 31882 (0.0007) +[2023-10-09 05:26:44,955][60143] Updated weights for policy 0, policy_version 31522 (0.0007) +[2023-10-09 05:26:45,023][60144] Updated weights for policy 1, policy_version 31892 (0.0007) +[2023-10-09 05:26:45,329][60143] Updated weights for policy 0, policy_version 31532 (0.0008) +[2023-10-09 05:26:45,389][60144] Updated weights for policy 1, policy_version 31902 (0.0009) +[2023-10-09 05:26:45,693][60143] Updated weights for policy 0, policy_version 31542 (0.0007) +[2023-10-09 05:26:46,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 64978944. Throughput: 0: 1693.5, 1: 1701.2. Samples: 16248468. 
Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) +[2023-10-09 05:26:46,053][59242] Avg episode reward: [(0, '27.910'), (1, '27.260')] +[2023-10-09 05:26:46,054][60143] Updated weights for policy 0, policy_version 31552 (0.0007) +[2023-10-09 05:26:49,256][60144] Updated weights for policy 1, policy_version 31912 (0.0007) +[2023-10-09 05:26:49,613][60144] Updated weights for policy 1, policy_version 31922 (0.0008) +[2023-10-09 05:26:49,988][60144] Updated weights for policy 1, policy_version 31932 (0.0009) +[2023-10-09 05:26:50,159][60143] Updated weights for policy 0, policy_version 31562 (0.0008) +[2023-10-09 05:26:50,530][60143] Updated weights for policy 0, policy_version 31572 (0.0010) +[2023-10-09 05:26:50,898][60143] Updated weights for policy 0, policy_version 31582 (0.0010) +[2023-10-09 05:26:51,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 65044480. Throughput: 0: 1705.3, 1: 1731.5. Samples: 16259708. Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) +[2023-10-09 05:26:51,053][59242] Avg episode reward: [(0, '27.370'), (1, '25.820')] +[2023-10-09 05:26:53,866][60144] Updated weights for policy 1, policy_version 31942 (0.0009) +[2023-10-09 05:26:54,234][60144] Updated weights for policy 1, policy_version 31952 (0.0010) +[2023-10-09 05:26:54,594][60144] Updated weights for policy 1, policy_version 31962 (0.0008) +[2023-10-09 05:26:54,981][60143] Updated weights for policy 0, policy_version 31592 (0.0008) +[2023-10-09 05:26:55,350][60143] Updated weights for policy 0, policy_version 31602 (0.0008) +[2023-10-09 05:26:55,725][60143] Updated weights for policy 0, policy_version 31612 (0.0008) +[2023-10-09 05:26:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 65110016. Throughput: 0: 1706.5, 1: 1709.6. Samples: 16279898. 
Policy #0 lag: (min: 10.0, avg: 17.6, max: 42.0) +[2023-10-09 05:26:56,053][59242] Avg episode reward: [(0, '26.240'), (1, '26.400')] +[2023-10-09 05:26:58,518][60144] Updated weights for policy 1, policy_version 31972 (0.0007) +[2023-10-09 05:26:58,879][60144] Updated weights for policy 1, policy_version 31982 (0.0007) +[2023-10-09 05:26:59,242][60144] Updated weights for policy 1, policy_version 31992 (0.0008) +[2023-10-09 05:26:59,647][60143] Updated weights for policy 0, policy_version 31622 (0.0009) +[2023-10-09 05:27:00,024][60143] Updated weights for policy 0, policy_version 31632 (0.0007) +[2023-10-09 05:27:00,384][60143] Updated weights for policy 0, policy_version 31642 (0.0009) +[2023-10-09 05:27:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 65175552. Throughput: 0: 1679.4, 1: 1714.2. Samples: 16300106. Policy #0 lag: (min: 10.0, avg: 17.6, max: 42.0) +[2023-10-09 05:27:01,053][59242] Avg episode reward: [(0, '25.460'), (1, '26.620')] +[2023-10-09 05:27:03,216][60144] Updated weights for policy 1, policy_version 32002 (0.0008) +[2023-10-09 05:27:03,590][60144] Updated weights for policy 1, policy_version 32012 (0.0008) +[2023-10-09 05:27:03,961][60144] Updated weights for policy 1, policy_version 32022 (0.0009) +[2023-10-09 05:27:04,326][60144] Updated weights for policy 1, policy_version 32032 (0.0011) +[2023-10-09 05:27:04,326][60143] Updated weights for policy 0, policy_version 31652 (0.0008) +[2023-10-09 05:27:04,701][60143] Updated weights for policy 0, policy_version 31662 (0.0009) +[2023-10-09 05:27:05,064][60143] Updated weights for policy 0, policy_version 31672 (0.0008) +[2023-10-09 05:27:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 65241088. Throughput: 0: 1709.0, 1: 1730.4. Samples: 16311246. 
Policy #0 lag: (min: 10.0, avg: 17.6, max: 42.0) +[2023-10-09 05:27:06,053][59242] Avg episode reward: [(0, '25.920'), (1, '25.970')] +[2023-10-09 05:27:08,208][60144] Updated weights for policy 1, policy_version 32042 (0.0010) +[2023-10-09 05:27:08,576][60144] Updated weights for policy 1, policy_version 32052 (0.0010) +[2023-10-09 05:27:08,945][60144] Updated weights for policy 1, policy_version 32062 (0.0008) +[2023-10-09 05:27:09,069][60143] Updated weights for policy 0, policy_version 31682 (0.0009) +[2023-10-09 05:27:09,452][60143] Updated weights for policy 0, policy_version 31692 (0.0010) +[2023-10-09 05:27:09,811][60143] Updated weights for policy 0, policy_version 31702 (0.0010) +[2023-10-09 05:27:10,178][60143] Updated weights for policy 0, policy_version 31712 (0.0009) +[2023-10-09 05:27:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 65306624. Throughput: 0: 1698.8, 1: 1714.0. Samples: 16331330. Policy #0 lag: (min: 10.0, avg: 17.6, max: 42.0) +[2023-10-09 05:27:11,052][59242] Avg episode reward: [(0, '25.280'), (1, '25.830')] +[2023-10-09 05:27:12,761][60144] Updated weights for policy 1, policy_version 32072 (0.0008) +[2023-10-09 05:27:13,123][60144] Updated weights for policy 1, policy_version 32082 (0.0007) +[2023-10-09 05:27:13,499][60144] Updated weights for policy 1, policy_version 32092 (0.0008) +[2023-10-09 05:27:14,205][60143] Updated weights for policy 0, policy_version 31722 (0.0009) +[2023-10-09 05:27:14,567][60143] Updated weights for policy 0, policy_version 31732 (0.0007) +[2023-10-09 05:27:14,934][60143] Updated weights for policy 0, policy_version 31742 (0.0007) +[2023-10-09 05:27:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 65372160. Throughput: 0: 1689.6, 1: 1734.0. Samples: 16351976. 
Policy #0 lag: (min: 10.0, avg: 17.6, max: 42.0) +[2023-10-09 05:27:16,053][59242] Avg episode reward: [(0, '24.450'), (1, '25.480')] +[2023-10-09 05:27:17,566][60144] Updated weights for policy 1, policy_version 32102 (0.0009) +[2023-10-09 05:27:17,926][60144] Updated weights for policy 1, policy_version 32112 (0.0008) +[2023-10-09 05:27:18,291][60144] Updated weights for policy 1, policy_version 32122 (0.0007) +[2023-10-09 05:27:18,928][60143] Updated weights for policy 0, policy_version 31752 (0.0008) +[2023-10-09 05:27:19,312][60143] Updated weights for policy 0, policy_version 31762 (0.0008) +[2023-10-09 05:27:19,679][60143] Updated weights for policy 0, policy_version 31772 (0.0007) +[2023-10-09 05:27:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 65437696. Throughput: 0: 1715.2, 1: 1708.5. Samples: 16362630. Policy #0 lag: (min: 14.0, avg: 18.6, max: 46.0) +[2023-10-09 05:27:21,053][59242] Avg episode reward: [(0, '26.690'), (1, '24.760')] +[2023-10-09 05:27:22,190][60144] Updated weights for policy 1, policy_version 32132 (0.0007) +[2023-10-09 05:27:22,556][60144] Updated weights for policy 1, policy_version 32142 (0.0007) +[2023-10-09 05:27:22,932][60144] Updated weights for policy 1, policy_version 32152 (0.0008) +[2023-10-09 05:27:23,649][60143] Updated weights for policy 0, policy_version 31782 (0.0007) +[2023-10-09 05:27:24,024][60143] Updated weights for policy 0, policy_version 31792 (0.0007) +[2023-10-09 05:27:24,398][60143] Updated weights for policy 0, policy_version 31802 (0.0007) +[2023-10-09 05:27:26,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 65503232. Throughput: 0: 1689.8, 1: 1721.3. Samples: 16382756. 
Policy #0 lag: (min: 14.0, avg: 18.6, max: 46.0) +[2023-10-09 05:27:26,054][59242] Avg episode reward: [(0, '25.920'), (1, '25.540')] +[2023-10-09 05:27:26,857][60144] Updated weights for policy 1, policy_version 32162 (0.0009) +[2023-10-09 05:27:27,266][60144] Updated weights for policy 1, policy_version 32172 (0.0008) +[2023-10-09 05:27:27,628][60144] Updated weights for policy 1, policy_version 32182 (0.0010) +[2023-10-09 05:27:27,993][60144] Updated weights for policy 1, policy_version 32192 (0.0010) +[2023-10-09 05:27:28,239][60143] Updated weights for policy 0, policy_version 31812 (0.0008) +[2023-10-09 05:27:28,605][60143] Updated weights for policy 0, policy_version 31822 (0.0008) +[2023-10-09 05:27:28,975][60143] Updated weights for policy 0, policy_version 31832 (0.0007) +[2023-10-09 05:27:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 65568768. Throughput: 0: 1701.2, 1: 1750.5. Samples: 16403794. Policy #0 lag: (min: 14.0, avg: 18.6, max: 46.0) +[2023-10-09 05:27:31,053][59242] Avg episode reward: [(0, '25.720'), (1, '26.660')] +[2023-10-09 05:27:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000032192_32964608.pth... +[2023-10-09 05:27:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000031840_32604160.pth... 
+[2023-10-09 05:27:31,103][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000030240_30965760.pth +[2023-10-09 05:27:31,105][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000030592_31326208.pth +[2023-10-09 05:27:31,108][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000031840_32604160.pth +[2023-10-09 05:27:31,111][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000032192_32964608.pth +[2023-10-09 05:27:31,954][60144] Updated weights for policy 1, policy_version 32202 (0.0010) +[2023-10-09 05:27:32,322][60144] Updated weights for policy 1, policy_version 32212 (0.0011) +[2023-10-09 05:27:32,688][60144] Updated weights for policy 1, policy_version 32222 (0.0007) +[2023-10-09 05:27:33,054][60143] Updated weights for policy 0, policy_version 31842 (0.0008) +[2023-10-09 05:27:33,422][60143] Updated weights for policy 0, policy_version 31852 (0.0010) +[2023-10-09 05:27:33,790][60143] Updated weights for policy 0, policy_version 31862 (0.0008) +[2023-10-09 05:27:34,156][60143] Updated weights for policy 0, policy_version 31872 (0.0011) +[2023-10-09 05:27:36,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 65634304. Throughput: 0: 1705.9, 1: 1717.3. Samples: 16413752. 
Policy #0 lag: (min: 14.0, avg: 18.6, max: 46.0) +[2023-10-09 05:27:36,052][59242] Avg episode reward: [(0, '26.910'), (1, '26.520')] +[2023-10-09 05:27:36,640][60144] Updated weights for policy 1, policy_version 32232 (0.0010) +[2023-10-09 05:27:37,012][60144] Updated weights for policy 1, policy_version 32242 (0.0007) +[2023-10-09 05:27:37,373][60144] Updated weights for policy 1, policy_version 32252 (0.0009) +[2023-10-09 05:27:38,067][60143] Updated weights for policy 0, policy_version 31882 (0.0007) +[2023-10-09 05:27:38,440][60143] Updated weights for policy 0, policy_version 31892 (0.0009) +[2023-10-09 05:27:38,809][60143] Updated weights for policy 0, policy_version 31902 (0.0007) +[2023-10-09 05:27:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 65699840. Throughput: 0: 1694.7, 1: 1740.1. Samples: 16434464. Policy #0 lag: (min: 14.0, avg: 18.6, max: 46.0) +[2023-10-09 05:27:41,053][59242] Avg episode reward: [(0, '26.600'), (1, '25.460')] +[2023-10-09 05:27:41,359][60144] Updated weights for policy 1, policy_version 32262 (0.0008) +[2023-10-09 05:27:41,738][60144] Updated weights for policy 1, policy_version 32272 (0.0008) +[2023-10-09 05:27:42,104][60144] Updated weights for policy 1, policy_version 32282 (0.0008) +[2023-10-09 05:27:42,735][60143] Updated weights for policy 0, policy_version 31912 (0.0008) +[2023-10-09 05:27:43,103][60143] Updated weights for policy 0, policy_version 31922 (0.0007) +[2023-10-09 05:27:43,480][60143] Updated weights for policy 0, policy_version 31932 (0.0008) +[2023-10-09 05:27:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 65765376. Throughput: 0: 1722.6, 1: 1742.6. Samples: 16456042. 
Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) +[2023-10-09 05:27:46,053][59242] Avg episode reward: [(0, '26.230'), (1, '24.840')] +[2023-10-09 05:27:46,064][60144] Updated weights for policy 1, policy_version 32292 (0.0008) +[2023-10-09 05:27:46,427][60144] Updated weights for policy 1, policy_version 32302 (0.0008) +[2023-10-09 05:27:46,797][60144] Updated weights for policy 1, policy_version 32312 (0.0008) +[2023-10-09 05:27:47,366][60143] Updated weights for policy 0, policy_version 31942 (0.0011) +[2023-10-09 05:27:47,732][60143] Updated weights for policy 0, policy_version 31952 (0.0009) +[2023-10-09 05:27:48,104][60143] Updated weights for policy 0, policy_version 31962 (0.0010) +[2023-10-09 05:27:50,522][60144] Updated weights for policy 1, policy_version 32322 (0.0008) +[2023-10-09 05:27:50,888][60144] Updated weights for policy 1, policy_version 32332 (0.0008) +[2023-10-09 05:27:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 65830912. Throughput: 0: 1697.6, 1: 1727.3. Samples: 16465366. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) +[2023-10-09 05:27:51,053][59242] Avg episode reward: [(0, '26.530'), (1, '25.350')] +[2023-10-09 05:27:51,262][60144] Updated weights for policy 1, policy_version 32342 (0.0009) +[2023-10-09 05:27:51,628][60144] Updated weights for policy 1, policy_version 32352 (0.0008) +[2023-10-09 05:27:52,111][60143] Updated weights for policy 0, policy_version 31972 (0.0009) +[2023-10-09 05:27:52,490][60143] Updated weights for policy 0, policy_version 31982 (0.0007) +[2023-10-09 05:27:52,858][60143] Updated weights for policy 0, policy_version 31992 (0.0007) +[2023-10-09 05:27:55,570][60144] Updated weights for policy 1, policy_version 32362 (0.0008) +[2023-10-09 05:27:55,941][60144] Updated weights for policy 1, policy_version 32372 (0.0007) +[2023-10-09 05:27:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 65896448. 
Throughput: 0: 1706.3, 1: 1740.8. Samples: 16486450. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) +[2023-10-09 05:27:56,053][59242] Avg episode reward: [(0, '26.970'), (1, '24.240')] +[2023-10-09 05:27:56,319][60144] Updated weights for policy 1, policy_version 32382 (0.0007) +[2023-10-09 05:27:56,901][60143] Updated weights for policy 0, policy_version 32002 (0.0007) +[2023-10-09 05:27:57,268][60143] Updated weights for policy 0, policy_version 32012 (0.0011) +[2023-10-09 05:27:57,639][60143] Updated weights for policy 0, policy_version 32022 (0.0008) +[2023-10-09 05:27:58,003][60143] Updated weights for policy 0, policy_version 32032 (0.0008) +[2023-10-09 05:28:00,426][60144] Updated weights for policy 1, policy_version 32392 (0.0010) +[2023-10-09 05:28:00,797][60144] Updated weights for policy 1, policy_version 32402 (0.0009) +[2023-10-09 05:28:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 65961984. Throughput: 0: 1728.7, 1: 1724.9. Samples: 16507388. Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) +[2023-10-09 05:28:01,053][59242] Avg episode reward: [(0, '26.690'), (1, '23.970')] +[2023-10-09 05:28:01,172][60144] Updated weights for policy 1, policy_version 32412 (0.0010) +[2023-10-09 05:28:01,823][60143] Updated weights for policy 0, policy_version 32042 (0.0009) +[2023-10-09 05:28:02,188][60143] Updated weights for policy 0, policy_version 32052 (0.0008) +[2023-10-09 05:28:02,559][60143] Updated weights for policy 0, policy_version 32062 (0.0009) +[2023-10-09 05:28:05,046][60144] Updated weights for policy 1, policy_version 32422 (0.0008) +[2023-10-09 05:28:05,414][60144] Updated weights for policy 1, policy_version 32432 (0.0008) +[2023-10-09 05:28:05,789][60144] Updated weights for policy 1, policy_version 32442 (0.0009) +[2023-10-09 05:28:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 66060288. Throughput: 0: 1696.8, 1: 1739.6. Samples: 16517264. 
Policy #0 lag: (min: 31.0, avg: 40.1, max: 63.0) +[2023-10-09 05:28:06,052][59242] Avg episode reward: [(0, '27.690'), (1, '24.190')] +[2023-10-09 05:28:06,595][60143] Updated weights for policy 0, policy_version 32072 (0.0007) +[2023-10-09 05:28:06,961][60143] Updated weights for policy 0, policy_version 32082 (0.0008) +[2023-10-09 05:28:07,332][60143] Updated weights for policy 0, policy_version 32092 (0.0008) +[2023-10-09 05:28:09,770][60144] Updated weights for policy 1, policy_version 32452 (0.0010) +[2023-10-09 05:28:10,136][60144] Updated weights for policy 1, policy_version 32462 (0.0009) +[2023-10-09 05:28:10,498][60144] Updated weights for policy 1, policy_version 32472 (0.0008) +[2023-10-09 05:28:11,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 66125824. Throughput: 0: 1721.3, 1: 1734.3. Samples: 16538254. Policy #0 lag: (min: 9.0, avg: 18.6, max: 41.0) +[2023-10-09 05:28:11,053][59242] Avg episode reward: [(0, '27.600'), (1, '26.210')] +[2023-10-09 05:28:11,201][60143] Updated weights for policy 0, policy_version 32102 (0.0009) +[2023-10-09 05:28:11,586][60143] Updated weights for policy 0, policy_version 32112 (0.0009) +[2023-10-09 05:28:11,953][60143] Updated weights for policy 0, policy_version 32122 (0.0009) +[2023-10-09 05:28:14,333][60144] Updated weights for policy 1, policy_version 32482 (0.0010) +[2023-10-09 05:28:14,714][60144] Updated weights for policy 1, policy_version 32492 (0.0008) +[2023-10-09 05:28:15,077][60144] Updated weights for policy 1, policy_version 32502 (0.0007) +[2023-10-09 05:28:15,439][60144] Updated weights for policy 1, policy_version 32512 (0.0010) +[2023-10-09 05:28:15,899][60143] Updated weights for policy 0, policy_version 32132 (0.0008) +[2023-10-09 05:28:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 66191360. Throughput: 0: 1727.2, 1: 1709.1. Samples: 16558428. 
Policy #0 lag: (min: 9.0, avg: 18.6, max: 41.0) +[2023-10-09 05:28:16,052][59242] Avg episode reward: [(0, '27.240'), (1, '26.570')] +[2023-10-09 05:28:16,267][60143] Updated weights for policy 0, policy_version 32142 (0.0008) +[2023-10-09 05:28:16,638][60143] Updated weights for policy 0, policy_version 32152 (0.0008) +[2023-10-09 05:28:19,327][60144] Updated weights for policy 1, policy_version 32522 (0.0011) +[2023-10-09 05:28:19,694][60144] Updated weights for policy 1, policy_version 32532 (0.0007) +[2023-10-09 05:28:20,068][60144] Updated weights for policy 1, policy_version 32542 (0.0008) +[2023-10-09 05:28:20,565][60143] Updated weights for policy 0, policy_version 32162 (0.0009) +[2023-10-09 05:28:20,929][60143] Updated weights for policy 0, policy_version 32172 (0.0008) +[2023-10-09 05:28:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 66256896. Throughput: 0: 1708.1, 1: 1741.1. Samples: 16568966. Policy #0 lag: (min: 9.0, avg: 18.6, max: 41.0) +[2023-10-09 05:28:21,053][59242] Avg episode reward: [(0, '28.180'), (1, '26.170')] +[2023-10-09 05:28:21,294][60143] Updated weights for policy 0, policy_version 32182 (0.0008) +[2023-10-09 05:28:21,668][60143] Updated weights for policy 0, policy_version 32192 (0.0007) +[2023-10-09 05:28:23,970][60144] Updated weights for policy 1, policy_version 32552 (0.0008) +[2023-10-09 05:28:24,334][60144] Updated weights for policy 1, policy_version 32562 (0.0007) +[2023-10-09 05:28:24,702][60144] Updated weights for policy 1, policy_version 32572 (0.0009) +[2023-10-09 05:28:25,962][60143] Updated weights for policy 0, policy_version 32202 (0.0008) +[2023-10-09 05:28:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 66322432. Throughput: 0: 1722.7, 1: 1720.0. Samples: 16589386. 
Policy #0 lag: (min: 9.0, avg: 18.6, max: 41.0) +[2023-10-09 05:28:26,052][59242] Avg episode reward: [(0, '28.300'), (1, '27.520')] +[2023-10-09 05:28:26,334][60143] Updated weights for policy 0, policy_version 32212 (0.0007) +[2023-10-09 05:28:26,702][60143] Updated weights for policy 0, policy_version 32222 (0.0007) +[2023-10-09 05:28:28,635][60144] Updated weights for policy 1, policy_version 32582 (0.0008) +[2023-10-09 05:28:28,992][60144] Updated weights for policy 1, policy_version 32592 (0.0008) +[2023-10-09 05:28:29,358][60144] Updated weights for policy 1, policy_version 32602 (0.0007) +[2023-10-09 05:28:30,594][60143] Updated weights for policy 0, policy_version 32232 (0.0008) +[2023-10-09 05:28:30,963][60143] Updated weights for policy 0, policy_version 32242 (0.0007) +[2023-10-09 05:28:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 66387968. Throughput: 0: 1719.0, 1: 1708.4. Samples: 16610276. Policy #0 lag: (min: 9.0, avg: 18.6, max: 41.0) +[2023-10-09 05:28:31,053][59242] Avg episode reward: [(0, '27.910'), (1, '26.960')] +[2023-10-09 05:28:31,328][60143] Updated weights for policy 0, policy_version 32252 (0.0007) +[2023-10-09 05:28:33,428][60144] Updated weights for policy 1, policy_version 32612 (0.0009) +[2023-10-09 05:28:33,791][60144] Updated weights for policy 1, policy_version 32622 (0.0010) +[2023-10-09 05:28:34,166][60144] Updated weights for policy 1, policy_version 32632 (0.0011) +[2023-10-09 05:28:35,125][60143] Updated weights for policy 0, policy_version 32262 (0.0008) +[2023-10-09 05:28:35,499][60143] Updated weights for policy 0, policy_version 32272 (0.0010) +[2023-10-09 05:28:35,861][60143] Updated weights for policy 0, policy_version 32282 (0.0010) +[2023-10-09 05:28:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 66453504. Throughput: 0: 1724.3, 1: 1724.9. Samples: 16620580. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:28:36,053][59242] Avg episode reward: [(0, '26.420'), (1, '26.420')] +[2023-10-09 05:28:38,234][60144] Updated weights for policy 1, policy_version 32642 (0.0008) +[2023-10-09 05:28:38,598][60144] Updated weights for policy 1, policy_version 32652 (0.0009) +[2023-10-09 05:28:38,971][60144] Updated weights for policy 1, policy_version 32662 (0.0007) +[2023-10-09 05:28:39,333][60144] Updated weights for policy 1, policy_version 32672 (0.0007) +[2023-10-09 05:28:39,691][60143] Updated weights for policy 0, policy_version 32292 (0.0009) +[2023-10-09 05:28:40,067][60143] Updated weights for policy 0, policy_version 32302 (0.0008) +[2023-10-09 05:28:40,440][60143] Updated weights for policy 0, policy_version 32312 (0.0007) +[2023-10-09 05:28:41,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 66551808. Throughput: 0: 1731.2, 1: 1702.0. Samples: 16640946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:28:41,053][59242] Avg episode reward: [(0, '26.070'), (1, '25.980')] +[2023-10-09 05:28:43,319][60144] Updated weights for policy 1, policy_version 32682 (0.0008) +[2023-10-09 05:28:43,684][60144] Updated weights for policy 1, policy_version 32692 (0.0010) +[2023-10-09 05:28:44,040][60144] Updated weights for policy 1, policy_version 32702 (0.0008) +[2023-10-09 05:28:44,336][60143] Updated weights for policy 0, policy_version 32322 (0.0008) +[2023-10-09 05:28:44,700][60143] Updated weights for policy 0, policy_version 32332 (0.0009) +[2023-10-09 05:28:45,064][60143] Updated weights for policy 0, policy_version 32342 (0.0008) +[2023-10-09 05:28:45,430][60143] Updated weights for policy 0, policy_version 32352 (0.0009) +[2023-10-09 05:28:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 66617344. Throughput: 0: 1700.4, 1: 1715.6. Samples: 16661110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:28:46,053][59242] Avg episode reward: [(0, '26.820'), (1, '25.620')] +[2023-10-09 05:28:47,944][60144] Updated weights for policy 1, policy_version 32712 (0.0010) +[2023-10-09 05:28:48,310][60144] Updated weights for policy 1, policy_version 32722 (0.0009) +[2023-10-09 05:28:48,679][60144] Updated weights for policy 1, policy_version 32732 (0.0010) +[2023-10-09 05:28:49,420][60143] Updated weights for policy 0, policy_version 32362 (0.0009) +[2023-10-09 05:28:49,794][60143] Updated weights for policy 0, policy_version 32372 (0.0008) +[2023-10-09 05:28:50,169][60143] Updated weights for policy 0, policy_version 32382 (0.0008) +[2023-10-09 05:28:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 66682880. Throughput: 0: 1730.1, 1: 1709.3. Samples: 16672038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:28:51,053][59242] Avg episode reward: [(0, '26.120'), (1, '25.240')] +[2023-10-09 05:28:52,726][60144] Updated weights for policy 1, policy_version 32742 (0.0009) +[2023-10-09 05:28:53,094][60144] Updated weights for policy 1, policy_version 32752 (0.0010) +[2023-10-09 05:28:53,461][60144] Updated weights for policy 1, policy_version 32762 (0.0010) +[2023-10-09 05:28:54,266][60143] Updated weights for policy 0, policy_version 32392 (0.0010) +[2023-10-09 05:28:54,648][60143] Updated weights for policy 0, policy_version 32402 (0.0008) +[2023-10-09 05:28:55,021][60143] Updated weights for policy 0, policy_version 32412 (0.0007) +[2023-10-09 05:28:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 66748416. Throughput: 0: 1722.0, 1: 1701.4. Samples: 16692304. 
Policy #0 lag: (min: 29.0, avg: 34.3, max: 61.0) +[2023-10-09 05:28:56,053][59242] Avg episode reward: [(0, '27.530'), (1, '25.900')] +[2023-10-09 05:28:57,211][60144] Updated weights for policy 1, policy_version 32772 (0.0008) +[2023-10-09 05:28:57,580][60144] Updated weights for policy 1, policy_version 32782 (0.0008) +[2023-10-09 05:28:57,945][60144] Updated weights for policy 1, policy_version 32792 (0.0007) +[2023-10-09 05:28:59,199][60143] Updated weights for policy 0, policy_version 32422 (0.0007) +[2023-10-09 05:28:59,570][60143] Updated weights for policy 0, policy_version 32432 (0.0008) +[2023-10-09 05:28:59,934][60143] Updated weights for policy 0, policy_version 32442 (0.0007) +[2023-10-09 05:29:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 66813952. Throughput: 0: 1699.8, 1: 1730.9. Samples: 16712810. Policy #0 lag: (min: 29.0, avg: 34.3, max: 61.0) +[2023-10-09 05:29:01,053][59242] Avg episode reward: [(0, '26.300'), (1, '27.370')] +[2023-10-09 05:29:01,953][60144] Updated weights for policy 1, policy_version 32802 (0.0008) +[2023-10-09 05:29:02,360][60144] Updated weights for policy 1, policy_version 32812 (0.0009) +[2023-10-09 05:29:02,717][60144] Updated weights for policy 1, policy_version 32822 (0.0008) +[2023-10-09 05:29:03,090][60144] Updated weights for policy 1, policy_version 32832 (0.0007) +[2023-10-09 05:29:03,982][60143] Updated weights for policy 0, policy_version 32452 (0.0008) +[2023-10-09 05:29:04,350][60143] Updated weights for policy 0, policy_version 32462 (0.0010) +[2023-10-09 05:29:04,725][60143] Updated weights for policy 0, policy_version 32472 (0.0011) +[2023-10-09 05:29:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 66879488. Throughput: 0: 1729.7, 1: 1697.5. Samples: 16723192. 
Policy #0 lag: (min: 29.0, avg: 34.3, max: 61.0) +[2023-10-09 05:29:06,053][59242] Avg episode reward: [(0, '27.770'), (1, '26.330')] +[2023-10-09 05:29:07,124][60144] Updated weights for policy 1, policy_version 32842 (0.0008) +[2023-10-09 05:29:07,501][60144] Updated weights for policy 1, policy_version 32852 (0.0009) +[2023-10-09 05:29:07,867][60144] Updated weights for policy 1, policy_version 32862 (0.0009) +[2023-10-09 05:29:08,674][60143] Updated weights for policy 0, policy_version 32482 (0.0010) +[2023-10-09 05:29:09,050][60143] Updated weights for policy 0, policy_version 32492 (0.0008) +[2023-10-09 05:29:09,419][60143] Updated weights for policy 0, policy_version 32502 (0.0007) +[2023-10-09 05:29:09,794][60143] Updated weights for policy 0, policy_version 32512 (0.0007) +[2023-10-09 05:29:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 66945024. Throughput: 0: 1706.4, 1: 1716.2. Samples: 16743402. Policy #0 lag: (min: 29.0, avg: 34.3, max: 61.0) +[2023-10-09 05:29:11,053][59242] Avg episode reward: [(0, '27.890'), (1, '26.690')] +[2023-10-09 05:29:12,041][60144] Updated weights for policy 1, policy_version 32872 (0.0008) +[2023-10-09 05:29:12,413][60144] Updated weights for policy 1, policy_version 32882 (0.0007) +[2023-10-09 05:29:12,787][60144] Updated weights for policy 1, policy_version 32892 (0.0007) +[2023-10-09 05:29:13,781][60143] Updated weights for policy 0, policy_version 32522 (0.0009) +[2023-10-09 05:29:14,140][60143] Updated weights for policy 0, policy_version 32532 (0.0010) +[2023-10-09 05:29:14,514][60143] Updated weights for policy 0, policy_version 32542 (0.0008) +[2023-10-09 05:29:16,053][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13773.6). Total num frames: 67010560. Throughput: 0: 1696.5, 1: 1727.0. Samples: 16764332. 
Policy #0 lag: (min: 29.0, avg: 34.3, max: 61.0) +[2023-10-09 05:29:16,054][59242] Avg episode reward: [(0, '27.140'), (1, '26.030')] +[2023-10-09 05:29:16,659][60144] Updated weights for policy 1, policy_version 32902 (0.0008) +[2023-10-09 05:29:17,026][60144] Updated weights for policy 1, policy_version 32912 (0.0007) +[2023-10-09 05:29:17,392][60144] Updated weights for policy 1, policy_version 32922 (0.0007) +[2023-10-09 05:29:18,555][60143] Updated weights for policy 0, policy_version 32552 (0.0009) +[2023-10-09 05:29:18,931][60143] Updated weights for policy 0, policy_version 32562 (0.0008) +[2023-10-09 05:29:19,298][60143] Updated weights for policy 0, policy_version 32572 (0.0009) +[2023-10-09 05:29:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 67076096. Throughput: 0: 1715.0, 1: 1709.3. Samples: 16774676. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-09 05:29:21,053][59242] Avg episode reward: [(0, '27.900'), (1, '24.020')] +[2023-10-09 05:29:21,452][60144] Updated weights for policy 1, policy_version 32932 (0.0008) +[2023-10-09 05:29:21,817][60144] Updated weights for policy 1, policy_version 32942 (0.0010) +[2023-10-09 05:29:22,186][60144] Updated weights for policy 1, policy_version 32952 (0.0008) +[2023-10-09 05:29:23,152][60143] Updated weights for policy 0, policy_version 32582 (0.0009) +[2023-10-09 05:29:23,512][60143] Updated weights for policy 0, policy_version 32592 (0.0008) +[2023-10-09 05:29:23,884][60143] Updated weights for policy 0, policy_version 32602 (0.0009) +[2023-10-09 05:29:26,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 67141632. Throughput: 0: 1687.2, 1: 1734.0. Samples: 16794902. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-09 05:29:26,053][59242] Avg episode reward: [(0, '27.230'), (1, '25.160')] +[2023-10-09 05:29:26,085][60144] Updated weights for policy 1, policy_version 32962 (0.0009) +[2023-10-09 05:29:26,454][60144] Updated weights for policy 1, policy_version 32972 (0.0009) +[2023-10-09 05:29:26,818][60144] Updated weights for policy 1, policy_version 32982 (0.0009) +[2023-10-09 05:29:27,186][60144] Updated weights for policy 1, policy_version 32992 (0.0008) +[2023-10-09 05:29:27,788][60143] Updated weights for policy 0, policy_version 32612 (0.0011) +[2023-10-09 05:29:28,146][60143] Updated weights for policy 0, policy_version 32622 (0.0009) +[2023-10-09 05:29:28,514][60143] Updated weights for policy 0, policy_version 32632 (0.0008) +[2023-10-09 05:29:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 67207168. Throughput: 0: 1715.2, 1: 1732.3. Samples: 16816244. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-09 05:29:31,053][59242] Avg episode reward: [(0, '27.070'), (1, '25.290')] +[2023-10-09 05:29:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000032640_33423360.pth... +[2023-10-09 05:29:31,096][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000031040_31784960.pth +[2023-10-09 05:29:31,143][60144] Updated weights for policy 1, policy_version 33002 (0.0007) +[2023-10-09 05:29:31,513][60144] Updated weights for policy 1, policy_version 33012 (0.0007) +[2023-10-09 05:29:31,882][60144] Updated weights for policy 1, policy_version 33022 (0.0009) +[2023-10-09 05:29:31,955][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000033024_33816576.pth... 
+[2023-10-09 05:29:31,991][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000031392_32145408.pth +[2023-10-09 05:29:32,415][60143] Updated weights for policy 0, policy_version 32642 (0.0010) +[2023-10-09 05:29:32,778][60143] Updated weights for policy 0, policy_version 32652 (0.0007) +[2023-10-09 05:29:33,144][60143] Updated weights for policy 0, policy_version 32662 (0.0007) +[2023-10-09 05:29:33,516][60143] Updated weights for policy 0, policy_version 32672 (0.0008) +[2023-10-09 05:29:36,004][60144] Updated weights for policy 1, policy_version 33032 (0.0009) +[2023-10-09 05:29:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67272704. Throughput: 0: 1688.2, 1: 1723.5. Samples: 16825566. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-09 05:29:36,052][59242] Avg episode reward: [(0, '26.430'), (1, '24.430')] +[2023-10-09 05:29:36,368][60144] Updated weights for policy 1, policy_version 33042 (0.0007) +[2023-10-09 05:29:36,740][60144] Updated weights for policy 1, policy_version 33052 (0.0008) +[2023-10-09 05:29:37,631][60143] Updated weights for policy 0, policy_version 32682 (0.0007) +[2023-10-09 05:29:37,998][60143] Updated weights for policy 0, policy_version 32692 (0.0007) +[2023-10-09 05:29:38,360][60143] Updated weights for policy 0, policy_version 32702 (0.0009) +[2023-10-09 05:29:40,495][60144] Updated weights for policy 1, policy_version 33062 (0.0007) +[2023-10-09 05:29:40,860][60144] Updated weights for policy 1, policy_version 33072 (0.0007) +[2023-10-09 05:29:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 67338240. Throughput: 0: 1699.6, 1: 1732.7. Samples: 16846754. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-09 05:29:41,053][59242] Avg episode reward: [(0, '28.640'), (1, '24.410')] +[2023-10-09 05:29:41,226][60144] Updated weights for policy 1, policy_version 33082 (0.0007) +[2023-10-09 05:29:42,381][60143] Updated weights for policy 0, policy_version 32712 (0.0008) +[2023-10-09 05:29:42,764][60143] Updated weights for policy 0, policy_version 32722 (0.0009) +[2023-10-09 05:29:43,134][60143] Updated weights for policy 0, policy_version 32732 (0.0008) +[2023-10-09 05:29:45,211][60144] Updated weights for policy 1, policy_version 33092 (0.0008) +[2023-10-09 05:29:45,573][60144] Updated weights for policy 1, policy_version 33102 (0.0008) +[2023-10-09 05:29:45,936][60144] Updated weights for policy 1, policy_version 33112 (0.0010) +[2023-10-09 05:29:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 67403776. Throughput: 0: 1713.2, 1: 1717.9. Samples: 16867210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:29:46,053][59242] Avg episode reward: [(0, '28.470'), (1, '25.060')] +[2023-10-09 05:29:47,142][60143] Updated weights for policy 0, policy_version 32742 (0.0009) +[2023-10-09 05:29:47,517][60143] Updated weights for policy 0, policy_version 32752 (0.0009) +[2023-10-09 05:29:47,888][60143] Updated weights for policy 0, policy_version 32762 (0.0008) +[2023-10-09 05:29:49,867][60144] Updated weights for policy 1, policy_version 33122 (0.0008) +[2023-10-09 05:29:50,269][60144] Updated weights for policy 1, policy_version 33132 (0.0007) +[2023-10-09 05:29:50,635][60144] Updated weights for policy 1, policy_version 33142 (0.0009) +[2023-10-09 05:29:51,002][60144] Updated weights for policy 1, policy_version 33152 (0.0009) +[2023-10-09 05:29:51,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 67502080. Throughput: 0: 1685.5, 1: 1734.1. Samples: 16877074. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:29:51,052][59242] Avg episode reward: [(0, '28.460'), (1, '26.110')] +[2023-10-09 05:29:51,988][60143] Updated weights for policy 0, policy_version 32772 (0.0007) +[2023-10-09 05:29:52,351][60143] Updated weights for policy 0, policy_version 32782 (0.0009) +[2023-10-09 05:29:52,716][60143] Updated weights for policy 0, policy_version 32792 (0.0010) +[2023-10-09 05:29:54,814][60144] Updated weights for policy 1, policy_version 33162 (0.0011) +[2023-10-09 05:29:55,181][60144] Updated weights for policy 1, policy_version 33172 (0.0008) +[2023-10-09 05:29:55,545][60144] Updated weights for policy 1, policy_version 33182 (0.0009) +[2023-10-09 05:29:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 67567616. Throughput: 0: 1709.8, 1: 1734.3. Samples: 16898388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:29:56,053][59242] Avg episode reward: [(0, '29.050'), (1, '25.380')] +[2023-10-09 05:29:56,588][60143] Updated weights for policy 0, policy_version 32802 (0.0007) +[2023-10-09 05:29:56,957][60143] Updated weights for policy 0, policy_version 32812 (0.0010) +[2023-10-09 05:29:57,337][60143] Updated weights for policy 0, policy_version 32822 (0.0012) +[2023-10-09 05:29:57,694][60143] Updated weights for policy 0, policy_version 32832 (0.0011) +[2023-10-09 05:29:59,377][60144] Updated weights for policy 1, policy_version 33192 (0.0010) +[2023-10-09 05:29:59,747][60144] Updated weights for policy 1, policy_version 33202 (0.0008) +[2023-10-09 05:30:00,122][60144] Updated weights for policy 1, policy_version 33212 (0.0007) +[2023-10-09 05:30:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67633152. Throughput: 0: 1714.1, 1: 1702.7. Samples: 16918084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:01,053][59242] Avg episode reward: [(0, '28.730'), (1, '25.940')] +[2023-10-09 05:30:01,953][60143] Updated weights for policy 0, policy_version 32842 (0.0008) +[2023-10-09 05:30:02,333][60143] Updated weights for policy 0, policy_version 32852 (0.0008) +[2023-10-09 05:30:02,702][60143] Updated weights for policy 0, policy_version 32862 (0.0009) +[2023-10-09 05:30:04,285][60144] Updated weights for policy 1, policy_version 33222 (0.0009) +[2023-10-09 05:30:04,657][60144] Updated weights for policy 1, policy_version 33232 (0.0009) +[2023-10-09 05:30:05,017][60144] Updated weights for policy 1, policy_version 33242 (0.0008) +[2023-10-09 05:30:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 67698688. Throughput: 0: 1688.5, 1: 1730.0. Samples: 16928512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:06,053][59242] Avg episode reward: [(0, '29.120'), (1, '26.440')] +[2023-10-09 05:30:06,573][60143] Updated weights for policy 0, policy_version 32872 (0.0010) +[2023-10-09 05:30:06,940][60143] Updated weights for policy 0, policy_version 32882 (0.0009) +[2023-10-09 05:30:07,311][60143] Updated weights for policy 0, policy_version 32892 (0.0008) +[2023-10-09 05:30:08,923][60144] Updated weights for policy 1, policy_version 33252 (0.0010) +[2023-10-09 05:30:09,290][60144] Updated weights for policy 1, policy_version 33262 (0.0009) +[2023-10-09 05:30:09,656][60144] Updated weights for policy 1, policy_version 33272 (0.0007) +[2023-10-09 05:30:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67764224. Throughput: 0: 1710.8, 1: 1710.6. Samples: 16948868. 
Policy #0 lag: (min: 17.0, avg: 28.5, max: 49.0) +[2023-10-09 05:30:11,053][59242] Avg episode reward: [(0, '28.850'), (1, '26.970')] +[2023-10-09 05:30:11,317][60143] Updated weights for policy 0, policy_version 32902 (0.0008) +[2023-10-09 05:30:11,688][60143] Updated weights for policy 0, policy_version 32912 (0.0009) +[2023-10-09 05:30:12,054][60143] Updated weights for policy 0, policy_version 32922 (0.0009) +[2023-10-09 05:30:13,581][60144] Updated weights for policy 1, policy_version 33282 (0.0009) +[2023-10-09 05:30:13,946][60144] Updated weights for policy 1, policy_version 33292 (0.0010) +[2023-10-09 05:30:14,318][60144] Updated weights for policy 1, policy_version 33302 (0.0007) +[2023-10-09 05:30:14,684][60144] Updated weights for policy 1, policy_version 33312 (0.0008) +[2023-10-09 05:30:15,983][60143] Updated weights for policy 0, policy_version 32932 (0.0010) +[2023-10-09 05:30:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67829760. Throughput: 0: 1708.0, 1: 1702.6. Samples: 16969718. Policy #0 lag: (min: 17.0, avg: 28.5, max: 49.0) +[2023-10-09 05:30:16,053][59242] Avg episode reward: [(0, '29.370'), (1, '25.610')] +[2023-10-09 05:30:16,360][60143] Updated weights for policy 0, policy_version 32942 (0.0009) +[2023-10-09 05:30:16,734][60143] Updated weights for policy 0, policy_version 32952 (0.0008) +[2023-10-09 05:30:18,664][60144] Updated weights for policy 1, policy_version 33322 (0.0007) +[2023-10-09 05:30:19,017][60144] Updated weights for policy 1, policy_version 33332 (0.0008) +[2023-10-09 05:30:19,381][60144] Updated weights for policy 1, policy_version 33342 (0.0009) +[2023-10-09 05:30:20,805][60143] Updated weights for policy 0, policy_version 32962 (0.0010) +[2023-10-09 05:30:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67895296. Throughput: 0: 1704.0, 1: 1730.1. Samples: 16980102. 
Policy #0 lag: (min: 17.0, avg: 28.5, max: 49.0) +[2023-10-09 05:30:21,052][59242] Avg episode reward: [(0, '30.330'), (1, '26.120')] +[2023-10-09 05:30:21,185][60143] Updated weights for policy 0, policy_version 32972 (0.0008) +[2023-10-09 05:30:21,549][60143] Updated weights for policy 0, policy_version 32982 (0.0009) +[2023-10-09 05:30:21,905][59934] Saving new best policy, reward=30.330! +[2023-10-09 05:30:21,906][60143] Updated weights for policy 0, policy_version 32992 (0.0011) +[2023-10-09 05:30:23,293][60144] Updated weights for policy 1, policy_version 33352 (0.0007) +[2023-10-09 05:30:23,663][60144] Updated weights for policy 1, policy_version 33362 (0.0007) +[2023-10-09 05:30:24,029][60144] Updated weights for policy 1, policy_version 33372 (0.0007) +[2023-10-09 05:30:25,922][60143] Updated weights for policy 0, policy_version 33002 (0.0009) +[2023-10-09 05:30:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 67960832. Throughput: 0: 1711.8, 1: 1710.6. Samples: 17000760. Policy #0 lag: (min: 17.0, avg: 28.5, max: 49.0) +[2023-10-09 05:30:26,052][59242] Avg episode reward: [(0, '30.690'), (1, '26.820')] +[2023-10-09 05:30:26,291][60143] Updated weights for policy 0, policy_version 33012 (0.0009) +[2023-10-09 05:30:26,656][60143] Updated weights for policy 0, policy_version 33022 (0.0010) +[2023-10-09 05:30:26,730][59934] Saving new best policy, reward=30.690! +[2023-10-09 05:30:28,075][60144] Updated weights for policy 1, policy_version 33382 (0.0007) +[2023-10-09 05:30:28,438][60144] Updated weights for policy 1, policy_version 33392 (0.0007) +[2023-10-09 05:30:28,816][60144] Updated weights for policy 1, policy_version 33402 (0.0008) +[2023-10-09 05:30:30,698][60143] Updated weights for policy 0, policy_version 33032 (0.0007) +[2023-10-09 05:30:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 68026368. Throughput: 0: 1716.0, 1: 1720.9. Samples: 17021874. 
Policy #0 lag: (min: 17.0, avg: 28.5, max: 49.0) +[2023-10-09 05:30:31,053][59242] Avg episode reward: [(0, '30.130'), (1, '27.790')] +[2023-10-09 05:30:31,074][60143] Updated weights for policy 0, policy_version 33042 (0.0007) +[2023-10-09 05:30:31,448][60143] Updated weights for policy 0, policy_version 33052 (0.0007) +[2023-10-09 05:30:32,669][60144] Updated weights for policy 1, policy_version 33412 (0.0009) +[2023-10-09 05:30:33,043][60144] Updated weights for policy 1, policy_version 33422 (0.0010) +[2023-10-09 05:30:33,410][60144] Updated weights for policy 1, policy_version 33432 (0.0008) +[2023-10-09 05:30:35,305][60143] Updated weights for policy 0, policy_version 33062 (0.0010) +[2023-10-09 05:30:35,671][60143] Updated weights for policy 0, policy_version 33072 (0.0008) +[2023-10-09 05:30:36,038][60143] Updated weights for policy 0, policy_version 33082 (0.0009) +[2023-10-09 05:30:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 68091904. Throughput: 0: 1721.4, 1: 1713.3. Samples: 17031636. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:36,052][59242] Avg episode reward: [(0, '32.660'), (1, '28.100')] +[2023-10-09 05:30:36,250][59934] Saving new best policy, reward=32.660! 
+[2023-10-09 05:30:37,256][60144] Updated weights for policy 1, policy_version 33442 (0.0007) +[2023-10-09 05:30:37,630][60144] Updated weights for policy 1, policy_version 33452 (0.0008) +[2023-10-09 05:30:38,011][60144] Updated weights for policy 1, policy_version 33462 (0.0008) +[2023-10-09 05:30:38,369][60144] Updated weights for policy 1, policy_version 33472 (0.0008) +[2023-10-09 05:30:39,874][60143] Updated weights for policy 0, policy_version 33092 (0.0008) +[2023-10-09 05:30:40,239][60143] Updated weights for policy 0, policy_version 33102 (0.0008) +[2023-10-09 05:30:40,609][60143] Updated weights for policy 0, policy_version 33112 (0.0010) +[2023-10-09 05:30:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 68190208. Throughput: 0: 1723.8, 1: 1712.5. Samples: 17053022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:41,052][59242] Avg episode reward: [(0, '31.160'), (1, '28.470')] +[2023-10-09 05:30:42,354][60144] Updated weights for policy 1, policy_version 33482 (0.0007) +[2023-10-09 05:30:42,731][60144] Updated weights for policy 1, policy_version 33492 (0.0007) +[2023-10-09 05:30:43,108][60144] Updated weights for policy 1, policy_version 33502 (0.0008) +[2023-10-09 05:30:44,559][60143] Updated weights for policy 0, policy_version 33122 (0.0008) +[2023-10-09 05:30:44,940][60143] Updated weights for policy 0, policy_version 33132 (0.0010) +[2023-10-09 05:30:45,315][60143] Updated weights for policy 0, policy_version 33142 (0.0008) +[2023-10-09 05:30:45,678][60143] Updated weights for policy 0, policy_version 33152 (0.0009) +[2023-10-09 05:30:46,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 68255744. Throughput: 0: 1706.0, 1: 1738.8. Samples: 17073102. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:46,053][59242] Avg episode reward: [(0, '30.620'), (1, '27.960')] +[2023-10-09 05:30:47,020][60144] Updated weights for policy 1, policy_version 33512 (0.0008) +[2023-10-09 05:30:47,386][60144] Updated weights for policy 1, policy_version 33522 (0.0010) +[2023-10-09 05:30:47,743][60144] Updated weights for policy 1, policy_version 33532 (0.0007) +[2023-10-09 05:30:49,673][60143] Updated weights for policy 0, policy_version 33162 (0.0007) +[2023-10-09 05:30:50,037][60143] Updated weights for policy 0, policy_version 33172 (0.0010) +[2023-10-09 05:30:50,409][60143] Updated weights for policy 0, policy_version 33182 (0.0010) +[2023-10-09 05:30:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 68321280. Throughput: 0: 1732.2, 1: 1709.5. Samples: 17083388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:30:51,053][59242] Avg episode reward: [(0, '29.670'), (1, '28.290')] +[2023-10-09 05:30:51,660][60144] Updated weights for policy 1, policy_version 33542 (0.0009) +[2023-10-09 05:30:52,030][60144] Updated weights for policy 1, policy_version 33552 (0.0010) +[2023-10-09 05:30:52,403][60144] Updated weights for policy 1, policy_version 33562 (0.0009) +[2023-10-09 05:30:54,144][60143] Updated weights for policy 0, policy_version 33192 (0.0009) +[2023-10-09 05:30:54,512][60143] Updated weights for policy 0, policy_version 33202 (0.0009) +[2023-10-09 05:30:54,883][60143] Updated weights for policy 0, policy_version 33212 (0.0010) +[2023-10-09 05:30:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 68386816. Throughput: 0: 1722.5, 1: 1728.0. Samples: 17104140. 
Policy #0 lag: (min: 31.0, avg: 45.2, max: 63.0) +[2023-10-09 05:30:56,053][59242] Avg episode reward: [(0, '31.080'), (1, '27.960')] +[2023-10-09 05:30:56,303][60144] Updated weights for policy 1, policy_version 33572 (0.0009) +[2023-10-09 05:30:56,660][60144] Updated weights for policy 1, policy_version 33582 (0.0009) +[2023-10-09 05:30:57,029][60144] Updated weights for policy 1, policy_version 33592 (0.0007) +[2023-10-09 05:30:59,014][60143] Updated weights for policy 0, policy_version 33222 (0.0011) +[2023-10-09 05:30:59,378][60143] Updated weights for policy 0, policy_version 33232 (0.0011) +[2023-10-09 05:30:59,755][60143] Updated weights for policy 0, policy_version 33242 (0.0011) +[2023-10-09 05:31:00,968][60144] Updated weights for policy 1, policy_version 33602 (0.0009) +[2023-10-09 05:31:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 68452352. Throughput: 0: 1703.7, 1: 1743.2. Samples: 17124830. Policy #0 lag: (min: 31.0, avg: 45.2, max: 63.0) +[2023-10-09 05:31:01,053][59242] Avg episode reward: [(0, '32.070'), (1, '28.910')] +[2023-10-09 05:31:01,342][60144] Updated weights for policy 1, policy_version 33612 (0.0008) +[2023-10-09 05:31:01,708][60144] Updated weights for policy 1, policy_version 33622 (0.0007) +[2023-10-09 05:31:02,076][60144] Updated weights for policy 1, policy_version 33632 (0.0008) +[2023-10-09 05:31:03,869][60143] Updated weights for policy 0, policy_version 33252 (0.0009) +[2023-10-09 05:31:04,255][60143] Updated weights for policy 0, policy_version 33262 (0.0010) +[2023-10-09 05:31:04,618][60143] Updated weights for policy 0, policy_version 33272 (0.0011) +[2023-10-09 05:31:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 68517888. Throughput: 0: 1733.2, 1: 1719.5. Samples: 17135470. 
Policy #0 lag: (min: 31.0, avg: 45.2, max: 63.0) +[2023-10-09 05:31:06,053][59242] Avg episode reward: [(0, '29.650'), (1, '27.520')] +[2023-10-09 05:31:06,205][60144] Updated weights for policy 1, policy_version 33642 (0.0009) +[2023-10-09 05:31:06,576][60144] Updated weights for policy 1, policy_version 33652 (0.0010) +[2023-10-09 05:31:06,935][60144] Updated weights for policy 1, policy_version 33662 (0.0008) +[2023-10-09 05:31:08,525][60143] Updated weights for policy 0, policy_version 33282 (0.0008) +[2023-10-09 05:31:08,893][60143] Updated weights for policy 0, policy_version 33292 (0.0008) +[2023-10-09 05:31:09,267][60143] Updated weights for policy 0, policy_version 33302 (0.0008) +[2023-10-09 05:31:09,630][60143] Updated weights for policy 0, policy_version 33312 (0.0007) +[2023-10-09 05:31:10,600][60144] Updated weights for policy 1, policy_version 33672 (0.0010) +[2023-10-09 05:31:10,977][60144] Updated weights for policy 1, policy_version 33682 (0.0010) +[2023-10-09 05:31:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 68583424. Throughput: 0: 1699.4, 1: 1742.4. Samples: 17155640. 
Policy #0 lag: (min: 31.0, avg: 45.2, max: 63.0) +[2023-10-09 05:31:11,053][59242] Avg episode reward: [(0, '28.910'), (1, '27.590')] +[2023-10-09 05:31:11,347][60144] Updated weights for policy 1, policy_version 33692 (0.0008) +[2023-10-09 05:31:13,594][60143] Updated weights for policy 0, policy_version 33322 (0.0009) +[2023-10-09 05:31:13,965][60143] Updated weights for policy 0, policy_version 33332 (0.0009) +[2023-10-09 05:31:14,334][60143] Updated weights for policy 0, policy_version 33342 (0.0008) +[2023-10-09 05:31:15,175][60144] Updated weights for policy 1, policy_version 33702 (0.0008) +[2023-10-09 05:31:15,532][60144] Updated weights for policy 1, policy_version 33712 (0.0010) +[2023-10-09 05:31:15,903][60144] Updated weights for policy 1, policy_version 33722 (0.0007) +[2023-10-09 05:31:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 68648960. Throughput: 0: 1695.7, 1: 1728.8. Samples: 17175976. Policy #0 lag: (min: 31.0, avg: 45.2, max: 63.0) +[2023-10-09 05:31:16,053][59242] Avg episode reward: [(0, '28.900'), (1, '26.790')] +[2023-10-09 05:31:18,401][60143] Updated weights for policy 0, policy_version 33352 (0.0009) +[2023-10-09 05:31:18,774][60143] Updated weights for policy 0, policy_version 33362 (0.0008) +[2023-10-09 05:31:19,150][60143] Updated weights for policy 0, policy_version 33372 (0.0009) +[2023-10-09 05:31:19,905][60144] Updated weights for policy 1, policy_version 33732 (0.0008) +[2023-10-09 05:31:20,276][60144] Updated weights for policy 1, policy_version 33742 (0.0007) +[2023-10-09 05:31:20,639][60144] Updated weights for policy 1, policy_version 33752 (0.0007) +[2023-10-09 05:31:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 68747264. Throughput: 0: 1707.9, 1: 1740.7. Samples: 17186824. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:31:21,053][59242] Avg episode reward: [(0, '29.470'), (1, '26.670')] +[2023-10-09 05:31:23,303][60143] Updated weights for policy 0, policy_version 33382 (0.0008) +[2023-10-09 05:31:23,684][60143] Updated weights for policy 0, policy_version 33392 (0.0007) +[2023-10-09 05:31:24,057][60143] Updated weights for policy 0, policy_version 33402 (0.0009) +[2023-10-09 05:31:24,635][60144] Updated weights for policy 1, policy_version 33762 (0.0008) +[2023-10-09 05:31:25,009][60144] Updated weights for policy 1, policy_version 33772 (0.0007) +[2023-10-09 05:31:25,366][60144] Updated weights for policy 1, policy_version 33782 (0.0009) +[2023-10-09 05:31:25,731][60144] Updated weights for policy 1, policy_version 33792 (0.0007) +[2023-10-09 05:31:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 68812800. Throughput: 0: 1678.4, 1: 1741.4. Samples: 17206912. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:31:26,053][59242] Avg episode reward: [(0, '29.700'), (1, '25.890')] +[2023-10-09 05:31:28,031][60143] Updated weights for policy 0, policy_version 33412 (0.0010) +[2023-10-09 05:31:28,410][60143] Updated weights for policy 0, policy_version 33422 (0.0008) +[2023-10-09 05:31:28,785][60143] Updated weights for policy 0, policy_version 33432 (0.0009) +[2023-10-09 05:31:29,535][60144] Updated weights for policy 1, policy_version 33802 (0.0007) +[2023-10-09 05:31:29,911][60144] Updated weights for policy 1, policy_version 33812 (0.0011) +[2023-10-09 05:31:30,273][60144] Updated weights for policy 1, policy_version 33822 (0.0009) +[2023-10-09 05:31:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 68878336. Throughput: 0: 1702.1, 1: 1714.6. Samples: 17226854. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:31:31,053][59242] Avg episode reward: [(0, '28.610'), (1, '27.260')] +[2023-10-09 05:31:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000033824_34635776.pth... +[2023-10-09 05:31:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000033440_34242560.pth... +[2023-10-09 05:31:31,101][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000031840_32604160.pth +[2023-10-09 05:31:31,103][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000032192_32964608.pth +[2023-10-09 05:31:32,650][60143] Updated weights for policy 0, policy_version 33442 (0.0008) +[2023-10-09 05:31:33,028][60143] Updated weights for policy 0, policy_version 33452 (0.0008) +[2023-10-09 05:31:33,395][60143] Updated weights for policy 0, policy_version 33462 (0.0008) +[2023-10-09 05:31:33,757][60143] Updated weights for policy 0, policy_version 33472 (0.0008) +[2023-10-09 05:31:34,267][60144] Updated weights for policy 1, policy_version 33832 (0.0010) +[2023-10-09 05:31:34,632][60144] Updated weights for policy 1, policy_version 33842 (0.0007) +[2023-10-09 05:31:35,008][60144] Updated weights for policy 1, policy_version 33852 (0.0009) +[2023-10-09 05:31:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 68943872. Throughput: 0: 1684.8, 1: 1745.4. Samples: 17237748. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:31:36,052][59242] Avg episode reward: [(0, '27.080'), (1, '27.400')] +[2023-10-09 05:31:37,762][60143] Updated weights for policy 0, policy_version 33482 (0.0010) +[2023-10-09 05:31:38,128][60143] Updated weights for policy 0, policy_version 33492 (0.0008) +[2023-10-09 05:31:38,502][60143] Updated weights for policy 0, policy_version 33502 (0.0008) +[2023-10-09 05:31:38,953][60144] Updated weights for policy 1, policy_version 33862 (0.0009) +[2023-10-09 05:31:39,326][60144] Updated weights for policy 1, policy_version 33872 (0.0007) +[2023-10-09 05:31:39,699][60144] Updated weights for policy 1, policy_version 33882 (0.0008) +[2023-10-09 05:31:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69009408. Throughput: 0: 1685.2, 1: 1724.1. Samples: 17257556. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-09 05:31:41,053][59242] Avg episode reward: [(0, '26.690'), (1, '26.940')] +[2023-10-09 05:31:42,635][60143] Updated weights for policy 0, policy_version 33512 (0.0008) +[2023-10-09 05:31:43,014][60143] Updated weights for policy 0, policy_version 33522 (0.0009) +[2023-10-09 05:31:43,388][60143] Updated weights for policy 0, policy_version 33532 (0.0008) +[2023-10-09 05:31:43,596][60144] Updated weights for policy 1, policy_version 33892 (0.0008) +[2023-10-09 05:31:43,958][60144] Updated weights for policy 1, policy_version 33902 (0.0007) +[2023-10-09 05:31:44,336][60144] Updated weights for policy 1, policy_version 33912 (0.0008) +[2023-10-09 05:31:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69074944. Throughput: 0: 1701.5, 1: 1710.3. Samples: 17278362. 
Policy #0 lag: (min: 1.0, avg: 15.1, max: 33.0) +[2023-10-09 05:31:46,053][59242] Avg episode reward: [(0, '24.730'), (1, '26.500')] +[2023-10-09 05:31:47,447][60143] Updated weights for policy 0, policy_version 33542 (0.0007) +[2023-10-09 05:31:47,823][60143] Updated weights for policy 0, policy_version 33552 (0.0008) +[2023-10-09 05:31:48,187][60143] Updated weights for policy 0, policy_version 33562 (0.0007) +[2023-10-09 05:31:48,225][60144] Updated weights for policy 1, policy_version 33922 (0.0009) +[2023-10-09 05:31:48,583][60144] Updated weights for policy 1, policy_version 33932 (0.0009) +[2023-10-09 05:31:48,955][60144] Updated weights for policy 1, policy_version 33942 (0.0010) +[2023-10-09 05:31:49,323][60144] Updated weights for policy 1, policy_version 33952 (0.0008) +[2023-10-09 05:31:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69140480. Throughput: 0: 1669.5, 1: 1731.3. Samples: 17288504. Policy #0 lag: (min: 1.0, avg: 15.1, max: 33.0) +[2023-10-09 05:31:51,053][59242] Avg episode reward: [(0, '25.020'), (1, '25.470')] +[2023-10-09 05:31:52,194][60143] Updated weights for policy 0, policy_version 33572 (0.0008) +[2023-10-09 05:31:52,559][60143] Updated weights for policy 0, policy_version 33582 (0.0007) +[2023-10-09 05:31:52,935][60143] Updated weights for policy 0, policy_version 33592 (0.0008) +[2023-10-09 05:31:53,482][60144] Updated weights for policy 1, policy_version 33962 (0.0009) +[2023-10-09 05:31:53,855][60144] Updated weights for policy 1, policy_version 33972 (0.0009) +[2023-10-09 05:31:54,220][60144] Updated weights for policy 1, policy_version 33982 (0.0008) +[2023-10-09 05:31:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 69206016. Throughput: 0: 1696.3, 1: 1702.6. Samples: 17308590. 
Policy #0 lag: (min: 1.0, avg: 15.1, max: 33.0) +[2023-10-09 05:31:56,052][59242] Avg episode reward: [(0, '25.090'), (1, '25.380')] +[2023-10-09 05:31:56,796][60143] Updated weights for policy 0, policy_version 33602 (0.0009) +[2023-10-09 05:31:57,161][60143] Updated weights for policy 0, policy_version 33612 (0.0007) +[2023-10-09 05:31:57,522][60143] Updated weights for policy 0, policy_version 33622 (0.0007) +[2023-10-09 05:31:57,891][60143] Updated weights for policy 0, policy_version 33632 (0.0007) +[2023-10-09 05:31:58,314][60144] Updated weights for policy 1, policy_version 33992 (0.0007) +[2023-10-09 05:31:58,676][60144] Updated weights for policy 1, policy_version 34002 (0.0008) +[2023-10-09 05:31:59,046][60144] Updated weights for policy 1, policy_version 34012 (0.0008) +[2023-10-09 05:32:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69271552. Throughput: 0: 1705.1, 1: 1716.5. Samples: 17329950. Policy #0 lag: (min: 1.0, avg: 15.1, max: 33.0) +[2023-10-09 05:32:01,053][59242] Avg episode reward: [(0, '25.880'), (1, '25.240')] +[2023-10-09 05:32:01,905][60143] Updated weights for policy 0, policy_version 33642 (0.0010) +[2023-10-09 05:32:02,275][60143] Updated weights for policy 0, policy_version 33652 (0.0009) +[2023-10-09 05:32:02,635][60143] Updated weights for policy 0, policy_version 33662 (0.0012) +[2023-10-09 05:32:03,115][60144] Updated weights for policy 1, policy_version 34022 (0.0007) +[2023-10-09 05:32:03,483][60144] Updated weights for policy 1, policy_version 34032 (0.0008) +[2023-10-09 05:32:03,850][60144] Updated weights for policy 1, policy_version 34042 (0.0010) +[2023-10-09 05:32:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69337088. Throughput: 0: 1688.1, 1: 1714.3. Samples: 17339928. 
Policy #0 lag: (min: 1.0, avg: 15.1, max: 33.0) +[2023-10-09 05:32:06,052][59242] Avg episode reward: [(0, '26.940'), (1, '25.440')] +[2023-10-09 05:32:06,753][60143] Updated weights for policy 0, policy_version 33672 (0.0009) +[2023-10-09 05:32:07,126][60143] Updated weights for policy 0, policy_version 33682 (0.0009) +[2023-10-09 05:32:07,494][60143] Updated weights for policy 0, policy_version 33692 (0.0008) +[2023-10-09 05:32:07,770][60144] Updated weights for policy 1, policy_version 34052 (0.0010) +[2023-10-09 05:32:08,129][60144] Updated weights for policy 1, policy_version 34062 (0.0010) +[2023-10-09 05:32:08,496][60144] Updated weights for policy 1, policy_version 34072 (0.0010) +[2023-10-09 05:32:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69402624. Throughput: 0: 1712.1, 1: 1697.2. Samples: 17360330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:11,053][59242] Avg episode reward: [(0, '25.330'), (1, '25.650')] +[2023-10-09 05:32:11,488][60143] Updated weights for policy 0, policy_version 33702 (0.0007) +[2023-10-09 05:32:11,863][60143] Updated weights for policy 0, policy_version 33712 (0.0008) +[2023-10-09 05:32:12,231][60143] Updated weights for policy 0, policy_version 33722 (0.0009) +[2023-10-09 05:32:12,420][60144] Updated weights for policy 1, policy_version 34082 (0.0010) +[2023-10-09 05:32:12,789][60144] Updated weights for policy 1, policy_version 34092 (0.0007) +[2023-10-09 05:32:13,151][60144] Updated weights for policy 1, policy_version 34102 (0.0007) +[2023-10-09 05:32:13,517][60144] Updated weights for policy 1, policy_version 34112 (0.0007) +[2023-10-09 05:32:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 69468160. Throughput: 0: 1713.2, 1: 1731.2. Samples: 17381850. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:16,052][59242] Avg episode reward: [(0, '26.710'), (1, '25.170')] +[2023-10-09 05:32:16,241][60143] Updated weights for policy 0, policy_version 33732 (0.0009) +[2023-10-09 05:32:16,621][60143] Updated weights for policy 0, policy_version 33742 (0.0010) +[2023-10-09 05:32:16,990][60143] Updated weights for policy 0, policy_version 33752 (0.0010) +[2023-10-09 05:32:17,337][60144] Updated weights for policy 1, policy_version 34122 (0.0009) +[2023-10-09 05:32:17,710][60144] Updated weights for policy 1, policy_version 34132 (0.0009) +[2023-10-09 05:32:18,081][60144] Updated weights for policy 1, policy_version 34142 (0.0008) +[2023-10-09 05:32:20,883][60143] Updated weights for policy 0, policy_version 33762 (0.0009) +[2023-10-09 05:32:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 69533696. Throughput: 0: 1708.3, 1: 1700.6. Samples: 17391146. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:21,053][59242] Avg episode reward: [(0, '26.550'), (1, '25.900')] +[2023-10-09 05:32:21,251][60143] Updated weights for policy 0, policy_version 33772 (0.0010) +[2023-10-09 05:32:21,627][60143] Updated weights for policy 0, policy_version 33782 (0.0007) +[2023-10-09 05:32:21,997][60143] Updated weights for policy 0, policy_version 33792 (0.0007) +[2023-10-09 05:32:22,053][60144] Updated weights for policy 1, policy_version 34152 (0.0008) +[2023-10-09 05:32:22,421][60144] Updated weights for policy 1, policy_version 34162 (0.0008) +[2023-10-09 05:32:22,784][60144] Updated weights for policy 1, policy_version 34172 (0.0007) +[2023-10-09 05:32:25,723][60143] Updated weights for policy 0, policy_version 33802 (0.0009) +[2023-10-09 05:32:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 69599232. Throughput: 0: 1722.5, 1: 1722.1. Samples: 17412562. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:26,052][59242] Avg episode reward: [(0, '26.580'), (1, '25.930')] +[2023-10-09 05:32:26,098][60143] Updated weights for policy 0, policy_version 33812 (0.0009) +[2023-10-09 05:32:26,459][60143] Updated weights for policy 0, policy_version 33822 (0.0009) +[2023-10-09 05:32:26,773][60144] Updated weights for policy 1, policy_version 34182 (0.0008) +[2023-10-09 05:32:27,138][60144] Updated weights for policy 1, policy_version 34192 (0.0007) +[2023-10-09 05:32:27,519][60144] Updated weights for policy 1, policy_version 34202 (0.0010) +[2023-10-09 05:32:30,417][60143] Updated weights for policy 0, policy_version 33832 (0.0010) +[2023-10-09 05:32:30,778][60143] Updated weights for policy 0, policy_version 33842 (0.0009) +[2023-10-09 05:32:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 69664768. Throughput: 0: 1718.9, 1: 1728.9. Samples: 17433514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:31,053][59242] Avg episode reward: [(0, '27.780'), (1, '26.740')] +[2023-10-09 05:32:31,148][60143] Updated weights for policy 0, policy_version 33852 (0.0008) +[2023-10-09 05:32:31,509][60144] Updated weights for policy 1, policy_version 34212 (0.0009) +[2023-10-09 05:32:31,874][60144] Updated weights for policy 1, policy_version 34222 (0.0009) +[2023-10-09 05:32:32,241][60144] Updated weights for policy 1, policy_version 34232 (0.0007) +[2023-10-09 05:32:35,104][60143] Updated weights for policy 0, policy_version 33862 (0.0009) +[2023-10-09 05:32:35,473][60143] Updated weights for policy 0, policy_version 33872 (0.0007) +[2023-10-09 05:32:35,852][60143] Updated weights for policy 0, policy_version 33882 (0.0008) +[2023-10-09 05:32:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 69730304. Throughput: 0: 1732.0, 1: 1707.8. Samples: 17443294. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:32:36,053][59242] Avg episode reward: [(0, '28.650'), (1, '27.650')] +[2023-10-09 05:32:36,274][60144] Updated weights for policy 1, policy_version 34242 (0.0009) +[2023-10-09 05:32:36,633][60144] Updated weights for policy 1, policy_version 34252 (0.0008) +[2023-10-09 05:32:37,005][60144] Updated weights for policy 1, policy_version 34262 (0.0010) +[2023-10-09 05:32:37,377][60144] Updated weights for policy 1, policy_version 34272 (0.0008) +[2023-10-09 05:32:39,737][60143] Updated weights for policy 0, policy_version 33892 (0.0007) +[2023-10-09 05:32:40,096][60143] Updated weights for policy 0, policy_version 33902 (0.0007) +[2023-10-09 05:32:40,472][60143] Updated weights for policy 0, policy_version 33912 (0.0011) +[2023-10-09 05:32:41,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 69828608. Throughput: 0: 1733.4, 1: 1728.5. Samples: 17464378. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:32:41,053][59242] Avg episode reward: [(0, '29.590'), (1, '28.000')] +[2023-10-09 05:32:41,264][60144] Updated weights for policy 1, policy_version 34282 (0.0008) +[2023-10-09 05:32:41,633][60144] Updated weights for policy 1, policy_version 34292 (0.0009) +[2023-10-09 05:32:42,007][60144] Updated weights for policy 1, policy_version 34302 (0.0009) +[2023-10-09 05:32:44,426][60143] Updated weights for policy 0, policy_version 33922 (0.0009) +[2023-10-09 05:32:44,802][60143] Updated weights for policy 0, policy_version 33932 (0.0008) +[2023-10-09 05:32:45,173][60143] Updated weights for policy 0, policy_version 33942 (0.0009) +[2023-10-09 05:32:45,535][60143] Updated weights for policy 0, policy_version 33952 (0.0009) +[2023-10-09 05:32:45,795][60144] Updated weights for policy 1, policy_version 34312 (0.0010) +[2023-10-09 05:32:46,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 69894144. 
Throughput: 0: 1704.5, 1: 1730.4. Samples: 17484518. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:32:46,053][59242] Avg episode reward: [(0, '29.240'), (1, '26.780')] +[2023-10-09 05:32:46,169][60144] Updated weights for policy 1, policy_version 34322 (0.0009) +[2023-10-09 05:32:46,529][60144] Updated weights for policy 1, policy_version 34332 (0.0010) +[2023-10-09 05:32:49,553][60143] Updated weights for policy 0, policy_version 33962 (0.0008) +[2023-10-09 05:32:49,929][60143] Updated weights for policy 0, policy_version 33972 (0.0009) +[2023-10-09 05:32:50,309][60143] Updated weights for policy 0, policy_version 33982 (0.0009) +[2023-10-09 05:32:50,444][60144] Updated weights for policy 1, policy_version 34342 (0.0009) +[2023-10-09 05:32:50,806][60144] Updated weights for policy 1, policy_version 34352 (0.0008) +[2023-10-09 05:32:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 69959680. Throughput: 0: 1730.7, 1: 1718.4. Samples: 17495136. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:32:51,053][59242] Avg episode reward: [(0, '28.860'), (1, '26.990')] +[2023-10-09 05:32:51,185][60144] Updated weights for policy 1, policy_version 34362 (0.0007) +[2023-10-09 05:32:54,447][60143] Updated weights for policy 0, policy_version 33992 (0.0007) +[2023-10-09 05:32:54,827][60143] Updated weights for policy 0, policy_version 34002 (0.0008) +[2023-10-09 05:32:55,158][60144] Updated weights for policy 1, policy_version 34372 (0.0008) +[2023-10-09 05:32:55,196][60143] Updated weights for policy 0, policy_version 34012 (0.0007) +[2023-10-09 05:32:55,527][60144] Updated weights for policy 1, policy_version 34382 (0.0010) +[2023-10-09 05:32:55,896][60144] Updated weights for policy 1, policy_version 34392 (0.0008) +[2023-10-09 05:32:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 70025216. Throughput: 0: 1715.4, 1: 1737.2. Samples: 17515700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:32:56,053][59242] Avg episode reward: [(0, '28.230'), (1, '26.900')] +[2023-10-09 05:32:59,186][60143] Updated weights for policy 0, policy_version 34022 (0.0008) +[2023-10-09 05:32:59,553][60143] Updated weights for policy 0, policy_version 34032 (0.0008) +[2023-10-09 05:32:59,713][60144] Updated weights for policy 1, policy_version 34402 (0.0008) +[2023-10-09 05:32:59,920][60143] Updated weights for policy 0, policy_version 34042 (0.0007) +[2023-10-09 05:33:00,085][60144] Updated weights for policy 1, policy_version 34412 (0.0007) +[2023-10-09 05:33:00,455][60144] Updated weights for policy 1, policy_version 34422 (0.0007) +[2023-10-09 05:33:00,819][60144] Updated weights for policy 1, policy_version 34432 (0.0008) +[2023-10-09 05:33:01,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70123520. Throughput: 0: 1692.0, 1: 1712.9. Samples: 17535070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:01,053][59242] Avg episode reward: [(0, '26.900'), (1, '27.600')] +[2023-10-09 05:33:03,982][60143] Updated weights for policy 0, policy_version 34052 (0.0009) +[2023-10-09 05:33:04,349][60143] Updated weights for policy 0, policy_version 34062 (0.0010) +[2023-10-09 05:33:04,714][60143] Updated weights for policy 0, policy_version 34072 (0.0009) +[2023-10-09 05:33:04,976][60144] Updated weights for policy 1, policy_version 34442 (0.0007) +[2023-10-09 05:33:05,348][60144] Updated weights for policy 1, policy_version 34452 (0.0009) +[2023-10-09 05:33:05,722][60144] Updated weights for policy 1, policy_version 34462 (0.0010) +[2023-10-09 05:33:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70189056. Throughput: 0: 1717.7, 1: 1737.6. Samples: 17546636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:06,053][59242] Avg episode reward: [(0, '27.620'), (1, '27.400')] +[2023-10-09 05:33:08,826][60143] Updated weights for policy 0, policy_version 34082 (0.0009) +[2023-10-09 05:33:09,203][60143] Updated weights for policy 0, policy_version 34092 (0.0010) +[2023-10-09 05:33:09,566][60143] Updated weights for policy 0, policy_version 34102 (0.0008) +[2023-10-09 05:33:09,638][60144] Updated weights for policy 1, policy_version 34472 (0.0007) +[2023-10-09 05:33:09,926][60143] Updated weights for policy 0, policy_version 34112 (0.0008) +[2023-10-09 05:33:10,004][60144] Updated weights for policy 1, policy_version 34482 (0.0007) +[2023-10-09 05:33:10,382][60144] Updated weights for policy 1, policy_version 34492 (0.0008) +[2023-10-09 05:33:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70254592. Throughput: 0: 1694.6, 1: 1730.1. Samples: 17566674. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:11,053][59242] Avg episode reward: [(0, '28.350'), (1, '27.480')] +[2023-10-09 05:33:13,943][60143] Updated weights for policy 0, policy_version 34122 (0.0009) +[2023-10-09 05:33:14,159][60144] Updated weights for policy 1, policy_version 34502 (0.0008) +[2023-10-09 05:33:14,309][60143] Updated weights for policy 0, policy_version 34132 (0.0008) +[2023-10-09 05:33:14,526][60144] Updated weights for policy 1, policy_version 34512 (0.0008) +[2023-10-09 05:33:14,675][60143] Updated weights for policy 0, policy_version 34142 (0.0007) +[2023-10-09 05:33:14,899][60144] Updated weights for policy 1, policy_version 34522 (0.0008) +[2023-10-09 05:33:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70320128. Throughput: 0: 1688.4, 1: 1709.2. Samples: 17586406. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:16,053][59242] Avg episode reward: [(0, '28.610'), (1, '27.300')] +[2023-10-09 05:33:18,758][60143] Updated weights for policy 0, policy_version 34152 (0.0007) +[2023-10-09 05:33:18,893][60144] Updated weights for policy 1, policy_version 34532 (0.0007) +[2023-10-09 05:33:19,130][60143] Updated weights for policy 0, policy_version 34162 (0.0007) +[2023-10-09 05:33:19,255][60144] Updated weights for policy 1, policy_version 34542 (0.0007) +[2023-10-09 05:33:19,511][60143] Updated weights for policy 0, policy_version 34172 (0.0009) +[2023-10-09 05:33:19,618][60144] Updated weights for policy 1, policy_version 34552 (0.0007) +[2023-10-09 05:33:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70385664. Throughput: 0: 1706.2, 1: 1740.0. Samples: 17598376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:21,053][59242] Avg episode reward: [(0, '28.020'), (1, '27.830')] +[2023-10-09 05:33:23,500][60143] Updated weights for policy 0, policy_version 34182 (0.0008) +[2023-10-09 05:33:23,501][60144] Updated weights for policy 1, policy_version 34562 (0.0009) +[2023-10-09 05:33:23,863][60143] Updated weights for policy 0, policy_version 34192 (0.0010) +[2023-10-09 05:33:23,871][60144] Updated weights for policy 1, policy_version 34572 (0.0008) +[2023-10-09 05:33:24,233][60144] Updated weights for policy 1, policy_version 34582 (0.0007) +[2023-10-09 05:33:24,244][60143] Updated weights for policy 0, policy_version 34202 (0.0008) +[2023-10-09 05:33:24,600][60144] Updated weights for policy 1, policy_version 34592 (0.0007) +[2023-10-09 05:33:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 70451200. Throughput: 0: 1673.9, 1: 1720.5. Samples: 17617128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:26,054][59242] Avg episode reward: [(0, '28.430'), (1, '27.610')] +[2023-10-09 05:33:28,169][60143] Updated weights for policy 0, policy_version 34212 (0.0009) +[2023-10-09 05:33:28,536][60143] Updated weights for policy 0, policy_version 34222 (0.0010) +[2023-10-09 05:33:28,563][60144] Updated weights for policy 1, policy_version 34602 (0.0008) +[2023-10-09 05:33:28,895][60143] Updated weights for policy 0, policy_version 34232 (0.0007) +[2023-10-09 05:33:28,925][60144] Updated weights for policy 1, policy_version 34612 (0.0009) +[2023-10-09 05:33:29,288][60144] Updated weights for policy 1, policy_version 34622 (0.0008) +[2023-10-09 05:33:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 70516736. Throughput: 0: 1699.5, 1: 1714.8. Samples: 17638162. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:31,053][59242] Avg episode reward: [(0, '29.260'), (1, '27.030')] +[2023-10-09 05:33:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000034624_35454976.pth... +[2023-10-09 05:33:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000034240_35061760.pth... 
+[2023-10-09 05:33:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000032640_33423360.pth +[2023-10-09 05:33:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000033024_33816576.pth +[2023-10-09 05:33:32,826][60143] Updated weights for policy 0, policy_version 34242 (0.0009) +[2023-10-09 05:33:33,206][60143] Updated weights for policy 0, policy_version 34252 (0.0007) +[2023-10-09 05:33:33,250][60144] Updated weights for policy 1, policy_version 34632 (0.0008) +[2023-10-09 05:33:33,571][60143] Updated weights for policy 0, policy_version 34262 (0.0009) +[2023-10-09 05:33:33,617][60144] Updated weights for policy 1, policy_version 34642 (0.0009) +[2023-10-09 05:33:33,940][60143] Updated weights for policy 0, policy_version 34272 (0.0007) +[2023-10-09 05:33:33,980][60144] Updated weights for policy 1, policy_version 34652 (0.0009) +[2023-10-09 05:33:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 70582272. Throughput: 0: 1683.4, 1: 1724.3. Samples: 17648484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:36,053][59242] Avg episode reward: [(0, '29.170'), (1, '27.730')] +[2023-10-09 05:33:37,913][60144] Updated weights for policy 1, policy_version 34662 (0.0008) +[2023-10-09 05:33:38,129][60143] Updated weights for policy 0, policy_version 34282 (0.0008) +[2023-10-09 05:33:38,283][60144] Updated weights for policy 1, policy_version 34672 (0.0009) +[2023-10-09 05:33:38,491][60143] Updated weights for policy 0, policy_version 34292 (0.0009) +[2023-10-09 05:33:38,649][60144] Updated weights for policy 1, policy_version 34682 (0.0007) +[2023-10-09 05:33:38,861][60143] Updated weights for policy 0, policy_version 34302 (0.0008) +[2023-10-09 05:33:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 70647808. Throughput: 0: 1681.0, 1: 1710.1. Samples: 17668302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:41,053][59242] Avg episode reward: [(0, '28.840'), (1, '27.440')] +[2023-10-09 05:33:42,686][60144] Updated weights for policy 1, policy_version 34692 (0.0008) +[2023-10-09 05:33:42,931][60143] Updated weights for policy 0, policy_version 34312 (0.0009) +[2023-10-09 05:33:43,054][60144] Updated weights for policy 1, policy_version 34702 (0.0007) +[2023-10-09 05:33:43,288][60143] Updated weights for policy 0, policy_version 34322 (0.0009) +[2023-10-09 05:33:43,426][60144] Updated weights for policy 1, policy_version 34712 (0.0008) +[2023-10-09 05:33:43,663][60143] Updated weights for policy 0, policy_version 34332 (0.0007) +[2023-10-09 05:33:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 70713344. Throughput: 0: 1699.8, 1: 1732.1. Samples: 17689504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:46,053][59242] Avg episode reward: [(0, '27.250'), (1, '27.320')] +[2023-10-09 05:33:47,457][60144] Updated weights for policy 1, policy_version 34722 (0.0007) +[2023-10-09 05:33:47,655][60143] Updated weights for policy 0, policy_version 34342 (0.0009) +[2023-10-09 05:33:47,826][60144] Updated weights for policy 1, policy_version 34732 (0.0009) +[2023-10-09 05:33:48,014][60143] Updated weights for policy 0, policy_version 34352 (0.0009) +[2023-10-09 05:33:48,190][60144] Updated weights for policy 1, policy_version 34742 (0.0007) +[2023-10-09 05:33:48,386][60143] Updated weights for policy 0, policy_version 34362 (0.0009) +[2023-10-09 05:33:48,563][60144] Updated weights for policy 1, policy_version 34752 (0.0008) +[2023-10-09 05:33:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 70778880. Throughput: 0: 1673.0, 1: 1709.2. Samples: 17698836. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:51,053][59242] Avg episode reward: [(0, '28.690'), (1, '26.130')] +[2023-10-09 05:33:52,512][60143] Updated weights for policy 0, policy_version 34372 (0.0009) +[2023-10-09 05:33:52,764][60144] Updated weights for policy 1, policy_version 34762 (0.0008) +[2023-10-09 05:33:52,877][60143] Updated weights for policy 0, policy_version 34382 (0.0007) +[2023-10-09 05:33:53,127][60144] Updated weights for policy 1, policy_version 34772 (0.0007) +[2023-10-09 05:33:53,244][60143] Updated weights for policy 0, policy_version 34392 (0.0007) +[2023-10-09 05:33:53,500][60144] Updated weights for policy 1, policy_version 34782 (0.0007) +[2023-10-09 05:33:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 70844416. Throughput: 0: 1684.2, 1: 1709.6. Samples: 17719394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:33:56,053][59242] Avg episode reward: [(0, '29.240'), (1, '26.060')] +[2023-10-09 05:33:57,109][60143] Updated weights for policy 0, policy_version 34402 (0.0009) +[2023-10-09 05:33:57,420][60144] Updated weights for policy 1, policy_version 34792 (0.0008) +[2023-10-09 05:33:57,484][60143] Updated weights for policy 0, policy_version 34412 (0.0008) +[2023-10-09 05:33:57,788][60144] Updated weights for policy 1, policy_version 34802 (0.0008) +[2023-10-09 05:33:57,857][60143] Updated weights for policy 0, policy_version 34422 (0.0008) +[2023-10-09 05:33:58,155][60144] Updated weights for policy 1, policy_version 34812 (0.0010) +[2023-10-09 05:33:58,227][60143] Updated weights for policy 0, policy_version 34432 (0.0007) +[2023-10-09 05:34:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 70909952. Throughput: 0: 1695.1, 1: 1731.8. Samples: 17740614. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:01,053][59242] Avg episode reward: [(0, '28.440'), (1, '27.150')] +[2023-10-09 05:34:02,032][60144] Updated weights for policy 1, policy_version 34822 (0.0008) +[2023-10-09 05:34:02,253][60143] Updated weights for policy 0, policy_version 34442 (0.0007) +[2023-10-09 05:34:02,395][60144] Updated weights for policy 1, policy_version 34832 (0.0008) +[2023-10-09 05:34:02,614][60143] Updated weights for policy 0, policy_version 34452 (0.0008) +[2023-10-09 05:34:02,761][60144] Updated weights for policy 1, policy_version 34842 (0.0008) +[2023-10-09 05:34:02,985][60143] Updated weights for policy 0, policy_version 34462 (0.0009) +[2023-10-09 05:34:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 70975488. Throughput: 0: 1664.1, 1: 1700.9. Samples: 17749802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:06,053][59242] Avg episode reward: [(0, '27.480'), (1, '28.110')] +[2023-10-09 05:34:06,648][60144] Updated weights for policy 1, policy_version 34852 (0.0008) +[2023-10-09 05:34:07,021][60144] Updated weights for policy 1, policy_version 34862 (0.0008) +[2023-10-09 05:34:07,288][60143] Updated weights for policy 0, policy_version 34472 (0.0010) +[2023-10-09 05:34:07,382][60144] Updated weights for policy 1, policy_version 34872 (0.0007) +[2023-10-09 05:34:07,649][60143] Updated weights for policy 0, policy_version 34482 (0.0008) +[2023-10-09 05:34:08,016][60143] Updated weights for policy 0, policy_version 34492 (0.0007) +[2023-10-09 05:34:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 71041024. Throughput: 0: 1694.8, 1: 1727.5. Samples: 17771128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:11,053][59242] Avg episode reward: [(0, '27.490'), (1, '27.610')] +[2023-10-09 05:34:11,345][60144] Updated weights for policy 1, policy_version 34882 (0.0007) +[2023-10-09 05:34:11,715][60144] Updated weights for policy 1, policy_version 34892 (0.0007) +[2023-10-09 05:34:12,033][60143] Updated weights for policy 0, policy_version 34502 (0.0008) +[2023-10-09 05:34:12,082][60144] Updated weights for policy 1, policy_version 34902 (0.0009) +[2023-10-09 05:34:12,389][60143] Updated weights for policy 0, policy_version 34512 (0.0008) +[2023-10-09 05:34:12,448][60144] Updated weights for policy 1, policy_version 34912 (0.0007) +[2023-10-09 05:34:12,758][60143] Updated weights for policy 0, policy_version 34522 (0.0009) +[2023-10-09 05:34:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 71106560. Throughput: 0: 1694.8, 1: 1733.4. Samples: 17792430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:16,053][59242] Avg episode reward: [(0, '27.190'), (1, '27.880')] +[2023-10-09 05:34:16,319][60144] Updated weights for policy 1, policy_version 34922 (0.0009) +[2023-10-09 05:34:16,622][60143] Updated weights for policy 0, policy_version 34532 (0.0008) +[2023-10-09 05:34:16,685][60144] Updated weights for policy 1, policy_version 34932 (0.0007) +[2023-10-09 05:34:16,997][60143] Updated weights for policy 0, policy_version 34542 (0.0007) +[2023-10-09 05:34:17,059][60144] Updated weights for policy 1, policy_version 34942 (0.0007) +[2023-10-09 05:34:17,361][60143] Updated weights for policy 0, policy_version 34552 (0.0007) +[2023-10-09 05:34:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 71172096. Throughput: 0: 1681.0, 1: 1719.6. Samples: 17801512. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:21,053][59242] Avg episode reward: [(0, '28.570'), (1, '26.610')] +[2023-10-09 05:34:21,093][60144] Updated weights for policy 1, policy_version 34952 (0.0010) +[2023-10-09 05:34:21,465][60144] Updated weights for policy 1, policy_version 34962 (0.0008) +[2023-10-09 05:34:21,478][60143] Updated weights for policy 0, policy_version 34562 (0.0008) +[2023-10-09 05:34:21,829][60144] Updated weights for policy 1, policy_version 34972 (0.0007) +[2023-10-09 05:34:21,848][60143] Updated weights for policy 0, policy_version 34572 (0.0007) +[2023-10-09 05:34:22,223][60143] Updated weights for policy 0, policy_version 34582 (0.0009) +[2023-10-09 05:34:22,586][60143] Updated weights for policy 0, policy_version 34592 (0.0007) +[2023-10-09 05:34:25,674][60144] Updated weights for policy 1, policy_version 34982 (0.0010) +[2023-10-09 05:34:26,044][60144] Updated weights for policy 1, policy_version 34992 (0.0008) +[2023-10-09 05:34:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 71237632. Throughput: 0: 1699.6, 1: 1732.8. Samples: 17822760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:26,053][59242] Avg episode reward: [(0, '28.490'), (1, '25.240')] +[2023-10-09 05:34:26,409][60144] Updated weights for policy 1, policy_version 35002 (0.0007) +[2023-10-09 05:34:26,532][60143] Updated weights for policy 0, policy_version 34602 (0.0009) +[2023-10-09 05:34:26,901][60143] Updated weights for policy 0, policy_version 34612 (0.0007) +[2023-10-09 05:34:27,267][60143] Updated weights for policy 0, policy_version 34622 (0.0007) +[2023-10-09 05:34:30,190][60144] Updated weights for policy 1, policy_version 35012 (0.0009) +[2023-10-09 05:34:30,558][60144] Updated weights for policy 1, policy_version 35022 (0.0010) +[2023-10-09 05:34:30,926][60144] Updated weights for policy 1, policy_version 35032 (0.0008) +[2023-10-09 05:34:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 71303168. Throughput: 0: 1703.3, 1: 1721.4. Samples: 17843614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:31,053][59242] Avg episode reward: [(0, '28.870'), (1, '24.910')] +[2023-10-09 05:34:31,289][60143] Updated weights for policy 0, policy_version 34632 (0.0009) +[2023-10-09 05:34:31,658][60143] Updated weights for policy 0, policy_version 34642 (0.0008) +[2023-10-09 05:34:32,023][60143] Updated weights for policy 0, policy_version 34652 (0.0008) +[2023-10-09 05:34:34,876][60144] Updated weights for policy 1, policy_version 35042 (0.0008) +[2023-10-09 05:34:35,247][60144] Updated weights for policy 1, policy_version 35052 (0.0010) +[2023-10-09 05:34:35,611][60144] Updated weights for policy 1, policy_version 35062 (0.0009) +[2023-10-09 05:34:35,970][60144] Updated weights for policy 1, policy_version 35072 (0.0007) +[2023-10-09 05:34:36,004][60143] Updated weights for policy 0, policy_version 34662 (0.0008) +[2023-10-09 05:34:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 71401472. 
Throughput: 0: 1697.8, 1: 1737.9. Samples: 17853442. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:36,053][59242] Avg episode reward: [(0, '29.370'), (1, '26.640')] +[2023-10-09 05:34:36,377][60143] Updated weights for policy 0, policy_version 34672 (0.0007) +[2023-10-09 05:34:36,745][60143] Updated weights for policy 0, policy_version 34682 (0.0008) +[2023-10-09 05:34:39,936][60144] Updated weights for policy 1, policy_version 35082 (0.0009) +[2023-10-09 05:34:40,301][60144] Updated weights for policy 1, policy_version 35092 (0.0008) +[2023-10-09 05:34:40,668][60144] Updated weights for policy 1, policy_version 35102 (0.0010) +[2023-10-09 05:34:40,803][60143] Updated weights for policy 0, policy_version 34692 (0.0007) +[2023-10-09 05:34:41,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 71467008. Throughput: 0: 1702.1, 1: 1745.5. Samples: 17874538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:41,053][59242] Avg episode reward: [(0, '29.900'), (1, '25.570')] +[2023-10-09 05:34:41,169][60143] Updated weights for policy 0, policy_version 34702 (0.0007) +[2023-10-09 05:34:41,540][60143] Updated weights for policy 0, policy_version 34712 (0.0008) +[2023-10-09 05:34:44,709][60144] Updated weights for policy 1, policy_version 35112 (0.0007) +[2023-10-09 05:34:45,078][60144] Updated weights for policy 1, policy_version 35122 (0.0007) +[2023-10-09 05:34:45,442][60144] Updated weights for policy 1, policy_version 35132 (0.0008) +[2023-10-09 05:34:45,509][60143] Updated weights for policy 0, policy_version 34722 (0.0007) +[2023-10-09 05:34:45,874][60143] Updated weights for policy 0, policy_version 34732 (0.0007) +[2023-10-09 05:34:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 71532544. Throughput: 0: 1704.9, 1: 1711.1. Samples: 17894334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:46,053][59242] Avg episode reward: [(0, '30.870'), (1, '25.110')] +[2023-10-09 05:34:46,239][60143] Updated weights for policy 0, policy_version 34742 (0.0009) +[2023-10-09 05:34:46,613][60143] Updated weights for policy 0, policy_version 34752 (0.0009) +[2023-10-09 05:34:49,399][60144] Updated weights for policy 1, policy_version 35142 (0.0007) +[2023-10-09 05:34:49,767][60144] Updated weights for policy 1, policy_version 35152 (0.0008) +[2023-10-09 05:34:50,147][60144] Updated weights for policy 1, policy_version 35162 (0.0007) +[2023-10-09 05:34:50,612][60143] Updated weights for policy 0, policy_version 34762 (0.0008) +[2023-10-09 05:34:50,987][60143] Updated weights for policy 0, policy_version 34772 (0.0009) +[2023-10-09 05:34:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 71598080. Throughput: 0: 1710.3, 1: 1739.9. Samples: 17905058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:51,053][59242] Avg episode reward: [(0, '30.560'), (1, '25.840')] +[2023-10-09 05:34:51,349][60143] Updated weights for policy 0, policy_version 34782 (0.0007) +[2023-10-09 05:34:54,024][60144] Updated weights for policy 1, policy_version 35172 (0.0008) +[2023-10-09 05:34:54,387][60144] Updated weights for policy 1, policy_version 35182 (0.0007) +[2023-10-09 05:34:54,762][60144] Updated weights for policy 1, policy_version 35192 (0.0008) +[2023-10-09 05:34:55,400][60143] Updated weights for policy 0, policy_version 34792 (0.0011) +[2023-10-09 05:34:55,764][60143] Updated weights for policy 0, policy_version 34802 (0.0011) +[2023-10-09 05:34:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 71663616. Throughput: 0: 1710.1, 1: 1722.5. Samples: 17925592. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:34:56,053][59242] Avg episode reward: [(0, '31.070'), (1, '25.380')] +[2023-10-09 05:34:56,126][60143] Updated weights for policy 0, policy_version 34812 (0.0008) +[2023-10-09 05:34:58,604][60144] Updated weights for policy 1, policy_version 35202 (0.0009) +[2023-10-09 05:34:58,975][60144] Updated weights for policy 1, policy_version 35212 (0.0007) +[2023-10-09 05:34:59,339][60144] Updated weights for policy 1, policy_version 35222 (0.0007) +[2023-10-09 05:34:59,706][60144] Updated weights for policy 1, policy_version 35232 (0.0007) +[2023-10-09 05:35:00,231][60143] Updated weights for policy 0, policy_version 34822 (0.0009) +[2023-10-09 05:35:00,606][60143] Updated weights for policy 0, policy_version 34832 (0.0010) +[2023-10-09 05:35:00,969][60143] Updated weights for policy 0, policy_version 34842 (0.0010) +[2023-10-09 05:35:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 71729152. Throughput: 0: 1700.3, 1: 1714.4. Samples: 17946092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-09 05:35:01,053][59242] Avg episode reward: [(0, '30.760'), (1, '24.820')] +[2023-10-09 05:35:03,522][60144] Updated weights for policy 1, policy_version 35242 (0.0007) +[2023-10-09 05:35:03,894][60144] Updated weights for policy 1, policy_version 35252 (0.0010) +[2023-10-09 05:35:04,271][60144] Updated weights for policy 1, policy_version 35262 (0.0009) +[2023-10-09 05:35:04,961][60143] Updated weights for policy 0, policy_version 34852 (0.0010) +[2023-10-09 05:35:05,325][60143] Updated weights for policy 0, policy_version 34862 (0.0010) +[2023-10-09 05:35:05,693][60143] Updated weights for policy 0, policy_version 34872 (0.0010) +[2023-10-09 05:35:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 71827456. Throughput: 0: 1713.7, 1: 1735.3. Samples: 17956720. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-09 05:35:06,053][59242] Avg episode reward: [(0, '30.520'), (1, '25.730')] +[2023-10-09 05:35:08,190][60144] Updated weights for policy 1, policy_version 35272 (0.0009) +[2023-10-09 05:35:08,565][60144] Updated weights for policy 1, policy_version 35282 (0.0008) +[2023-10-09 05:35:08,933][60144] Updated weights for policy 1, policy_version 35292 (0.0010) +[2023-10-09 05:35:09,615][60143] Updated weights for policy 0, policy_version 34882 (0.0009) +[2023-10-09 05:35:09,982][60143] Updated weights for policy 0, policy_version 34892 (0.0011) +[2023-10-09 05:35:10,358][60143] Updated weights for policy 0, policy_version 34902 (0.0007) +[2023-10-09 05:35:10,718][60143] Updated weights for policy 0, policy_version 34912 (0.0007) +[2023-10-09 05:35:11,052][59242] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 71892992. Throughput: 0: 1710.1, 1: 1716.4. Samples: 17976948. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-09 05:35:11,053][59242] Avg episode reward: [(0, '30.550'), (1, '26.410')] +[2023-10-09 05:35:13,008][60144] Updated weights for policy 1, policy_version 35302 (0.0009) +[2023-10-09 05:35:13,377][60144] Updated weights for policy 1, policy_version 35312 (0.0007) +[2023-10-09 05:35:13,748][60144] Updated weights for policy 1, policy_version 35322 (0.0007) +[2023-10-09 05:35:14,699][60143] Updated weights for policy 0, policy_version 34922 (0.0011) +[2023-10-09 05:35:15,079][60143] Updated weights for policy 0, policy_version 34932 (0.0009) +[2023-10-09 05:35:15,444][60143] Updated weights for policy 0, policy_version 34942 (0.0008) +[2023-10-09 05:35:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 71958528. Throughput: 0: 1678.4, 1: 1729.7. Samples: 17996982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-09 05:35:16,053][59242] Avg episode reward: [(0, '29.990'), (1, '24.490')] +[2023-10-09 05:35:17,560][60144] Updated weights for policy 1, policy_version 35332 (0.0007) +[2023-10-09 05:35:17,928][60144] Updated weights for policy 1, policy_version 35342 (0.0007) +[2023-10-09 05:35:18,294][60144] Updated weights for policy 1, policy_version 35352 (0.0008) +[2023-10-09 05:35:19,474][60143] Updated weights for policy 0, policy_version 34952 (0.0008) +[2023-10-09 05:35:19,857][60143] Updated weights for policy 0, policy_version 34962 (0.0011) +[2023-10-09 05:35:20,224][60143] Updated weights for policy 0, policy_version 34972 (0.0009) +[2023-10-09 05:35:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 72024064. Throughput: 0: 1714.3, 1: 1714.9. Samples: 18007754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:35:21,052][59242] Avg episode reward: [(0, '28.330'), (1, '24.980')] +[2023-10-09 05:35:22,219][60144] Updated weights for policy 1, policy_version 35362 (0.0008) +[2023-10-09 05:35:22,581][60144] Updated weights for policy 1, policy_version 35372 (0.0007) +[2023-10-09 05:35:22,952][60144] Updated weights for policy 1, policy_version 35382 (0.0007) +[2023-10-09 05:35:23,323][60144] Updated weights for policy 1, policy_version 35392 (0.0007) +[2023-10-09 05:35:24,208][60143] Updated weights for policy 0, policy_version 34982 (0.0008) +[2023-10-09 05:35:24,574][60143] Updated weights for policy 0, policy_version 34992 (0.0007) +[2023-10-09 05:35:24,939][60143] Updated weights for policy 0, policy_version 35002 (0.0011) +[2023-10-09 05:35:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 72089600. Throughput: 0: 1700.2, 1: 1714.4. Samples: 18028192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:35:26,053][59242] Avg episode reward: [(0, '28.080'), (1, '25.470')] +[2023-10-09 05:35:27,339][60144] Updated weights for policy 1, policy_version 35402 (0.0007) +[2023-10-09 05:35:27,708][60144] Updated weights for policy 1, policy_version 35412 (0.0007) +[2023-10-09 05:35:28,071][60144] Updated weights for policy 1, policy_version 35422 (0.0008) +[2023-10-09 05:35:28,945][60143] Updated weights for policy 0, policy_version 35012 (0.0010) +[2023-10-09 05:35:29,319][60143] Updated weights for policy 0, policy_version 35022 (0.0009) +[2023-10-09 05:35:29,684][60143] Updated weights for policy 0, policy_version 35032 (0.0008) +[2023-10-09 05:35:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 72155136. Throughput: 0: 1677.7, 1: 1750.0. Samples: 18048582. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:35:31,053][59242] Avg episode reward: [(0, '28.460'), (1, '25.350')] +[2023-10-09 05:35:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000035424_36274176.pth... +[2023-10-09 05:35:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000035040_35880960.pth... 
+[2023-10-09 05:35:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000033824_34635776.pth +[2023-10-09 05:35:31,105][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000033440_34242560.pth +[2023-10-09 05:35:31,924][60144] Updated weights for policy 1, policy_version 35432 (0.0009) +[2023-10-09 05:35:32,283][60144] Updated weights for policy 1, policy_version 35442 (0.0010) +[2023-10-09 05:35:32,648][60144] Updated weights for policy 1, policy_version 35452 (0.0010) +[2023-10-09 05:35:33,662][60143] Updated weights for policy 0, policy_version 35042 (0.0007) +[2023-10-09 05:35:34,040][60143] Updated weights for policy 0, policy_version 35052 (0.0009) +[2023-10-09 05:35:34,410][60143] Updated weights for policy 0, policy_version 35062 (0.0009) +[2023-10-09 05:35:34,769][60143] Updated weights for policy 0, policy_version 35072 (0.0008) +[2023-10-09 05:35:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 72220672. Throughput: 0: 1710.4, 1: 1715.7. Samples: 18059230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:35:36,053][59242] Avg episode reward: [(0, '28.760'), (1, '25.350')] +[2023-10-09 05:35:36,750][60144] Updated weights for policy 1, policy_version 35462 (0.0009) +[2023-10-09 05:35:37,119][60144] Updated weights for policy 1, policy_version 35472 (0.0008) +[2023-10-09 05:35:37,489][60144] Updated weights for policy 1, policy_version 35482 (0.0008) +[2023-10-09 05:35:38,683][60143] Updated weights for policy 0, policy_version 35082 (0.0010) +[2023-10-09 05:35:39,059][60143] Updated weights for policy 0, policy_version 35092 (0.0010) +[2023-10-09 05:35:39,427][60143] Updated weights for policy 0, policy_version 35102 (0.0011) +[2023-10-09 05:35:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 72286208. Throughput: 0: 1688.0, 1: 1728.8. Samples: 18079348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:35:41,052][59242] Avg episode reward: [(0, '29.300'), (1, '26.490')] +[2023-10-09 05:35:41,560][60144] Updated weights for policy 1, policy_version 35492 (0.0010) +[2023-10-09 05:35:41,918][60144] Updated weights for policy 1, policy_version 35502 (0.0008) +[2023-10-09 05:35:42,290][60144] Updated weights for policy 1, policy_version 35512 (0.0008) +[2023-10-09 05:35:43,425][60143] Updated weights for policy 0, policy_version 35112 (0.0008) +[2023-10-09 05:35:43,788][60143] Updated weights for policy 0, policy_version 35122 (0.0008) +[2023-10-09 05:35:44,162][60143] Updated weights for policy 0, policy_version 35132 (0.0010) +[2023-10-09 05:35:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 72351744. Throughput: 0: 1695.7, 1: 1733.9. Samples: 18100424. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:35:46,053][59242] Avg episode reward: [(0, '28.370'), (1, '27.330')] +[2023-10-09 05:35:46,368][60144] Updated weights for policy 1, policy_version 35522 (0.0009) +[2023-10-09 05:35:46,731][60144] Updated weights for policy 1, policy_version 35532 (0.0009) +[2023-10-09 05:35:47,099][60144] Updated weights for policy 1, policy_version 35542 (0.0007) +[2023-10-09 05:35:47,470][60144] Updated weights for policy 1, policy_version 35552 (0.0008) +[2023-10-09 05:35:48,193][60143] Updated weights for policy 0, policy_version 35142 (0.0008) +[2023-10-09 05:35:48,564][60143] Updated weights for policy 0, policy_version 35152 (0.0008) +[2023-10-09 05:35:48,938][60143] Updated weights for policy 0, policy_version 35162 (0.0009) +[2023-10-09 05:35:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 72417280. Throughput: 0: 1703.4, 1: 1711.1. Samples: 18110374. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:35:51,053][59242] Avg episode reward: [(0, '28.560'), (1, '27.030')] +[2023-10-09 05:35:51,481][60144] Updated weights for policy 1, policy_version 35562 (0.0007) +[2023-10-09 05:35:51,854][60144] Updated weights for policy 1, policy_version 35572 (0.0010) +[2023-10-09 05:35:52,216][60144] Updated weights for policy 1, policy_version 35582 (0.0011) +[2023-10-09 05:35:53,000][60143] Updated weights for policy 0, policy_version 35172 (0.0008) +[2023-10-09 05:35:53,361][60143] Updated weights for policy 0, policy_version 35182 (0.0007) +[2023-10-09 05:35:53,734][60143] Updated weights for policy 0, policy_version 35192 (0.0008) +[2023-10-09 05:35:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 72482816. Throughput: 0: 1687.8, 1: 1731.8. Samples: 18130832. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:35:56,052][59242] Avg episode reward: [(0, '30.420'), (1, '26.430')] +[2023-10-09 05:35:56,206][60144] Updated weights for policy 1, policy_version 35592 (0.0008) +[2023-10-09 05:35:56,573][60144] Updated weights for policy 1, policy_version 35602 (0.0008) +[2023-10-09 05:35:56,937][60144] Updated weights for policy 1, policy_version 35612 (0.0007) +[2023-10-09 05:35:57,795][60143] Updated weights for policy 0, policy_version 35202 (0.0008) +[2023-10-09 05:35:58,158][60143] Updated weights for policy 0, policy_version 35212 (0.0011) +[2023-10-09 05:35:58,527][60143] Updated weights for policy 0, policy_version 35222 (0.0011) +[2023-10-09 05:35:58,894][60143] Updated weights for policy 0, policy_version 35232 (0.0011) +[2023-10-09 05:36:00,725][60144] Updated weights for policy 1, policy_version 35622 (0.0009) +[2023-10-09 05:36:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 72548352. Throughput: 0: 1712.1, 1: 1726.2. Samples: 18151704. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:36:01,052][59242] Avg episode reward: [(0, '29.440'), (1, '26.730')] +[2023-10-09 05:36:01,085][60144] Updated weights for policy 1, policy_version 35632 (0.0009) +[2023-10-09 05:36:01,464][60144] Updated weights for policy 1, policy_version 35642 (0.0010) +[2023-10-09 05:36:02,872][60143] Updated weights for policy 0, policy_version 35242 (0.0008) +[2023-10-09 05:36:03,241][60143] Updated weights for policy 0, policy_version 35252 (0.0007) +[2023-10-09 05:36:03,613][60143] Updated weights for policy 0, policy_version 35262 (0.0007) +[2023-10-09 05:36:05,405][60144] Updated weights for policy 1, policy_version 35652 (0.0011) +[2023-10-09 05:36:05,774][60144] Updated weights for policy 1, policy_version 35662 (0.0009) +[2023-10-09 05:36:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 72613888. Throughput: 0: 1688.2, 1: 1726.4. Samples: 18161412. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 05:36:06,053][59242] Avg episode reward: [(0, '28.320'), (1, '27.000')] +[2023-10-09 05:36:06,142][60144] Updated weights for policy 1, policy_version 35672 (0.0008) +[2023-10-09 05:36:07,643][60143] Updated weights for policy 0, policy_version 35272 (0.0008) +[2023-10-09 05:36:08,013][60143] Updated weights for policy 0, policy_version 35282 (0.0009) +[2023-10-09 05:36:08,376][60143] Updated weights for policy 0, policy_version 35292 (0.0010) +[2023-10-09 05:36:10,043][60144] Updated weights for policy 1, policy_version 35682 (0.0009) +[2023-10-09 05:36:10,406][60144] Updated weights for policy 1, policy_version 35692 (0.0008) +[2023-10-09 05:36:10,778][60144] Updated weights for policy 1, policy_version 35702 (0.0010) +[2023-10-09 05:36:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 72679424. Throughput: 0: 1698.4, 1: 1726.6. Samples: 18182318. 
Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 05:36:11,053][59242] Avg episode reward: [(0, '27.860'), (1, '27.060')] +[2023-10-09 05:36:11,147][60144] Updated weights for policy 1, policy_version 35712 (0.0010) +[2023-10-09 05:36:12,516][60143] Updated weights for policy 0, policy_version 35302 (0.0010) +[2023-10-09 05:36:12,899][60143] Updated weights for policy 0, policy_version 35312 (0.0008) +[2023-10-09 05:36:13,272][60143] Updated weights for policy 0, policy_version 35322 (0.0009) +[2023-10-09 05:36:15,124][60144] Updated weights for policy 1, policy_version 35722 (0.0008) +[2023-10-09 05:36:15,485][60144] Updated weights for policy 1, policy_version 35732 (0.0008) +[2023-10-09 05:36:15,863][60144] Updated weights for policy 1, policy_version 35742 (0.0008) +[2023-10-09 05:36:16,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 72777728. Throughput: 0: 1717.2, 1: 1701.1. Samples: 18202406. Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 05:36:16,053][59242] Avg episode reward: [(0, '26.040'), (1, '26.620')] +[2023-10-09 05:36:17,147][60143] Updated weights for policy 0, policy_version 35332 (0.0009) +[2023-10-09 05:36:17,518][60143] Updated weights for policy 0, policy_version 35342 (0.0008) +[2023-10-09 05:36:17,887][60143] Updated weights for policy 0, policy_version 35352 (0.0007) +[2023-10-09 05:36:19,748][60144] Updated weights for policy 1, policy_version 35752 (0.0008) +[2023-10-09 05:36:20,114][60144] Updated weights for policy 1, policy_version 35762 (0.0008) +[2023-10-09 05:36:20,486][60144] Updated weights for policy 1, policy_version 35772 (0.0008) +[2023-10-09 05:36:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 72843264. Throughput: 0: 1681.4, 1: 1725.1. Samples: 18212526. 
Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 05:36:21,053][59242] Avg episode reward: [(0, '25.650'), (1, '26.800')] +[2023-10-09 05:36:21,722][60143] Updated weights for policy 0, policy_version 35362 (0.0009) +[2023-10-09 05:36:22,104][60143] Updated weights for policy 0, policy_version 35372 (0.0011) +[2023-10-09 05:36:22,462][60143] Updated weights for policy 0, policy_version 35382 (0.0010) +[2023-10-09 05:36:22,834][60143] Updated weights for policy 0, policy_version 35392 (0.0010) +[2023-10-09 05:36:24,528][60144] Updated weights for policy 1, policy_version 35782 (0.0008) +[2023-10-09 05:36:24,894][60144] Updated weights for policy 1, policy_version 35792 (0.0007) +[2023-10-09 05:36:25,266][60144] Updated weights for policy 1, policy_version 35802 (0.0009) +[2023-10-09 05:36:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 72908800. Throughput: 0: 1710.0, 1: 1715.6. Samples: 18233498. Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 05:36:26,053][59242] Avg episode reward: [(0, '26.550'), (1, '28.870')] +[2023-10-09 05:36:26,846][60143] Updated weights for policy 0, policy_version 35402 (0.0009) +[2023-10-09 05:36:27,219][60143] Updated weights for policy 0, policy_version 35412 (0.0008) +[2023-10-09 05:36:27,588][60143] Updated weights for policy 0, policy_version 35422 (0.0008) +[2023-10-09 05:36:29,246][60144] Updated weights for policy 1, policy_version 35812 (0.0008) +[2023-10-09 05:36:29,619][60144] Updated weights for policy 1, policy_version 35822 (0.0007) +[2023-10-09 05:36:29,984][60144] Updated weights for policy 1, policy_version 35832 (0.0009) +[2023-10-09 05:36:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 72974336. Throughput: 0: 1716.0, 1: 1694.5. Samples: 18253900. 
Policy #0 lag: (min: 22.0, avg: 28.6, max: 54.0) +[2023-10-09 05:36:31,053][59242] Avg episode reward: [(0, '27.090'), (1, '27.360')] +[2023-10-09 05:36:31,544][60143] Updated weights for policy 0, policy_version 35432 (0.0007) +[2023-10-09 05:36:31,907][60143] Updated weights for policy 0, policy_version 35442 (0.0010) +[2023-10-09 05:36:32,280][60143] Updated weights for policy 0, policy_version 35452 (0.0008) +[2023-10-09 05:36:33,847][60144] Updated weights for policy 1, policy_version 35842 (0.0010) +[2023-10-09 05:36:34,218][60144] Updated weights for policy 1, policy_version 35852 (0.0011) +[2023-10-09 05:36:34,573][60144] Updated weights for policy 1, policy_version 35862 (0.0008) +[2023-10-09 05:36:34,946][60144] Updated weights for policy 1, policy_version 35872 (0.0008) +[2023-10-09 05:36:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73039872. Throughput: 0: 1698.7, 1: 1731.5. Samples: 18264730. Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) +[2023-10-09 05:36:36,052][59242] Avg episode reward: [(0, '26.710'), (1, '27.020')] +[2023-10-09 05:36:36,177][60143] Updated weights for policy 0, policy_version 35462 (0.0007) +[2023-10-09 05:36:36,544][60143] Updated weights for policy 0, policy_version 35472 (0.0008) +[2023-10-09 05:36:36,913][60143] Updated weights for policy 0, policy_version 35482 (0.0010) +[2023-10-09 05:36:38,862][60144] Updated weights for policy 1, policy_version 35882 (0.0009) +[2023-10-09 05:36:39,222][60144] Updated weights for policy 1, policy_version 35892 (0.0011) +[2023-10-09 05:36:39,601][60144] Updated weights for policy 1, policy_version 35902 (0.0010) +[2023-10-09 05:36:41,012][60143] Updated weights for policy 0, policy_version 35492 (0.0009) +[2023-10-09 05:36:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73105408. Throughput: 0: 1717.7, 1: 1705.4. Samples: 18284872. 
Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) +[2023-10-09 05:36:41,053][59242] Avg episode reward: [(0, '26.100'), (1, '27.270')] +[2023-10-09 05:36:41,377][60143] Updated weights for policy 0, policy_version 35502 (0.0008) +[2023-10-09 05:36:41,746][60143] Updated weights for policy 0, policy_version 35512 (0.0009) +[2023-10-09 05:36:43,641][60144] Updated weights for policy 1, policy_version 35912 (0.0007) +[2023-10-09 05:36:44,008][60144] Updated weights for policy 1, policy_version 35922 (0.0008) +[2023-10-09 05:36:44,376][60144] Updated weights for policy 1, policy_version 35932 (0.0007) +[2023-10-09 05:36:45,810][60143] Updated weights for policy 0, policy_version 35522 (0.0007) +[2023-10-09 05:36:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 73170944. Throughput: 0: 1722.7, 1: 1704.8. Samples: 18305942. Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) +[2023-10-09 05:36:46,052][59242] Avg episode reward: [(0, '27.030'), (1, '27.230')] +[2023-10-09 05:36:46,173][60143] Updated weights for policy 0, policy_version 35532 (0.0007) +[2023-10-09 05:36:46,546][60143] Updated weights for policy 0, policy_version 35542 (0.0007) +[2023-10-09 05:36:46,914][60143] Updated weights for policy 0, policy_version 35552 (0.0008) +[2023-10-09 05:36:48,317][60144] Updated weights for policy 1, policy_version 35942 (0.0008) +[2023-10-09 05:36:48,684][60144] Updated weights for policy 1, policy_version 35952 (0.0009) +[2023-10-09 05:36:49,055][60144] Updated weights for policy 1, policy_version 35962 (0.0010) +[2023-10-09 05:36:50,759][60143] Updated weights for policy 0, policy_version 35562 (0.0007) +[2023-10-09 05:36:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73236480. Throughput: 0: 1713.8, 1: 1719.5. Samples: 18315914. 
Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) +[2023-10-09 05:36:51,053][59242] Avg episode reward: [(0, '28.090'), (1, '27.590')] +[2023-10-09 05:36:51,126][60143] Updated weights for policy 0, policy_version 35572 (0.0007) +[2023-10-09 05:36:51,492][60143] Updated weights for policy 0, policy_version 35582 (0.0007) +[2023-10-09 05:36:53,047][60144] Updated weights for policy 1, policy_version 35972 (0.0009) +[2023-10-09 05:36:53,428][60144] Updated weights for policy 1, policy_version 35982 (0.0007) +[2023-10-09 05:36:53,798][60144] Updated weights for policy 1, policy_version 35992 (0.0008) +[2023-10-09 05:36:55,543][60143] Updated weights for policy 0, policy_version 35592 (0.0008) +[2023-10-09 05:36:55,913][60143] Updated weights for policy 0, policy_version 35602 (0.0011) +[2023-10-09 05:36:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73302016. Throughput: 0: 1719.8, 1: 1700.4. Samples: 18336228. Policy #0 lag: (min: 4.0, avg: 6.9, max: 36.0) +[2023-10-09 05:36:56,053][59242] Avg episode reward: [(0, '29.070'), (1, '27.360')] +[2023-10-09 05:36:56,283][60143] Updated weights for policy 0, policy_version 35612 (0.0011) +[2023-10-09 05:36:57,580][60144] Updated weights for policy 1, policy_version 36002 (0.0009) +[2023-10-09 05:36:57,949][60144] Updated weights for policy 1, policy_version 36012 (0.0008) +[2023-10-09 05:36:58,317][60144] Updated weights for policy 1, policy_version 36022 (0.0007) +[2023-10-09 05:36:58,693][60144] Updated weights for policy 1, policy_version 36032 (0.0007) +[2023-10-09 05:37:00,290][60143] Updated weights for policy 0, policy_version 35622 (0.0010) +[2023-10-09 05:37:00,659][60143] Updated weights for policy 0, policy_version 35632 (0.0011) +[2023-10-09 05:37:01,032][60143] Updated weights for policy 0, policy_version 35642 (0.0010) +[2023-10-09 05:37:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73367552. 
Throughput: 0: 1708.3, 1: 1727.2. Samples: 18357002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:37:01,053][59242] Avg episode reward: [(0, '27.780'), (1, '26.860')] +[2023-10-09 05:37:02,714][60144] Updated weights for policy 1, policy_version 36042 (0.0007) +[2023-10-09 05:37:03,085][60144] Updated weights for policy 1, policy_version 36052 (0.0007) +[2023-10-09 05:37:03,454][60144] Updated weights for policy 1, policy_version 36062 (0.0008) +[2023-10-09 05:37:04,886][60143] Updated weights for policy 0, policy_version 35652 (0.0007) +[2023-10-09 05:37:05,258][60143] Updated weights for policy 0, policy_version 35662 (0.0008) +[2023-10-09 05:37:05,620][60143] Updated weights for policy 0, policy_version 35672 (0.0008) +[2023-10-09 05:37:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 73465856. Throughput: 0: 1724.3, 1: 1703.9. Samples: 18366792. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:37:06,053][59242] Avg episode reward: [(0, '27.940'), (1, '26.550')] +[2023-10-09 05:37:07,519][60144] Updated weights for policy 1, policy_version 36072 (0.0008) +[2023-10-09 05:37:07,881][60144] Updated weights for policy 1, policy_version 36082 (0.0007) +[2023-10-09 05:37:08,254][60144] Updated weights for policy 1, policy_version 36092 (0.0008) +[2023-10-09 05:37:09,756][60143] Updated weights for policy 0, policy_version 35682 (0.0010) +[2023-10-09 05:37:10,132][60143] Updated weights for policy 0, policy_version 35692 (0.0008) +[2023-10-09 05:37:10,502][60143] Updated weights for policy 0, policy_version 35702 (0.0008) +[2023-10-09 05:37:10,868][60143] Updated weights for policy 0, policy_version 35712 (0.0009) +[2023-10-09 05:37:11,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 73531392. Throughput: 0: 1714.7, 1: 1707.0. Samples: 18387474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:37:11,052][59242] Avg episode reward: [(0, '27.510'), (1, '27.950')] +[2023-10-09 05:37:12,283][60144] Updated weights for policy 1, policy_version 36102 (0.0009) +[2023-10-09 05:37:12,660][60144] Updated weights for policy 1, policy_version 36112 (0.0010) +[2023-10-09 05:37:13,026][60144] Updated weights for policy 1, policy_version 36122 (0.0010) +[2023-10-09 05:37:14,910][60143] Updated weights for policy 0, policy_version 35722 (0.0009) +[2023-10-09 05:37:15,290][60143] Updated weights for policy 0, policy_version 35732 (0.0009) +[2023-10-09 05:37:15,659][60143] Updated weights for policy 0, policy_version 35742 (0.0009) +[2023-10-09 05:37:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 73596928. Throughput: 0: 1686.9, 1: 1728.4. Samples: 18407588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:37:16,053][59242] Avg episode reward: [(0, '27.640'), (1, '27.610')] +[2023-10-09 05:37:16,995][60144] Updated weights for policy 1, policy_version 36132 (0.0009) +[2023-10-09 05:37:17,360][60144] Updated weights for policy 1, policy_version 36142 (0.0007) +[2023-10-09 05:37:17,724][60144] Updated weights for policy 1, policy_version 36152 (0.0009) +[2023-10-09 05:37:19,646][60143] Updated weights for policy 0, policy_version 35752 (0.0009) +[2023-10-09 05:37:20,012][60143] Updated weights for policy 0, policy_version 35762 (0.0009) +[2023-10-09 05:37:20,383][60143] Updated weights for policy 0, policy_version 35772 (0.0009) +[2023-10-09 05:37:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 73662464. Throughput: 0: 1709.2, 1: 1695.2. Samples: 18417926. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 05:37:21,053][59242] Avg episode reward: [(0, '28.050'), (1, '26.330')] +[2023-10-09 05:37:21,755][60144] Updated weights for policy 1, policy_version 36162 (0.0008) +[2023-10-09 05:37:22,126][60144] Updated weights for policy 1, policy_version 36172 (0.0007) +[2023-10-09 05:37:22,494][60144] Updated weights for policy 1, policy_version 36182 (0.0007) +[2023-10-09 05:37:22,857][60144] Updated weights for policy 1, policy_version 36192 (0.0008) +[2023-10-09 05:37:24,502][60143] Updated weights for policy 0, policy_version 35782 (0.0009) +[2023-10-09 05:37:24,874][60143] Updated weights for policy 0, policy_version 35792 (0.0008) +[2023-10-09 05:37:25,236][60143] Updated weights for policy 0, policy_version 35802 (0.0007) +[2023-10-09 05:37:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 73728000. Throughput: 0: 1698.1, 1: 1719.2. Samples: 18438650. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 05:37:26,052][59242] Avg episode reward: [(0, '28.030'), (1, '26.030')] +[2023-10-09 05:37:26,749][60144] Updated weights for policy 1, policy_version 36202 (0.0007) +[2023-10-09 05:37:27,120][60144] Updated weights for policy 1, policy_version 36212 (0.0008) +[2023-10-09 05:37:27,488][60144] Updated weights for policy 1, policy_version 36222 (0.0008) +[2023-10-09 05:37:29,105][60143] Updated weights for policy 0, policy_version 35812 (0.0007) +[2023-10-09 05:37:29,476][60143] Updated weights for policy 0, policy_version 35822 (0.0009) +[2023-10-09 05:37:29,842][60143] Updated weights for policy 0, policy_version 35832 (0.0008) +[2023-10-09 05:37:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 73793536. Throughput: 0: 1678.9, 1: 1725.3. Samples: 18459132. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 05:37:31,053][59242] Avg episode reward: [(0, '28.080'), (1, '27.190')] +[2023-10-09 05:37:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000035840_36700160.pth... +[2023-10-09 05:37:31,100][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000034240_35061760.pth +[2023-10-09 05:37:31,418][60144] Updated weights for policy 1, policy_version 36232 (0.0008) +[2023-10-09 05:37:31,781][60144] Updated weights for policy 1, policy_version 36242 (0.0008) +[2023-10-09 05:37:32,155][60144] Updated weights for policy 1, policy_version 36252 (0.0008) +[2023-10-09 05:37:32,300][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000036256_37126144.pth... +[2023-10-09 05:37:32,329][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000034624_35454976.pth +[2023-10-09 05:37:33,865][60143] Updated weights for policy 0, policy_version 35842 (0.0008) +[2023-10-09 05:37:34,242][60143] Updated weights for policy 0, policy_version 35852 (0.0009) +[2023-10-09 05:37:34,612][60143] Updated weights for policy 0, policy_version 35862 (0.0007) +[2023-10-09 05:37:34,986][60143] Updated weights for policy 0, policy_version 35872 (0.0008) +[2023-10-09 05:37:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73859072. Throughput: 0: 1709.5, 1: 1707.1. Samples: 18469660. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 05:37:36,053][59242] Avg episode reward: [(0, '29.220'), (1, '27.740')] +[2023-10-09 05:37:36,172][60144] Updated weights for policy 1, policy_version 36262 (0.0008) +[2023-10-09 05:37:36,538][60144] Updated weights for policy 1, policy_version 36272 (0.0009) +[2023-10-09 05:37:36,908][60144] Updated weights for policy 1, policy_version 36282 (0.0010) +[2023-10-09 05:37:38,904][60143] Updated weights for policy 0, policy_version 35882 (0.0008) +[2023-10-09 05:37:39,274][60143] Updated weights for policy 0, policy_version 35892 (0.0012) +[2023-10-09 05:37:39,662][60143] Updated weights for policy 0, policy_version 35902 (0.0012) +[2023-10-09 05:37:40,879][60144] Updated weights for policy 1, policy_version 36292 (0.0009) +[2023-10-09 05:37:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73924608. Throughput: 0: 1689.8, 1: 1727.1. Samples: 18489988. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 05:37:41,053][59242] Avg episode reward: [(0, '27.660'), (1, '25.900')] +[2023-10-09 05:37:41,238][60144] Updated weights for policy 1, policy_version 36302 (0.0007) +[2023-10-09 05:37:41,612][60144] Updated weights for policy 1, policy_version 36312 (0.0007) +[2023-10-09 05:37:43,793][60143] Updated weights for policy 0, policy_version 35912 (0.0008) +[2023-10-09 05:37:44,167][60143] Updated weights for policy 0, policy_version 35922 (0.0008) +[2023-10-09 05:37:44,537][60143] Updated weights for policy 0, policy_version 35932 (0.0007) +[2023-10-09 05:37:45,370][60144] Updated weights for policy 1, policy_version 36322 (0.0007) +[2023-10-09 05:37:45,740][60144] Updated weights for policy 1, policy_version 36332 (0.0010) +[2023-10-09 05:37:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 73990144. Throughput: 0: 1694.0, 1: 1720.7. Samples: 18510660. 
Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) +[2023-10-09 05:37:46,052][59242] Avg episode reward: [(0, '30.430'), (1, '25.400')] +[2023-10-09 05:37:46,110][60144] Updated weights for policy 1, policy_version 36342 (0.0009) +[2023-10-09 05:37:46,482][60144] Updated weights for policy 1, policy_version 36352 (0.0007) +[2023-10-09 05:37:48,599][60143] Updated weights for policy 0, policy_version 35942 (0.0007) +[2023-10-09 05:37:48,980][60143] Updated weights for policy 0, policy_version 35952 (0.0007) +[2023-10-09 05:37:49,344][60143] Updated weights for policy 0, policy_version 35962 (0.0010) +[2023-10-09 05:37:50,505][60144] Updated weights for policy 1, policy_version 36362 (0.0009) +[2023-10-09 05:37:50,866][60144] Updated weights for policy 1, policy_version 36372 (0.0008) +[2023-10-09 05:37:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 74055680. Throughput: 0: 1699.1, 1: 1731.8. Samples: 18521184. Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) +[2023-10-09 05:37:51,052][59242] Avg episode reward: [(0, '29.780'), (1, '26.790')] +[2023-10-09 05:37:51,242][60144] Updated weights for policy 1, policy_version 36382 (0.0009) +[2023-10-09 05:37:53,284][60143] Updated weights for policy 0, policy_version 35972 (0.0009) +[2023-10-09 05:37:53,658][60143] Updated weights for policy 0, policy_version 35982 (0.0008) +[2023-10-09 05:37:54,033][60143] Updated weights for policy 0, policy_version 35992 (0.0009) +[2023-10-09 05:37:55,018][60144] Updated weights for policy 1, policy_version 36392 (0.0010) +[2023-10-09 05:37:55,376][60144] Updated weights for policy 1, policy_version 36402 (0.0009) +[2023-10-09 05:37:55,738][60144] Updated weights for policy 1, policy_version 36412 (0.0008) +[2023-10-09 05:37:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 74153984. Throughput: 0: 1680.4, 1: 1748.7. Samples: 18541784. 
Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) +[2023-10-09 05:37:56,053][59242] Avg episode reward: [(0, '28.220'), (1, '26.390')] +[2023-10-09 05:37:58,139][60143] Updated weights for policy 0, policy_version 36002 (0.0008) +[2023-10-09 05:37:58,507][60143] Updated weights for policy 0, policy_version 36012 (0.0010) +[2023-10-09 05:37:58,884][60143] Updated weights for policy 0, policy_version 36022 (0.0010) +[2023-10-09 05:37:59,256][60143] Updated weights for policy 0, policy_version 36032 (0.0010) +[2023-10-09 05:37:59,610][60144] Updated weights for policy 1, policy_version 36422 (0.0010) +[2023-10-09 05:37:59,983][60144] Updated weights for policy 1, policy_version 36432 (0.0008) +[2023-10-09 05:38:00,347][60144] Updated weights for policy 1, policy_version 36442 (0.0007) +[2023-10-09 05:38:01,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 74219520. Throughput: 0: 1706.5, 1: 1723.7. Samples: 18561948. Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) +[2023-10-09 05:38:01,053][59242] Avg episode reward: [(0, '28.480'), (1, '28.370')] +[2023-10-09 05:38:03,053][60143] Updated weights for policy 0, policy_version 36042 (0.0007) +[2023-10-09 05:38:03,422][60143] Updated weights for policy 0, policy_version 36052 (0.0007) +[2023-10-09 05:38:03,788][60143] Updated weights for policy 0, policy_version 36062 (0.0008) +[2023-10-09 05:38:04,222][60144] Updated weights for policy 1, policy_version 36452 (0.0007) +[2023-10-09 05:38:04,585][60144] Updated weights for policy 1, policy_version 36462 (0.0008) +[2023-10-09 05:38:04,959][60144] Updated weights for policy 1, policy_version 36472 (0.0009) +[2023-10-09 05:38:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 74285056. Throughput: 0: 1695.5, 1: 1754.3. Samples: 18573166. 
Policy #0 lag: (min: 8.0, avg: 30.9, max: 40.0) +[2023-10-09 05:38:06,052][59242] Avg episode reward: [(0, '28.440'), (1, '28.310')] +[2023-10-09 05:38:07,602][60143] Updated weights for policy 0, policy_version 36072 (0.0007) +[2023-10-09 05:38:07,970][60143] Updated weights for policy 0, policy_version 36082 (0.0008) +[2023-10-09 05:38:08,347][60143] Updated weights for policy 0, policy_version 36092 (0.0008) +[2023-10-09 05:38:08,674][60144] Updated weights for policy 1, policy_version 36482 (0.0007) +[2023-10-09 05:38:09,047][60144] Updated weights for policy 1, policy_version 36492 (0.0008) +[2023-10-09 05:38:09,404][60144] Updated weights for policy 1, policy_version 36502 (0.0008) +[2023-10-09 05:38:09,766][60144] Updated weights for policy 1, policy_version 36512 (0.0009) +[2023-10-09 05:38:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 74350592. Throughput: 0: 1696.8, 1: 1737.0. Samples: 18593168. Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) +[2023-10-09 05:38:11,053][59242] Avg episode reward: [(0, '27.420'), (1, '27.790')] +[2023-10-09 05:38:12,402][60143] Updated weights for policy 0, policy_version 36102 (0.0010) +[2023-10-09 05:38:12,770][60143] Updated weights for policy 0, policy_version 36112 (0.0010) +[2023-10-09 05:38:13,144][60143] Updated weights for policy 0, policy_version 36122 (0.0009) +[2023-10-09 05:38:13,763][60144] Updated weights for policy 1, policy_version 36522 (0.0008) +[2023-10-09 05:38:14,131][60144] Updated weights for policy 1, policy_version 36532 (0.0007) +[2023-10-09 05:38:14,508][60144] Updated weights for policy 1, policy_version 36542 (0.0007) +[2023-10-09 05:38:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 74416128. Throughput: 0: 1711.8, 1: 1728.7. Samples: 18613956. 
Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) +[2023-10-09 05:38:16,053][59242] Avg episode reward: [(0, '29.810'), (1, '28.350')] +[2023-10-09 05:38:17,238][60143] Updated weights for policy 0, policy_version 36132 (0.0007) +[2023-10-09 05:38:17,611][60143] Updated weights for policy 0, policy_version 36142 (0.0007) +[2023-10-09 05:38:17,989][60143] Updated weights for policy 0, policy_version 36152 (0.0007) +[2023-10-09 05:38:18,364][60144] Updated weights for policy 1, policy_version 36552 (0.0010) +[2023-10-09 05:38:18,728][60144] Updated weights for policy 1, policy_version 36562 (0.0008) +[2023-10-09 05:38:19,089][60144] Updated weights for policy 1, policy_version 36572 (0.0010) +[2023-10-09 05:38:21,053][59242] Fps is (10 sec: 13106.4, 60 sec: 13653.2, 300 sec: 13662.6). Total num frames: 74481664. Throughput: 0: 1681.8, 1: 1748.8. Samples: 18624040. Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) +[2023-10-09 05:38:21,054][59242] Avg episode reward: [(0, '28.830'), (1, '28.980')] +[2023-10-09 05:38:22,074][60143] Updated weights for policy 0, policy_version 36162 (0.0008) +[2023-10-09 05:38:22,444][60143] Updated weights for policy 0, policy_version 36172 (0.0007) +[2023-10-09 05:38:22,830][60143] Updated weights for policy 0, policy_version 36182 (0.0009) +[2023-10-09 05:38:23,028][60144] Updated weights for policy 1, policy_version 36582 (0.0009) +[2023-10-09 05:38:23,190][60143] Updated weights for policy 0, policy_version 36192 (0.0009) +[2023-10-09 05:38:23,391][60144] Updated weights for policy 1, policy_version 36592 (0.0008) +[2023-10-09 05:38:23,761][60144] Updated weights for policy 1, policy_version 36602 (0.0008) +[2023-10-09 05:38:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 74547200. Throughput: 0: 1701.2, 1: 1732.6. Samples: 18644506. 
Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) +[2023-10-09 05:38:26,053][59242] Avg episode reward: [(0, '28.420'), (1, '29.270')] +[2023-10-09 05:38:27,088][60143] Updated weights for policy 0, policy_version 36202 (0.0009) +[2023-10-09 05:38:27,452][60143] Updated weights for policy 0, policy_version 36212 (0.0009) +[2023-10-09 05:38:27,704][60144] Updated weights for policy 1, policy_version 36612 (0.0009) +[2023-10-09 05:38:27,821][60143] Updated weights for policy 0, policy_version 36222 (0.0007) +[2023-10-09 05:38:28,070][60144] Updated weights for policy 1, policy_version 36622 (0.0007) +[2023-10-09 05:38:28,436][60144] Updated weights for policy 1, policy_version 36632 (0.0007) +[2023-10-09 05:38:31,052][59242] Fps is (10 sec: 13107.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 74612736. Throughput: 0: 1713.6, 1: 1737.5. Samples: 18665960. Policy #0 lag: (min: 17.0, avg: 33.4, max: 49.0) +[2023-10-09 05:38:31,053][59242] Avg episode reward: [(0, '28.060'), (1, '27.410')] +[2023-10-09 05:38:31,765][60143] Updated weights for policy 0, policy_version 36232 (0.0008) +[2023-10-09 05:38:32,144][60143] Updated weights for policy 0, policy_version 36242 (0.0008) +[2023-10-09 05:38:32,342][60144] Updated weights for policy 1, policy_version 36642 (0.0007) +[2023-10-09 05:38:32,513][60143] Updated weights for policy 0, policy_version 36252 (0.0009) +[2023-10-09 05:38:32,716][60144] Updated weights for policy 1, policy_version 36652 (0.0007) +[2023-10-09 05:38:33,073][60144] Updated weights for policy 1, policy_version 36662 (0.0007) +[2023-10-09 05:38:33,442][60144] Updated weights for policy 1, policy_version 36672 (0.0007) +[2023-10-09 05:38:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 74678272. Throughput: 0: 1693.3, 1: 1732.6. Samples: 18675348. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:38:36,053][59242] Avg episode reward: [(0, '28.310'), (1, '26.850')] +[2023-10-09 05:38:36,639][60143] Updated weights for policy 0, policy_version 36262 (0.0007) +[2023-10-09 05:38:37,010][60143] Updated weights for policy 0, policy_version 36272 (0.0008) +[2023-10-09 05:38:37,332][60144] Updated weights for policy 1, policy_version 36682 (0.0009) +[2023-10-09 05:38:37,377][60143] Updated weights for policy 0, policy_version 36282 (0.0009) +[2023-10-09 05:38:37,695][60144] Updated weights for policy 1, policy_version 36692 (0.0008) +[2023-10-09 05:38:38,066][60144] Updated weights for policy 1, policy_version 36702 (0.0010) +[2023-10-09 05:38:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 74743808. Throughput: 0: 1709.9, 1: 1729.1. Samples: 18696540. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:38:41,053][59242] Avg episode reward: [(0, '28.190'), (1, '27.840')] +[2023-10-09 05:38:41,450][60143] Updated weights for policy 0, policy_version 36292 (0.0008) +[2023-10-09 05:38:41,817][60143] Updated weights for policy 0, policy_version 36302 (0.0009) +[2023-10-09 05:38:41,914][60144] Updated weights for policy 1, policy_version 36712 (0.0008) +[2023-10-09 05:38:42,180][60143] Updated weights for policy 0, policy_version 36312 (0.0008) +[2023-10-09 05:38:42,282][60144] Updated weights for policy 1, policy_version 36722 (0.0007) +[2023-10-09 05:38:42,651][60144] Updated weights for policy 1, policy_version 36732 (0.0007) +[2023-10-09 05:38:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 74809344. Throughput: 0: 1706.7, 1: 1759.8. Samples: 18717940. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:38:46,053][59242] Avg episode reward: [(0, '27.500'), (1, '29.260')] +[2023-10-09 05:38:46,339][60143] Updated weights for policy 0, policy_version 36322 (0.0007) +[2023-10-09 05:38:46,538][60144] Updated weights for policy 1, policy_version 36742 (0.0009) +[2023-10-09 05:38:46,710][60143] Updated weights for policy 0, policy_version 36332 (0.0008) +[2023-10-09 05:38:46,894][60144] Updated weights for policy 1, policy_version 36752 (0.0007) +[2023-10-09 05:38:47,082][60143] Updated weights for policy 0, policy_version 36342 (0.0008) +[2023-10-09 05:38:47,259][60144] Updated weights for policy 1, policy_version 36762 (0.0007) +[2023-10-09 05:38:47,458][60143] Updated weights for policy 0, policy_version 36352 (0.0007) +[2023-10-09 05:38:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 74874880. Throughput: 0: 1694.6, 1: 1729.5. Samples: 18727250. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:38:51,053][59242] Avg episode reward: [(0, '27.750'), (1, '30.130')] +[2023-10-09 05:38:51,208][60144] Updated weights for policy 1, policy_version 36772 (0.0007) +[2023-10-09 05:38:51,528][60143] Updated weights for policy 0, policy_version 36362 (0.0009) +[2023-10-09 05:38:51,575][60144] Updated weights for policy 1, policy_version 36782 (0.0008) +[2023-10-09 05:38:51,900][60143] Updated weights for policy 0, policy_version 36372 (0.0010) +[2023-10-09 05:38:51,938][60144] Updated weights for policy 1, policy_version 36792 (0.0009) +[2023-10-09 05:38:52,264][60143] Updated weights for policy 0, policy_version 36382 (0.0008) +[2023-10-09 05:38:55,846][60144] Updated weights for policy 1, policy_version 36802 (0.0009) +[2023-10-09 05:38:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 74940416. Throughput: 0: 1702.2, 1: 1747.5. Samples: 18748402. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-09 05:38:56,052][59242] Avg episode reward: [(0, '29.420'), (1, '27.900')] +[2023-10-09 05:38:56,135][60143] Updated weights for policy 0, policy_version 36392 (0.0010) +[2023-10-09 05:38:56,209][60144] Updated weights for policy 1, policy_version 36812 (0.0008) +[2023-10-09 05:38:56,505][60143] Updated weights for policy 0, policy_version 36402 (0.0008) +[2023-10-09 05:38:56,580][60144] Updated weights for policy 1, policy_version 36822 (0.0008) +[2023-10-09 05:38:56,868][60143] Updated weights for policy 0, policy_version 36412 (0.0008) +[2023-10-09 05:38:56,955][60144] Updated weights for policy 1, policy_version 36832 (0.0009) +[2023-10-09 05:39:00,837][60143] Updated weights for policy 0, policy_version 36422 (0.0009) +[2023-10-09 05:39:00,890][60144] Updated weights for policy 1, policy_version 36842 (0.0008) +[2023-10-09 05:39:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 75005952. Throughput: 0: 1704.7, 1: 1752.9. Samples: 18769548. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 05:39:01,053][59242] Avg episode reward: [(0, '29.380'), (1, '28.100')] +[2023-10-09 05:39:01,208][60143] Updated weights for policy 0, policy_version 36432 (0.0007) +[2023-10-09 05:39:01,246][60144] Updated weights for policy 1, policy_version 36852 (0.0007) +[2023-10-09 05:39:01,570][60143] Updated weights for policy 0, policy_version 36442 (0.0009) +[2023-10-09 05:39:01,614][60144] Updated weights for policy 1, policy_version 36862 (0.0008) +[2023-10-09 05:39:05,421][60143] Updated weights for policy 0, policy_version 36452 (0.0009) +[2023-10-09 05:39:05,669][60144] Updated weights for policy 1, policy_version 36872 (0.0007) +[2023-10-09 05:39:05,797][60143] Updated weights for policy 0, policy_version 36462 (0.0007) +[2023-10-09 05:39:06,037][60144] Updated weights for policy 1, policy_version 36882 (0.0007) +[2023-10-09 05:39:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 75071488. Throughput: 0: 1708.5, 1: 1738.9. Samples: 18779172. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 05:39:06,053][59242] Avg episode reward: [(0, '29.410'), (1, '28.460')] +[2023-10-09 05:39:06,171][60143] Updated weights for policy 0, policy_version 36472 (0.0007) +[2023-10-09 05:39:06,410][60144] Updated weights for policy 1, policy_version 36892 (0.0008) +[2023-10-09 05:39:10,267][60143] Updated weights for policy 0, policy_version 36482 (0.0007) +[2023-10-09 05:39:10,322][60144] Updated weights for policy 1, policy_version 36902 (0.0007) +[2023-10-09 05:39:10,636][60143] Updated weights for policy 0, policy_version 36492 (0.0008) +[2023-10-09 05:39:10,691][60144] Updated weights for policy 1, policy_version 36912 (0.0008) +[2023-10-09 05:39:11,008][60143] Updated weights for policy 0, policy_version 36502 (0.0007) +[2023-10-09 05:39:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 75137024. 
Throughput: 0: 1712.3, 1: 1753.1. Samples: 18800448. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 05:39:11,052][59242] Avg episode reward: [(0, '30.580'), (1, '29.840')] +[2023-10-09 05:39:11,054][60144] Updated weights for policy 1, policy_version 36922 (0.0010) +[2023-10-09 05:39:11,378][60143] Updated weights for policy 0, policy_version 36512 (0.0008) +[2023-10-09 05:39:15,021][60144] Updated weights for policy 1, policy_version 36932 (0.0008) +[2023-10-09 05:39:15,385][60144] Updated weights for policy 1, policy_version 36942 (0.0008) +[2023-10-09 05:39:15,418][60143] Updated weights for policy 0, policy_version 36522 (0.0009) +[2023-10-09 05:39:15,748][60144] Updated weights for policy 1, policy_version 36952 (0.0009) +[2023-10-09 05:39:15,781][60143] Updated weights for policy 0, policy_version 36532 (0.0011) +[2023-10-09 05:39:16,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 75235328. Throughput: 0: 1694.8, 1: 1731.3. Samples: 18820132. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-09 05:39:16,053][59242] Avg episode reward: [(0, '30.410'), (1, '30.300')] +[2023-10-09 05:39:16,149][60143] Updated weights for policy 0, policy_version 36542 (0.0007) +[2023-10-09 05:39:19,749][60144] Updated weights for policy 1, policy_version 36962 (0.0007) +[2023-10-09 05:39:20,122][60144] Updated weights for policy 1, policy_version 36972 (0.0007) +[2023-10-09 05:39:20,126][60143] Updated weights for policy 0, policy_version 36552 (0.0008) +[2023-10-09 05:39:20,484][60144] Updated weights for policy 1, policy_version 36982 (0.0007) +[2023-10-09 05:39:20,500][60143] Updated weights for policy 0, policy_version 36562 (0.0009) +[2023-10-09 05:39:20,856][60144] Updated weights for policy 1, policy_version 36992 (0.0007) +[2023-10-09 05:39:20,866][60143] Updated weights for policy 0, policy_version 36572 (0.0009) +[2023-10-09 05:39:21,052][59242] Fps is (10 sec: 19660.4, 60 sec: 14199.6, 300 sec: 13884.7). Total num frames: 75333632. Throughput: 0: 1701.5, 1: 1747.9. Samples: 18830572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:21,053][59242] Avg episode reward: [(0, '30.890'), (1, '28.940')] +[2023-10-09 05:39:24,525][60144] Updated weights for policy 1, policy_version 37002 (0.0007) +[2023-10-09 05:39:24,891][60144] Updated weights for policy 1, policy_version 37012 (0.0007) +[2023-10-09 05:39:24,899][60143] Updated weights for policy 0, policy_version 36582 (0.0008) +[2023-10-09 05:39:25,258][60144] Updated weights for policy 1, policy_version 37022 (0.0008) +[2023-10-09 05:39:25,269][60143] Updated weights for policy 0, policy_version 36592 (0.0009) +[2023-10-09 05:39:25,640][60143] Updated weights for policy 0, policy_version 36602 (0.0009) +[2023-10-09 05:39:26,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 75399168. Throughput: 0: 1709.5, 1: 1737.5. Samples: 18851656. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:26,053][59242] Avg episode reward: [(0, '32.490'), (1, '29.570')] +[2023-10-09 05:39:29,130][60144] Updated weights for policy 1, policy_version 37032 (0.0009) +[2023-10-09 05:39:29,423][60143] Updated weights for policy 0, policy_version 36612 (0.0008) +[2023-10-09 05:39:29,496][60144] Updated weights for policy 1, policy_version 37042 (0.0007) +[2023-10-09 05:39:29,793][60143] Updated weights for policy 0, policy_version 36622 (0.0007) +[2023-10-09 05:39:29,858][60144] Updated weights for policy 1, policy_version 37052 (0.0007) +[2023-10-09 05:39:30,162][60143] Updated weights for policy 0, policy_version 36632 (0.0010) +[2023-10-09 05:39:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75464704. Throughput: 0: 1682.5, 1: 1714.4. Samples: 18870800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:31,053][59242] Avg episode reward: [(0, '32.450'), (1, '29.260')] +[2023-10-09 05:39:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000036640_37519360.pth... +[2023-10-09 05:39:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000037056_37945344.pth... 
+[2023-10-09 05:39:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000035040_35880960.pth +[2023-10-09 05:39:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000035424_36274176.pth +[2023-10-09 05:39:33,857][60144] Updated weights for policy 1, policy_version 37062 (0.0007) +[2023-10-09 05:39:34,136][60143] Updated weights for policy 0, policy_version 36642 (0.0008) +[2023-10-09 05:39:34,217][60144] Updated weights for policy 1, policy_version 37072 (0.0010) +[2023-10-09 05:39:34,497][60143] Updated weights for policy 0, policy_version 36652 (0.0008) +[2023-10-09 05:39:34,585][60144] Updated weights for policy 1, policy_version 37082 (0.0007) +[2023-10-09 05:39:34,864][60143] Updated weights for policy 0, policy_version 36662 (0.0008) +[2023-10-09 05:39:35,235][60143] Updated weights for policy 0, policy_version 36672 (0.0010) +[2023-10-09 05:39:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75530240. Throughput: 0: 1712.3, 1: 1742.9. Samples: 18882732. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:36,052][59242] Avg episode reward: [(0, '30.120'), (1, '28.640')] +[2023-10-09 05:39:38,411][60144] Updated weights for policy 1, policy_version 37092 (0.0009) +[2023-10-09 05:39:38,782][60144] Updated weights for policy 1, policy_version 37102 (0.0010) +[2023-10-09 05:39:39,140][60144] Updated weights for policy 1, policy_version 37112 (0.0010) +[2023-10-09 05:39:39,427][60143] Updated weights for policy 0, policy_version 36682 (0.0008) +[2023-10-09 05:39:39,796][60143] Updated weights for policy 0, policy_version 36692 (0.0008) +[2023-10-09 05:39:40,170][60143] Updated weights for policy 0, policy_version 36702 (0.0008) +[2023-10-09 05:39:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75595776. Throughput: 0: 1699.7, 1: 1719.8. Samples: 18902280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:41,053][59242] Avg episode reward: [(0, '29.730'), (1, '27.680')] +[2023-10-09 05:39:43,116][60144] Updated weights for policy 1, policy_version 37122 (0.0009) +[2023-10-09 05:39:43,500][60144] Updated weights for policy 1, policy_version 37132 (0.0009) +[2023-10-09 05:39:43,862][60144] Updated weights for policy 1, policy_version 37142 (0.0007) +[2023-10-09 05:39:44,175][60143] Updated weights for policy 0, policy_version 36712 (0.0008) +[2023-10-09 05:39:44,224][60144] Updated weights for policy 1, policy_version 37152 (0.0007) +[2023-10-09 05:39:44,538][60143] Updated weights for policy 0, policy_version 36722 (0.0010) +[2023-10-09 05:39:44,914][60143] Updated weights for policy 0, policy_version 36732 (0.0011) +[2023-10-09 05:39:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75661312. Throughput: 0: 1685.6, 1: 1724.2. Samples: 18922986. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:39:46,052][59242] Avg episode reward: [(0, '28.930'), (1, '27.880')] +[2023-10-09 05:39:48,270][60144] Updated weights for policy 1, policy_version 37162 (0.0007) +[2023-10-09 05:39:48,644][60144] Updated weights for policy 1, policy_version 37172 (0.0007) +[2023-10-09 05:39:48,731][60143] Updated weights for policy 0, policy_version 36742 (0.0008) +[2023-10-09 05:39:49,005][60144] Updated weights for policy 1, policy_version 37182 (0.0010) +[2023-10-09 05:39:49,103][60143] Updated weights for policy 0, policy_version 36752 (0.0010) +[2023-10-09 05:39:49,476][60143] Updated weights for policy 0, policy_version 36762 (0.0011) +[2023-10-09 05:39:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75726848. Throughput: 0: 1717.3, 1: 1730.8. Samples: 18934338. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 05:39:51,053][59242] Avg episode reward: [(0, '29.430'), (1, '27.580')] +[2023-10-09 05:39:52,930][60144] Updated weights for policy 1, policy_version 37192 (0.0007) +[2023-10-09 05:39:53,307][60144] Updated weights for policy 1, policy_version 37202 (0.0009) +[2023-10-09 05:39:53,438][60143] Updated weights for policy 0, policy_version 36772 (0.0008) +[2023-10-09 05:39:53,671][60144] Updated weights for policy 1, policy_version 37212 (0.0008) +[2023-10-09 05:39:53,809][60143] Updated weights for policy 0, policy_version 36782 (0.0008) +[2023-10-09 05:39:54,178][60143] Updated weights for policy 0, policy_version 36792 (0.0008) +[2023-10-09 05:39:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 75792384. Throughput: 0: 1689.1, 1: 1721.3. Samples: 18953914. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 05:39:56,053][59242] Avg episode reward: [(0, '30.530'), (1, '28.440')] +[2023-10-09 05:39:57,578][60144] Updated weights for policy 1, policy_version 37222 (0.0007) +[2023-10-09 05:39:57,953][60144] Updated weights for policy 1, policy_version 37232 (0.0009) +[2023-10-09 05:39:58,315][60144] Updated weights for policy 1, policy_version 37242 (0.0008) +[2023-10-09 05:39:58,411][60143] Updated weights for policy 0, policy_version 36802 (0.0007) +[2023-10-09 05:39:58,773][60143] Updated weights for policy 0, policy_version 36812 (0.0008) +[2023-10-09 05:39:59,144][60143] Updated weights for policy 0, policy_version 36822 (0.0007) +[2023-10-09 05:39:59,515][60143] Updated weights for policy 0, policy_version 36832 (0.0009) +[2023-10-09 05:40:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 75857920. Throughput: 0: 1698.8, 1: 1743.2. Samples: 18975020. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 05:40:01,053][59242] Avg episode reward: [(0, '30.190'), (1, '27.440')] +[2023-10-09 05:40:02,223][60144] Updated weights for policy 1, policy_version 37252 (0.0007) +[2023-10-09 05:40:02,594][60144] Updated weights for policy 1, policy_version 37262 (0.0007) +[2023-10-09 05:40:02,959][60144] Updated weights for policy 1, policy_version 37272 (0.0008) +[2023-10-09 05:40:03,485][60143] Updated weights for policy 0, policy_version 36842 (0.0008) +[2023-10-09 05:40:03,859][60143] Updated weights for policy 0, policy_version 36852 (0.0008) +[2023-10-09 05:40:04,238][60143] Updated weights for policy 0, policy_version 36862 (0.0009) +[2023-10-09 05:40:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 75923456. Throughput: 0: 1712.7, 1: 1729.3. Samples: 18985460. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 05:40:06,053][59242] Avg episode reward: [(0, '29.380'), (1, '27.480')] +[2023-10-09 05:40:06,944][60144] Updated weights for policy 1, policy_version 37282 (0.0008) +[2023-10-09 05:40:07,318][60144] Updated weights for policy 1, policy_version 37292 (0.0008) +[2023-10-09 05:40:07,692][60144] Updated weights for policy 1, policy_version 37302 (0.0009) +[2023-10-09 05:40:08,062][60144] Updated weights for policy 1, policy_version 37312 (0.0007) +[2023-10-09 05:40:08,097][60143] Updated weights for policy 0, policy_version 36872 (0.0009) +[2023-10-09 05:40:08,470][60143] Updated weights for policy 0, policy_version 36882 (0.0009) +[2023-10-09 05:40:08,838][60143] Updated weights for policy 0, policy_version 36892 (0.0008) +[2023-10-09 05:40:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 75988992. Throughput: 0: 1692.0, 1: 1731.1. Samples: 19005692. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 05:40:11,053][59242] Avg episode reward: [(0, '29.860'), (1, '27.290')] +[2023-10-09 05:40:11,999][60144] Updated weights for policy 1, policy_version 37322 (0.0007) +[2023-10-09 05:40:12,371][60144] Updated weights for policy 1, policy_version 37332 (0.0009) +[2023-10-09 05:40:12,742][60144] Updated weights for policy 1, policy_version 37342 (0.0010) +[2023-10-09 05:40:13,077][60143] Updated weights for policy 0, policy_version 36902 (0.0008) +[2023-10-09 05:40:13,450][60143] Updated weights for policy 0, policy_version 36912 (0.0009) +[2023-10-09 05:40:13,823][60143] Updated weights for policy 0, policy_version 36922 (0.0009) +[2023-10-09 05:40:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 76054528. Throughput: 0: 1716.3, 1: 1751.7. Samples: 19026862. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:16,053][59242] Avg episode reward: [(0, '28.630'), (1, '27.600')] +[2023-10-09 05:40:16,493][60144] Updated weights for policy 1, policy_version 37352 (0.0008) +[2023-10-09 05:40:16,871][60144] Updated weights for policy 1, policy_version 37362 (0.0007) +[2023-10-09 05:40:17,245][60144] Updated weights for policy 1, policy_version 37372 (0.0008) +[2023-10-09 05:40:17,648][60143] Updated weights for policy 0, policy_version 36932 (0.0007) +[2023-10-09 05:40:18,020][60143] Updated weights for policy 0, policy_version 36942 (0.0007) +[2023-10-09 05:40:18,388][60143] Updated weights for policy 0, policy_version 36952 (0.0009) +[2023-10-09 05:40:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 76120064. Throughput: 0: 1693.6, 1: 1720.8. Samples: 19036384. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:21,054][59242] Avg episode reward: [(0, '28.900'), (1, '27.670')] +[2023-10-09 05:40:21,139][60144] Updated weights for policy 1, policy_version 37382 (0.0009) +[2023-10-09 05:40:21,516][60144] Updated weights for policy 1, policy_version 37392 (0.0009) +[2023-10-09 05:40:21,872][60144] Updated weights for policy 1, policy_version 37402 (0.0009) +[2023-10-09 05:40:22,301][60143] Updated weights for policy 0, policy_version 36962 (0.0008) +[2023-10-09 05:40:22,662][60143] Updated weights for policy 0, policy_version 36972 (0.0007) +[2023-10-09 05:40:23,035][60143] Updated weights for policy 0, policy_version 36982 (0.0007) +[2023-10-09 05:40:23,408][60143] Updated weights for policy 0, policy_version 36992 (0.0007) +[2023-10-09 05:40:25,832][60144] Updated weights for policy 1, policy_version 37412 (0.0008) +[2023-10-09 05:40:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 76185600. Throughput: 0: 1708.3, 1: 1745.4. Samples: 19057694. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:26,053][59242] Avg episode reward: [(0, '28.490'), (1, '28.410')] +[2023-10-09 05:40:26,190][60144] Updated weights for policy 1, policy_version 37422 (0.0007) +[2023-10-09 05:40:26,559][60144] Updated weights for policy 1, policy_version 37432 (0.0009) +[2023-10-09 05:40:27,191][60143] Updated weights for policy 0, policy_version 37002 (0.0007) +[2023-10-09 05:40:27,556][60143] Updated weights for policy 0, policy_version 37012 (0.0008) +[2023-10-09 05:40:27,928][60143] Updated weights for policy 0, policy_version 37022 (0.0009) +[2023-10-09 05:40:30,474][60144] Updated weights for policy 1, policy_version 37442 (0.0008) +[2023-10-09 05:40:30,848][60144] Updated weights for policy 1, policy_version 37452 (0.0011) +[2023-10-09 05:40:31,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 76251136. 
Throughput: 0: 1727.3, 1: 1734.0. Samples: 19078746. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:31,052][59242] Avg episode reward: [(0, '27.520'), (1, '27.150')] +[2023-10-09 05:40:31,214][60144] Updated weights for policy 1, policy_version 37462 (0.0007) +[2023-10-09 05:40:31,578][60144] Updated weights for policy 1, policy_version 37472 (0.0007) +[2023-10-09 05:40:31,991][60143] Updated weights for policy 0, policy_version 37032 (0.0010) +[2023-10-09 05:40:32,347][60143] Updated weights for policy 0, policy_version 37042 (0.0010) +[2023-10-09 05:40:32,727][60143] Updated weights for policy 0, policy_version 37052 (0.0007) +[2023-10-09 05:40:35,538][60144] Updated weights for policy 1, policy_version 37482 (0.0007) +[2023-10-09 05:40:35,910][60144] Updated weights for policy 1, policy_version 37492 (0.0007) +[2023-10-09 05:40:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 76316672. Throughput: 0: 1692.2, 1: 1727.2. Samples: 19088210. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:36,053][59242] Avg episode reward: [(0, '26.150'), (1, '27.500')] +[2023-10-09 05:40:36,278][60144] Updated weights for policy 1, policy_version 37502 (0.0007) +[2023-10-09 05:40:36,675][60143] Updated weights for policy 0, policy_version 37062 (0.0008) +[2023-10-09 05:40:37,049][60143] Updated weights for policy 0, policy_version 37072 (0.0009) +[2023-10-09 05:40:37,418][60143] Updated weights for policy 0, policy_version 37082 (0.0008) +[2023-10-09 05:40:40,259][60144] Updated weights for policy 1, policy_version 37512 (0.0008) +[2023-10-09 05:40:40,621][60144] Updated weights for policy 1, policy_version 37522 (0.0008) +[2023-10-09 05:40:40,987][60144] Updated weights for policy 1, policy_version 37532 (0.0008) +[2023-10-09 05:40:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 76382208. Throughput: 0: 1716.5, 1: 1740.7. Samples: 19109488. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-09 05:40:41,053][59242] Avg episode reward: [(0, '26.760'), (1, '28.130')] +[2023-10-09 05:40:41,469][60143] Updated weights for policy 0, policy_version 37092 (0.0008) +[2023-10-09 05:40:41,829][60143] Updated weights for policy 0, policy_version 37102 (0.0007) +[2023-10-09 05:40:42,203][60143] Updated weights for policy 0, policy_version 37112 (0.0007) +[2023-10-09 05:40:45,062][60144] Updated weights for policy 1, policy_version 37542 (0.0010) +[2023-10-09 05:40:45,431][60144] Updated weights for policy 1, policy_version 37552 (0.0008) +[2023-10-09 05:40:45,803][60144] Updated weights for policy 1, policy_version 37562 (0.0009) +[2023-10-09 05:40:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 76480512. Throughput: 0: 1726.3, 1: 1719.0. Samples: 19130058. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:40:46,053][59242] Avg episode reward: [(0, '26.720'), (1, '28.080')] +[2023-10-09 05:40:46,106][60143] Updated weights for policy 0, policy_version 37122 (0.0008) +[2023-10-09 05:40:46,480][60143] Updated weights for policy 0, policy_version 37132 (0.0008) +[2023-10-09 05:40:46,844][60143] Updated weights for policy 0, policy_version 37142 (0.0007) +[2023-10-09 05:40:47,215][60143] Updated weights for policy 0, policy_version 37152 (0.0008) +[2023-10-09 05:40:49,730][60144] Updated weights for policy 1, policy_version 37572 (0.0007) +[2023-10-09 05:40:50,088][60144] Updated weights for policy 1, policy_version 37582 (0.0007) +[2023-10-09 05:40:50,454][60144] Updated weights for policy 1, policy_version 37592 (0.0007) +[2023-10-09 05:40:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 76546048. Throughput: 0: 1706.0, 1: 1735.3. Samples: 19140318. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:40:51,052][59242] Avg episode reward: [(0, '26.760'), (1, '28.040')] +[2023-10-09 05:40:51,234][60143] Updated weights for policy 0, policy_version 37162 (0.0009) +[2023-10-09 05:40:51,604][60143] Updated weights for policy 0, policy_version 37172 (0.0008) +[2023-10-09 05:40:51,977][60143] Updated weights for policy 0, policy_version 37182 (0.0011) +[2023-10-09 05:40:54,385][60144] Updated weights for policy 1, policy_version 37602 (0.0007) +[2023-10-09 05:40:54,752][60144] Updated weights for policy 1, policy_version 37612 (0.0007) +[2023-10-09 05:40:55,117][60144] Updated weights for policy 1, policy_version 37622 (0.0011) +[2023-10-09 05:40:55,486][60144] Updated weights for policy 1, policy_version 37632 (0.0008) +[2023-10-09 05:40:55,883][60143] Updated weights for policy 0, policy_version 37192 (0.0009) +[2023-10-09 05:40:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 76611584. Throughput: 0: 1724.7, 1: 1728.1. Samples: 19161068. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:40:56,053][59242] Avg episode reward: [(0, '27.330'), (1, '28.170')] +[2023-10-09 05:40:56,254][60143] Updated weights for policy 0, policy_version 37202 (0.0008) +[2023-10-09 05:40:56,629][60143] Updated weights for policy 0, policy_version 37212 (0.0010) +[2023-10-09 05:40:59,574][60144] Updated weights for policy 1, policy_version 37642 (0.0007) +[2023-10-09 05:40:59,944][60144] Updated weights for policy 1, policy_version 37652 (0.0008) +[2023-10-09 05:41:00,303][60144] Updated weights for policy 1, policy_version 37662 (0.0009) +[2023-10-09 05:41:00,683][60143] Updated weights for policy 0, policy_version 37222 (0.0009) +[2023-10-09 05:41:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 76677120. Throughput: 0: 1725.5, 1: 1699.4. Samples: 19180982. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:41:01,053][60143] Updated weights for policy 0, policy_version 37232 (0.0007) +[2023-10-09 05:41:01,053][59242] Avg episode reward: [(0, '28.140'), (1, '28.190')] +[2023-10-09 05:41:01,423][60143] Updated weights for policy 0, policy_version 37242 (0.0007) +[2023-10-09 05:41:04,095][60144] Updated weights for policy 1, policy_version 37672 (0.0008) +[2023-10-09 05:41:04,454][60144] Updated weights for policy 1, policy_version 37682 (0.0007) +[2023-10-09 05:41:04,818][60144] Updated weights for policy 1, policy_version 37692 (0.0007) +[2023-10-09 05:41:05,241][60143] Updated weights for policy 0, policy_version 37252 (0.0008) +[2023-10-09 05:41:05,615][60143] Updated weights for policy 0, policy_version 37262 (0.0007) +[2023-10-09 05:41:05,985][60143] Updated weights for policy 0, policy_version 37272 (0.0007) +[2023-10-09 05:41:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 76742656. Throughput: 0: 1718.2, 1: 1735.4. Samples: 19191798. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:41:06,052][59242] Avg episode reward: [(0, '27.530'), (1, '28.750')] +[2023-10-09 05:41:08,685][60144] Updated weights for policy 1, policy_version 37702 (0.0007) +[2023-10-09 05:41:09,052][60144] Updated weights for policy 1, policy_version 37712 (0.0008) +[2023-10-09 05:41:09,419][60144] Updated weights for policy 1, policy_version 37722 (0.0007) +[2023-10-09 05:41:10,153][60143] Updated weights for policy 0, policy_version 37282 (0.0009) +[2023-10-09 05:41:10,516][60143] Updated weights for policy 0, policy_version 37292 (0.0011) +[2023-10-09 05:41:10,894][60143] Updated weights for policy 0, policy_version 37302 (0.0011) +[2023-10-09 05:41:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 76808192. Throughput: 0: 1720.2, 1: 1711.0. Samples: 19212096. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 05:41:11,053][59242] Avg episode reward: [(0, '29.370'), (1, '30.640')] +[2023-10-09 05:41:11,256][60143] Updated weights for policy 0, policy_version 37312 (0.0010) +[2023-10-09 05:41:13,443][60144] Updated weights for policy 1, policy_version 37732 (0.0007) +[2023-10-09 05:41:13,815][60144] Updated weights for policy 1, policy_version 37742 (0.0008) +[2023-10-09 05:41:14,190][60144] Updated weights for policy 1, policy_version 37752 (0.0009) +[2023-10-09 05:41:15,372][60143] Updated weights for policy 0, policy_version 37322 (0.0009) +[2023-10-09 05:41:15,746][60143] Updated weights for policy 0, policy_version 37332 (0.0009) +[2023-10-09 05:41:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 76873728. Throughput: 0: 1707.1, 1: 1714.2. Samples: 19232704. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 05:41:16,053][59242] Avg episode reward: [(0, '27.570'), (1, '30.980')] +[2023-10-09 05:41:16,113][60143] Updated weights for policy 0, policy_version 37342 (0.0009) +[2023-10-09 05:41:18,100][60144] Updated weights for policy 1, policy_version 37762 (0.0009) +[2023-10-09 05:41:18,466][60144] Updated weights for policy 1, policy_version 37772 (0.0009) +[2023-10-09 05:41:18,829][60144] Updated weights for policy 1, policy_version 37782 (0.0008) +[2023-10-09 05:41:19,201][60144] Updated weights for policy 1, policy_version 37792 (0.0009) +[2023-10-09 05:41:19,956][60143] Updated weights for policy 0, policy_version 37352 (0.0010) +[2023-10-09 05:41:20,323][60143] Updated weights for policy 0, policy_version 37362 (0.0011) +[2023-10-09 05:41:20,695][60143] Updated weights for policy 0, policy_version 37372 (0.0011) +[2023-10-09 05:41:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 76972032. Throughput: 0: 1718.4, 1: 1728.6. Samples: 19243326. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 05:41:21,053][59242] Avg episode reward: [(0, '28.110'), (1, '31.180')] +[2023-10-09 05:41:23,256][60144] Updated weights for policy 1, policy_version 37802 (0.0007) +[2023-10-09 05:41:23,626][60144] Updated weights for policy 1, policy_version 37812 (0.0007) +[2023-10-09 05:41:23,997][60144] Updated weights for policy 1, policy_version 37822 (0.0008) +[2023-10-09 05:41:24,601][60143] Updated weights for policy 0, policy_version 37382 (0.0011) +[2023-10-09 05:41:24,969][60143] Updated weights for policy 0, policy_version 37392 (0.0007) +[2023-10-09 05:41:25,337][60143] Updated weights for policy 0, policy_version 37402 (0.0008) +[2023-10-09 05:41:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 77037568. Throughput: 0: 1722.0, 1: 1705.5. Samples: 19263726. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 05:41:26,053][59242] Avg episode reward: [(0, '28.980'), (1, '29.830')] +[2023-10-09 05:41:27,945][60144] Updated weights for policy 1, policy_version 37832 (0.0007) +[2023-10-09 05:41:28,307][60144] Updated weights for policy 1, policy_version 37842 (0.0007) +[2023-10-09 05:41:28,683][60144] Updated weights for policy 1, policy_version 37852 (0.0008) +[2023-10-09 05:41:29,451][60143] Updated weights for policy 0, policy_version 37412 (0.0009) +[2023-10-09 05:41:29,811][60143] Updated weights for policy 0, policy_version 37422 (0.0010) +[2023-10-09 05:41:30,184][60143] Updated weights for policy 0, policy_version 37432 (0.0009) +[2023-10-09 05:41:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 77103104. Throughput: 0: 1687.1, 1: 1727.6. Samples: 19283722. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 05:41:31,053][59242] Avg episode reward: [(0, '28.690'), (1, '29.850')] +[2023-10-09 05:41:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000037856_38764544.pth... +[2023-10-09 05:41:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000037440_38338560.pth... +[2023-10-09 05:41:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000036256_37126144.pth +[2023-10-09 05:41:31,106][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000035840_36700160.pth +[2023-10-09 05:41:32,764][60144] Updated weights for policy 1, policy_version 37862 (0.0007) +[2023-10-09 05:41:33,132][60144] Updated weights for policy 1, policy_version 37872 (0.0009) +[2023-10-09 05:41:33,506][60144] Updated weights for policy 1, policy_version 37882 (0.0008) +[2023-10-09 05:41:34,163][60143] Updated weights for policy 0, policy_version 37442 (0.0007) +[2023-10-09 05:41:34,532][60143] Updated weights for policy 0, policy_version 37452 (0.0007) +[2023-10-09 05:41:34,899][60143] Updated weights for policy 0, policy_version 37462 (0.0009) +[2023-10-09 05:41:35,263][60143] Updated weights for policy 0, policy_version 37472 (0.0011) +[2023-10-09 05:41:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 77168640. Throughput: 0: 1714.0, 1: 1711.0. Samples: 19294446. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 05:41:36,053][59242] Avg episode reward: [(0, '28.360'), (1, '30.060')] +[2023-10-09 05:41:37,478][60144] Updated weights for policy 1, policy_version 37892 (0.0008) +[2023-10-09 05:41:37,857][60144] Updated weights for policy 1, policy_version 37902 (0.0007) +[2023-10-09 05:41:38,220][60144] Updated weights for policy 1, policy_version 37912 (0.0011) +[2023-10-09 05:41:39,277][60143] Updated weights for policy 0, policy_version 37482 (0.0010) +[2023-10-09 05:41:39,654][60143] Updated weights for policy 0, policy_version 37492 (0.0007) +[2023-10-09 05:41:40,017][60143] Updated weights for policy 0, policy_version 37502 (0.0010) +[2023-10-09 05:41:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 77234176. Throughput: 0: 1700.8, 1: 1716.1. Samples: 19314832. Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:41:41,053][59242] Avg episode reward: [(0, '29.990'), (1, '30.860')] +[2023-10-09 05:41:42,118][60144] Updated weights for policy 1, policy_version 37922 (0.0010) +[2023-10-09 05:41:42,488][60144] Updated weights for policy 1, policy_version 37932 (0.0007) +[2023-10-09 05:41:42,852][60144] Updated weights for policy 1, policy_version 37942 (0.0010) +[2023-10-09 05:41:43,224][60144] Updated weights for policy 1, policy_version 37952 (0.0007) +[2023-10-09 05:41:44,035][60143] Updated weights for policy 0, policy_version 37512 (0.0009) +[2023-10-09 05:41:44,413][60143] Updated weights for policy 0, policy_version 37522 (0.0008) +[2023-10-09 05:41:44,778][60143] Updated weights for policy 0, policy_version 37532 (0.0008) +[2023-10-09 05:41:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 77299712. Throughput: 0: 1683.5, 1: 1745.7. Samples: 19335296. 
Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:41:46,053][59242] Avg episode reward: [(0, '29.930'), (1, '30.190')] +[2023-10-09 05:41:47,146][60144] Updated weights for policy 1, policy_version 37962 (0.0009) +[2023-10-09 05:41:47,519][60144] Updated weights for policy 1, policy_version 37972 (0.0008) +[2023-10-09 05:41:47,884][60144] Updated weights for policy 1, policy_version 37982 (0.0008) +[2023-10-09 05:41:48,854][60143] Updated weights for policy 0, policy_version 37542 (0.0009) +[2023-10-09 05:41:49,231][60143] Updated weights for policy 0, policy_version 37552 (0.0009) +[2023-10-09 05:41:49,598][60143] Updated weights for policy 0, policy_version 37562 (0.0010) +[2023-10-09 05:41:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 77365248. Throughput: 0: 1712.1, 1: 1708.0. Samples: 19345704. Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:41:51,053][59242] Avg episode reward: [(0, '28.920'), (1, '30.330')] +[2023-10-09 05:41:51,653][60144] Updated weights for policy 1, policy_version 37992 (0.0010) +[2023-10-09 05:41:52,021][60144] Updated weights for policy 1, policy_version 38002 (0.0009) +[2023-10-09 05:41:52,398][60144] Updated weights for policy 1, policy_version 38012 (0.0009) +[2023-10-09 05:41:53,597][60143] Updated weights for policy 0, policy_version 37572 (0.0009) +[2023-10-09 05:41:53,962][60143] Updated weights for policy 0, policy_version 37582 (0.0010) +[2023-10-09 05:41:54,324][60143] Updated weights for policy 0, policy_version 37592 (0.0010) +[2023-10-09 05:41:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 77430784. Throughput: 0: 1683.9, 1: 1735.7. Samples: 19365978. 
Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:41:56,053][59242] Avg episode reward: [(0, '29.390'), (1, '29.240')] +[2023-10-09 05:41:56,444][60144] Updated weights for policy 1, policy_version 38022 (0.0009) +[2023-10-09 05:41:56,810][60144] Updated weights for policy 1, policy_version 38032 (0.0008) +[2023-10-09 05:41:57,180][60144] Updated weights for policy 1, policy_version 38042 (0.0007) +[2023-10-09 05:41:58,333][60143] Updated weights for policy 0, policy_version 37602 (0.0009) +[2023-10-09 05:41:58,700][60143] Updated weights for policy 0, policy_version 37612 (0.0008) +[2023-10-09 05:41:59,061][60143] Updated weights for policy 0, policy_version 37622 (0.0010) +[2023-10-09 05:41:59,440][60143] Updated weights for policy 0, policy_version 37632 (0.0010) +[2023-10-09 05:42:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 77496320. Throughput: 0: 1690.0, 1: 1736.8. Samples: 19386912. Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:42:01,052][59242] Avg episode reward: [(0, '29.060'), (1, '29.010')] +[2023-10-09 05:42:01,062][60144] Updated weights for policy 1, policy_version 38052 (0.0008) +[2023-10-09 05:42:01,426][60144] Updated weights for policy 1, policy_version 38062 (0.0008) +[2023-10-09 05:42:01,800][60144] Updated weights for policy 1, policy_version 38072 (0.0008) +[2023-10-09 05:42:03,400][60143] Updated weights for policy 0, policy_version 37642 (0.0012) +[2023-10-09 05:42:03,772][60143] Updated weights for policy 0, policy_version 37652 (0.0009) +[2023-10-09 05:42:04,139][60143] Updated weights for policy 0, policy_version 37662 (0.0008) +[2023-10-09 05:42:05,744][60144] Updated weights for policy 1, policy_version 38082 (0.0008) +[2023-10-09 05:42:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 77561856. Throughput: 0: 1699.4, 1: 1719.2. Samples: 19397164. 
Policy #0 lag: (min: 8.0, avg: 33.1, max: 40.0) +[2023-10-09 05:42:06,053][59242] Avg episode reward: [(0, '29.440'), (1, '28.330')] +[2023-10-09 05:42:06,112][60144] Updated weights for policy 1, policy_version 38092 (0.0008) +[2023-10-09 05:42:06,486][60144] Updated weights for policy 1, policy_version 38102 (0.0009) +[2023-10-09 05:42:06,850][60144] Updated weights for policy 1, policy_version 38112 (0.0010) +[2023-10-09 05:42:08,161][60143] Updated weights for policy 0, policy_version 37672 (0.0008) +[2023-10-09 05:42:08,520][60143] Updated weights for policy 0, policy_version 37682 (0.0008) +[2023-10-09 05:42:08,889][60143] Updated weights for policy 0, policy_version 37692 (0.0009) +[2023-10-09 05:42:10,804][60144] Updated weights for policy 1, policy_version 38122 (0.0007) +[2023-10-09 05:42:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 77627392. Throughput: 0: 1682.4, 1: 1741.7. Samples: 19417808. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:11,052][59242] Avg episode reward: [(0, '29.460'), (1, '27.980')] +[2023-10-09 05:42:11,172][60144] Updated weights for policy 1, policy_version 38132 (0.0008) +[2023-10-09 05:42:11,538][60144] Updated weights for policy 1, policy_version 38142 (0.0008) +[2023-10-09 05:42:12,930][60143] Updated weights for policy 0, policy_version 37702 (0.0009) +[2023-10-09 05:42:13,297][60143] Updated weights for policy 0, policy_version 37712 (0.0007) +[2023-10-09 05:42:13,672][60143] Updated weights for policy 0, policy_version 37722 (0.0008) +[2023-10-09 05:42:15,511][60144] Updated weights for policy 1, policy_version 38152 (0.0010) +[2023-10-09 05:42:15,881][60144] Updated weights for policy 1, policy_version 38162 (0.0010) +[2023-10-09 05:42:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 77692928. Throughput: 0: 1713.2, 1: 1730.7. Samples: 19438696. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:16,053][59242] Avg episode reward: [(0, '30.940'), (1, '27.670')] +[2023-10-09 05:42:16,251][60144] Updated weights for policy 1, policy_version 38172 (0.0010) +[2023-10-09 05:42:17,702][60143] Updated weights for policy 0, policy_version 37732 (0.0008) +[2023-10-09 05:42:18,079][60143] Updated weights for policy 0, policy_version 37742 (0.0008) +[2023-10-09 05:42:18,446][60143] Updated weights for policy 0, policy_version 37752 (0.0008) +[2023-10-09 05:42:20,160][60144] Updated weights for policy 1, policy_version 38182 (0.0009) +[2023-10-09 05:42:20,530][60144] Updated weights for policy 1, policy_version 38192 (0.0007) +[2023-10-09 05:42:20,893][60144] Updated weights for policy 1, policy_version 38202 (0.0009) +[2023-10-09 05:42:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 77758464. Throughput: 0: 1695.0, 1: 1736.4. Samples: 19448858. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:21,053][59242] Avg episode reward: [(0, '31.640'), (1, '27.740')] +[2023-10-09 05:42:22,456][60143] Updated weights for policy 0, policy_version 37762 (0.0009) +[2023-10-09 05:42:22,816][60143] Updated weights for policy 0, policy_version 37772 (0.0007) +[2023-10-09 05:42:23,193][60143] Updated weights for policy 0, policy_version 37782 (0.0007) +[2023-10-09 05:42:23,568][60143] Updated weights for policy 0, policy_version 37792 (0.0007) +[2023-10-09 05:42:24,702][60144] Updated weights for policy 1, policy_version 38212 (0.0007) +[2023-10-09 05:42:25,070][60144] Updated weights for policy 1, policy_version 38222 (0.0007) +[2023-10-09 05:42:25,434][60144] Updated weights for policy 1, policy_version 38232 (0.0008) +[2023-10-09 05:42:26,052][59242] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 77856768. Throughput: 0: 1698.4, 1: 1743.3. Samples: 19469708. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:26,053][59242] Avg episode reward: [(0, '30.030'), (1, '26.040')] +[2023-10-09 05:42:27,404][60143] Updated weights for policy 0, policy_version 37802 (0.0009) +[2023-10-09 05:42:27,775][60143] Updated weights for policy 0, policy_version 37812 (0.0011) +[2023-10-09 05:42:28,136][60143] Updated weights for policy 0, policy_version 37822 (0.0009) +[2023-10-09 05:42:29,195][60144] Updated weights for policy 1, policy_version 38242 (0.0009) +[2023-10-09 05:42:29,564][60144] Updated weights for policy 1, policy_version 38252 (0.0009) +[2023-10-09 05:42:29,935][60144] Updated weights for policy 1, policy_version 38262 (0.0007) +[2023-10-09 05:42:30,304][60144] Updated weights for policy 1, policy_version 38272 (0.0010) +[2023-10-09 05:42:31,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 77922304. Throughput: 0: 1717.4, 1: 1715.2. Samples: 19489766. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:31,052][59242] Avg episode reward: [(0, '29.990'), (1, '26.440')] +[2023-10-09 05:42:32,140][60143] Updated weights for policy 0, policy_version 37832 (0.0008) +[2023-10-09 05:42:32,514][60143] Updated weights for policy 0, policy_version 37842 (0.0007) +[2023-10-09 05:42:32,888][60143] Updated weights for policy 0, policy_version 37852 (0.0007) +[2023-10-09 05:42:34,299][60144] Updated weights for policy 1, policy_version 38282 (0.0008) +[2023-10-09 05:42:34,667][60144] Updated weights for policy 1, policy_version 38292 (0.0007) +[2023-10-09 05:42:35,024][60144] Updated weights for policy 1, policy_version 38302 (0.0008) +[2023-10-09 05:42:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 77987840. Throughput: 0: 1686.5, 1: 1753.7. Samples: 19500510. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:42:36,052][59242] Avg episode reward: [(0, '30.180'), (1, '26.960')] +[2023-10-09 05:42:36,716][60143] Updated weights for policy 0, policy_version 37862 (0.0007) +[2023-10-09 05:42:37,100][60143] Updated weights for policy 0, policy_version 37872 (0.0007) +[2023-10-09 05:42:37,477][60143] Updated weights for policy 0, policy_version 37882 (0.0009) +[2023-10-09 05:42:38,823][60144] Updated weights for policy 1, policy_version 38312 (0.0008) +[2023-10-09 05:42:39,184][60144] Updated weights for policy 1, policy_version 38322 (0.0010) +[2023-10-09 05:42:39,548][60144] Updated weights for policy 1, policy_version 38332 (0.0008) +[2023-10-09 05:42:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78053376. Throughput: 0: 1715.8, 1: 1722.3. Samples: 19520692. Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:42:41,053][59242] Avg episode reward: [(0, '30.430'), (1, '26.710')] +[2023-10-09 05:42:41,509][60143] Updated weights for policy 0, policy_version 37892 (0.0008) +[2023-10-09 05:42:41,895][60143] Updated weights for policy 0, policy_version 37902 (0.0010) +[2023-10-09 05:42:42,276][60143] Updated weights for policy 0, policy_version 37912 (0.0007) +[2023-10-09 05:42:43,632][60144] Updated weights for policy 1, policy_version 38342 (0.0008) +[2023-10-09 05:42:44,007][60144] Updated weights for policy 1, policy_version 38352 (0.0011) +[2023-10-09 05:42:44,385][60144] Updated weights for policy 1, policy_version 38362 (0.0008) +[2023-10-09 05:42:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78118912. Throughput: 0: 1715.1, 1: 1718.4. Samples: 19541416. 
Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:42:46,053][59242] Avg episode reward: [(0, '30.200'), (1, '26.380')] +[2023-10-09 05:42:46,302][60143] Updated weights for policy 0, policy_version 37922 (0.0008) +[2023-10-09 05:42:46,676][60143] Updated weights for policy 0, policy_version 37932 (0.0007) +[2023-10-09 05:42:47,047][60143] Updated weights for policy 0, policy_version 37942 (0.0008) +[2023-10-09 05:42:47,417][60143] Updated weights for policy 0, policy_version 37952 (0.0010) +[2023-10-09 05:42:48,218][60144] Updated weights for policy 1, policy_version 38372 (0.0009) +[2023-10-09 05:42:48,591][60144] Updated weights for policy 1, policy_version 38382 (0.0010) +[2023-10-09 05:42:48,962][60144] Updated weights for policy 1, policy_version 38392 (0.0009) +[2023-10-09 05:42:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 78184448. Throughput: 0: 1692.3, 1: 1736.9. Samples: 19551478. Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:42:51,053][59242] Avg episode reward: [(0, '30.190'), (1, '26.850')] +[2023-10-09 05:42:51,646][60143] Updated weights for policy 0, policy_version 37962 (0.0007) +[2023-10-09 05:42:52,026][60143] Updated weights for policy 0, policy_version 37972 (0.0007) +[2023-10-09 05:42:52,386][60143] Updated weights for policy 0, policy_version 37982 (0.0010) +[2023-10-09 05:42:52,807][60144] Updated weights for policy 1, policy_version 38402 (0.0010) +[2023-10-09 05:42:53,167][60144] Updated weights for policy 1, policy_version 38412 (0.0008) +[2023-10-09 05:42:53,537][60144] Updated weights for policy 1, policy_version 38422 (0.0008) +[2023-10-09 05:42:53,903][60144] Updated weights for policy 1, policy_version 38432 (0.0011) +[2023-10-09 05:42:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 78249984. Throughput: 0: 1711.0, 1: 1719.0. Samples: 19572160. 
Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:42:56,053][59242] Avg episode reward: [(0, '29.160'), (1, '28.000')] +[2023-10-09 05:42:56,372][60143] Updated weights for policy 0, policy_version 37992 (0.0008) +[2023-10-09 05:42:56,741][60143] Updated weights for policy 0, policy_version 38002 (0.0009) +[2023-10-09 05:42:57,110][60143] Updated weights for policy 0, policy_version 38012 (0.0008) +[2023-10-09 05:42:57,696][60144] Updated weights for policy 1, policy_version 38442 (0.0008) +[2023-10-09 05:42:58,069][60144] Updated weights for policy 1, policy_version 38452 (0.0010) +[2023-10-09 05:42:58,439][60144] Updated weights for policy 1, policy_version 38462 (0.0010) +[2023-10-09 05:43:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 78315520. Throughput: 0: 1707.0, 1: 1729.4. Samples: 19593334. Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:43:01,053][59242] Avg episode reward: [(0, '30.600'), (1, '28.680')] +[2023-10-09 05:43:01,196][60143] Updated weights for policy 0, policy_version 38022 (0.0008) +[2023-10-09 05:43:01,570][60143] Updated weights for policy 0, policy_version 38032 (0.0009) +[2023-10-09 05:43:01,927][60143] Updated weights for policy 0, policy_version 38042 (0.0008) +[2023-10-09 05:43:02,471][60144] Updated weights for policy 1, policy_version 38472 (0.0010) +[2023-10-09 05:43:02,834][60144] Updated weights for policy 1, policy_version 38482 (0.0008) +[2023-10-09 05:43:03,204][60144] Updated weights for policy 1, policy_version 38492 (0.0007) +[2023-10-09 05:43:05,844][60143] Updated weights for policy 0, policy_version 38052 (0.0008) +[2023-10-09 05:43:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 78381056. Throughput: 0: 1696.9, 1: 1718.6. Samples: 19602556. 
Policy #0 lag: (min: 25.0, avg: 40.7, max: 57.0) +[2023-10-09 05:43:06,052][59242] Avg episode reward: [(0, '30.300'), (1, '30.320')] +[2023-10-09 05:43:06,225][60143] Updated weights for policy 0, policy_version 38062 (0.0009) +[2023-10-09 05:43:06,591][60143] Updated weights for policy 0, policy_version 38072 (0.0011) +[2023-10-09 05:43:07,255][60144] Updated weights for policy 1, policy_version 38502 (0.0007) +[2023-10-09 05:43:07,638][60144] Updated weights for policy 1, policy_version 38512 (0.0007) +[2023-10-09 05:43:08,006][60144] Updated weights for policy 1, policy_version 38522 (0.0007) +[2023-10-09 05:43:10,583][60143] Updated weights for policy 0, policy_version 38082 (0.0010) +[2023-10-09 05:43:10,944][60143] Updated weights for policy 0, policy_version 38092 (0.0010) +[2023-10-09 05:43:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 78446592. Throughput: 0: 1708.5, 1: 1714.3. Samples: 19623734. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-09 05:43:11,053][59242] Avg episode reward: [(0, '30.400'), (1, '29.520')] +[2023-10-09 05:43:11,315][60143] Updated weights for policy 0, policy_version 38102 (0.0009) +[2023-10-09 05:43:11,684][60143] Updated weights for policy 0, policy_version 38112 (0.0008) +[2023-10-09 05:43:12,092][60144] Updated weights for policy 1, policy_version 38532 (0.0008) +[2023-10-09 05:43:12,461][60144] Updated weights for policy 1, policy_version 38542 (0.0007) +[2023-10-09 05:43:12,828][60144] Updated weights for policy 1, policy_version 38552 (0.0008) +[2023-10-09 05:43:15,600][60143] Updated weights for policy 0, policy_version 38122 (0.0008) +[2023-10-09 05:43:15,979][60143] Updated weights for policy 0, policy_version 38132 (0.0007) +[2023-10-09 05:43:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 78512128. Throughput: 0: 1708.2, 1: 1741.1. Samples: 19644986. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-09 05:43:16,052][59242] Avg episode reward: [(0, '31.630'), (1, '30.200')] +[2023-10-09 05:43:16,345][60143] Updated weights for policy 0, policy_version 38142 (0.0007) +[2023-10-09 05:43:16,754][60144] Updated weights for policy 1, policy_version 38562 (0.0008) +[2023-10-09 05:43:17,124][60144] Updated weights for policy 1, policy_version 38572 (0.0010) +[2023-10-09 05:43:17,492][60144] Updated weights for policy 1, policy_version 38582 (0.0010) +[2023-10-09 05:43:17,853][60144] Updated weights for policy 1, policy_version 38592 (0.0010) +[2023-10-09 05:43:20,217][60143] Updated weights for policy 0, policy_version 38152 (0.0010) +[2023-10-09 05:43:20,591][60143] Updated weights for policy 0, policy_version 38162 (0.0008) +[2023-10-09 05:43:20,956][60143] Updated weights for policy 0, policy_version 38172 (0.0008) +[2023-10-09 05:43:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 78577664. Throughput: 0: 1719.3, 1: 1704.8. Samples: 19654592. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-09 05:43:21,053][59242] Avg episode reward: [(0, '31.510'), (1, '30.950')] +[2023-10-09 05:43:21,870][60144] Updated weights for policy 1, policy_version 38602 (0.0007) +[2023-10-09 05:43:22,249][60144] Updated weights for policy 1, policy_version 38612 (0.0007) +[2023-10-09 05:43:22,618][60144] Updated weights for policy 1, policy_version 38622 (0.0007) +[2023-10-09 05:43:24,982][60143] Updated weights for policy 0, policy_version 38182 (0.0010) +[2023-10-09 05:43:25,353][60143] Updated weights for policy 0, policy_version 38192 (0.0008) +[2023-10-09 05:43:25,717][60143] Updated weights for policy 0, policy_version 38202 (0.0010) +[2023-10-09 05:43:26,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78675968. Throughput: 0: 1715.9, 1: 1732.3. Samples: 19675858. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-09 05:43:26,053][59242] Avg episode reward: [(0, '31.190'), (1, '30.610')] +[2023-10-09 05:43:26,480][60144] Updated weights for policy 1, policy_version 38632 (0.0009) +[2023-10-09 05:43:26,859][60144] Updated weights for policy 1, policy_version 38642 (0.0011) +[2023-10-09 05:43:27,230][60144] Updated weights for policy 1, policy_version 38652 (0.0011) +[2023-10-09 05:43:29,716][60143] Updated weights for policy 0, policy_version 38212 (0.0009) +[2023-10-09 05:43:30,112][60143] Updated weights for policy 0, policy_version 38222 (0.0008) +[2023-10-09 05:43:30,473][60143] Updated weights for policy 0, policy_version 38232 (0.0008) +[2023-10-09 05:43:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78741504. Throughput: 0: 1698.0, 1: 1740.6. Samples: 19696152. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-09 05:43:31,053][59242] Avg episode reward: [(0, '30.790'), (1, '30.750')] +[2023-10-09 05:43:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000038240_39157760.pth... +[2023-10-09 05:43:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000036640_37519360.pth +[2023-10-09 05:43:31,215][60144] Updated weights for policy 1, policy_version 38662 (0.0007) +[2023-10-09 05:43:31,589][60144] Updated weights for policy 1, policy_version 38672 (0.0008) +[2023-10-09 05:43:31,959][60144] Updated weights for policy 1, policy_version 38682 (0.0007) +[2023-10-09 05:43:32,173][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000038688_39616512.pth... 
+[2023-10-09 05:43:32,203][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000037056_37945344.pth +[2023-10-09 05:43:34,505][60143] Updated weights for policy 0, policy_version 38242 (0.0008) +[2023-10-09 05:43:34,877][60143] Updated weights for policy 0, policy_version 38252 (0.0007) +[2023-10-09 05:43:35,247][60143] Updated weights for policy 0, policy_version 38262 (0.0009) +[2023-10-09 05:43:35,620][60143] Updated weights for policy 0, policy_version 38272 (0.0010) +[2023-10-09 05:43:35,875][60144] Updated weights for policy 1, policy_version 38692 (0.0009) +[2023-10-09 05:43:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78807040. Throughput: 0: 1720.0, 1: 1718.6. Samples: 19706218. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:43:36,053][59242] Avg episode reward: [(0, '32.660'), (1, '29.510')] +[2023-10-09 05:43:36,235][60144] Updated weights for policy 1, policy_version 38702 (0.0008) +[2023-10-09 05:43:36,598][60144] Updated weights for policy 1, policy_version 38712 (0.0007) +[2023-10-09 05:43:39,666][60143] Updated weights for policy 0, policy_version 38282 (0.0009) +[2023-10-09 05:43:40,039][60143] Updated weights for policy 0, policy_version 38292 (0.0007) +[2023-10-09 05:43:40,365][60144] Updated weights for policy 1, policy_version 38722 (0.0007) +[2023-10-09 05:43:40,420][60143] Updated weights for policy 0, policy_version 38302 (0.0009) +[2023-10-09 05:43:40,726][60144] Updated weights for policy 1, policy_version 38732 (0.0007) +[2023-10-09 05:43:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 78872576. Throughput: 0: 1711.2, 1: 1737.5. Samples: 19727352. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:43:41,053][59242] Avg episode reward: [(0, '31.540'), (1, '30.150')] +[2023-10-09 05:43:41,097][60144] Updated weights for policy 1, policy_version 38742 (0.0009) +[2023-10-09 05:43:41,461][60144] Updated weights for policy 1, policy_version 38752 (0.0010) +[2023-10-09 05:43:44,173][60143] Updated weights for policy 0, policy_version 38312 (0.0008) +[2023-10-09 05:43:44,543][60143] Updated weights for policy 0, policy_version 38322 (0.0007) +[2023-10-09 05:43:44,911][60143] Updated weights for policy 0, policy_version 38332 (0.0007) +[2023-10-09 05:43:45,273][60144] Updated weights for policy 1, policy_version 38762 (0.0008) +[2023-10-09 05:43:45,636][60144] Updated weights for policy 1, policy_version 38772 (0.0007) +[2023-10-09 05:43:45,999][60144] Updated weights for policy 1, policy_version 38782 (0.0009) +[2023-10-09 05:43:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 78938112. Throughput: 0: 1694.5, 1: 1724.6. Samples: 19747194. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:43:46,052][59242] Avg episode reward: [(0, '31.430'), (1, '29.720')] +[2023-10-09 05:43:48,674][60143] Updated weights for policy 0, policy_version 38342 (0.0009) +[2023-10-09 05:43:49,038][60143] Updated weights for policy 0, policy_version 38352 (0.0008) +[2023-10-09 05:43:49,405][60143] Updated weights for policy 0, policy_version 38362 (0.0009) +[2023-10-09 05:43:50,002][60144] Updated weights for policy 1, policy_version 38792 (0.0008) +[2023-10-09 05:43:50,371][60144] Updated weights for policy 1, policy_version 38802 (0.0008) +[2023-10-09 05:43:50,736][60144] Updated weights for policy 1, policy_version 38812 (0.0007) +[2023-10-09 05:43:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 79036416. Throughput: 0: 1728.9, 1: 1738.3. Samples: 19758582. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:43:51,053][59242] Avg episode reward: [(0, '30.480'), (1, '29.370')] +[2023-10-09 05:43:53,260][60143] Updated weights for policy 0, policy_version 38372 (0.0009) +[2023-10-09 05:43:53,643][60143] Updated weights for policy 0, policy_version 38382 (0.0010) +[2023-10-09 05:43:54,008][60143] Updated weights for policy 0, policy_version 38392 (0.0009) +[2023-10-09 05:43:54,590][60144] Updated weights for policy 1, policy_version 38822 (0.0009) +[2023-10-09 05:43:54,960][60144] Updated weights for policy 1, policy_version 38832 (0.0009) +[2023-10-09 05:43:55,324][60144] Updated weights for policy 1, policy_version 38842 (0.0009) +[2023-10-09 05:43:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 79101952. Throughput: 0: 1704.0, 1: 1740.5. Samples: 19778740. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:43:56,053][59242] Avg episode reward: [(0, '31.590'), (1, '29.540')] +[2023-10-09 05:43:57,978][60143] Updated weights for policy 0, policy_version 38402 (0.0009) +[2023-10-09 05:43:58,350][60143] Updated weights for policy 0, policy_version 38412 (0.0009) +[2023-10-09 05:43:58,720][60143] Updated weights for policy 0, policy_version 38422 (0.0008) +[2023-10-09 05:43:59,094][60143] Updated weights for policy 0, policy_version 38432 (0.0008) +[2023-10-09 05:43:59,293][60144] Updated weights for policy 1, policy_version 38852 (0.0011) +[2023-10-09 05:43:59,668][60144] Updated weights for policy 1, policy_version 38862 (0.0011) +[2023-10-09 05:44:00,030][60144] Updated weights for policy 1, policy_version 38872 (0.0008) +[2023-10-09 05:44:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 79167488. Throughput: 0: 1711.4, 1: 1716.0. Samples: 19799218. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-09 05:44:01,053][59242] Avg episode reward: [(0, '30.340'), (1, '28.990')] +[2023-10-09 05:44:02,909][60143] Updated weights for policy 0, policy_version 38442 (0.0009) +[2023-10-09 05:44:03,288][60143] Updated weights for policy 0, policy_version 38452 (0.0009) +[2023-10-09 05:44:03,653][60143] Updated weights for policy 0, policy_version 38462 (0.0007) +[2023-10-09 05:44:03,945][60144] Updated weights for policy 1, policy_version 38882 (0.0007) +[2023-10-09 05:44:04,316][60144] Updated weights for policy 1, policy_version 38892 (0.0008) +[2023-10-09 05:44:04,681][60144] Updated weights for policy 1, policy_version 38902 (0.0008) +[2023-10-09 05:44:05,053][60144] Updated weights for policy 1, policy_version 38912 (0.0007) +[2023-10-09 05:44:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 79233024. Throughput: 0: 1713.0, 1: 1748.2. Samples: 19810346. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:06,052][59242] Avg episode reward: [(0, '33.030'), (1, '29.140')] +[2023-10-09 05:44:06,053][59934] Saving new best policy, reward=33.030! +[2023-10-09 05:44:07,572][60143] Updated weights for policy 0, policy_version 38472 (0.0009) +[2023-10-09 05:44:07,946][60143] Updated weights for policy 0, policy_version 38482 (0.0009) +[2023-10-09 05:44:08,307][60143] Updated weights for policy 0, policy_version 38492 (0.0007) +[2023-10-09 05:44:08,944][60144] Updated weights for policy 1, policy_version 38922 (0.0008) +[2023-10-09 05:44:09,311][60144] Updated weights for policy 1, policy_version 38932 (0.0007) +[2023-10-09 05:44:09,681][60144] Updated weights for policy 1, policy_version 38942 (0.0009) +[2023-10-09 05:44:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 79298560. Throughput: 0: 1706.5, 1: 1725.3. Samples: 19830288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:11,053][59242] Avg episode reward: [(0, '31.860'), (1, '29.270')] +[2023-10-09 05:44:12,305][60143] Updated weights for policy 0, policy_version 38502 (0.0007) +[2023-10-09 05:44:12,671][60143] Updated weights for policy 0, policy_version 38512 (0.0007) +[2023-10-09 05:44:13,036][60143] Updated weights for policy 0, policy_version 38522 (0.0010) +[2023-10-09 05:44:13,627][60144] Updated weights for policy 1, policy_version 38952 (0.0007) +[2023-10-09 05:44:13,996][60144] Updated weights for policy 1, policy_version 38962 (0.0008) +[2023-10-09 05:44:14,360][60144] Updated weights for policy 1, policy_version 38972 (0.0008) +[2023-10-09 05:44:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 79364096. Throughput: 0: 1728.3, 1: 1716.9. Samples: 19851188. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:16,053][59242] Avg episode reward: [(0, '31.630'), (1, '27.260')] +[2023-10-09 05:44:17,216][60143] Updated weights for policy 0, policy_version 38532 (0.0011) +[2023-10-09 05:44:17,627][60143] Updated weights for policy 0, policy_version 38542 (0.0011) +[2023-10-09 05:44:17,991][60143] Updated weights for policy 0, policy_version 38552 (0.0010) +[2023-10-09 05:44:18,445][60144] Updated weights for policy 1, policy_version 38982 (0.0009) +[2023-10-09 05:44:18,806][60144] Updated weights for policy 1, policy_version 38992 (0.0010) +[2023-10-09 05:44:19,167][60144] Updated weights for policy 1, policy_version 39002 (0.0009) +[2023-10-09 05:44:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 79429632. Throughput: 0: 1704.8, 1: 1738.6. Samples: 19861172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:21,053][59242] Avg episode reward: [(0, '31.070'), (1, '26.680')] +[2023-10-09 05:44:21,888][60143] Updated weights for policy 0, policy_version 38562 (0.0007) +[2023-10-09 05:44:22,255][60143] Updated weights for policy 0, policy_version 38572 (0.0008) +[2023-10-09 05:44:22,621][60143] Updated weights for policy 0, policy_version 38582 (0.0007) +[2023-10-09 05:44:22,989][60143] Updated weights for policy 0, policy_version 38592 (0.0007) +[2023-10-09 05:44:23,048][60144] Updated weights for policy 1, policy_version 39012 (0.0007) +[2023-10-09 05:44:23,418][60144] Updated weights for policy 1, policy_version 39022 (0.0008) +[2023-10-09 05:44:23,776][60144] Updated weights for policy 1, policy_version 39032 (0.0007) +[2023-10-09 05:44:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 79495168. Throughput: 0: 1710.5, 1: 1716.6. Samples: 19881574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:26,052][59242] Avg episode reward: [(0, '30.550'), (1, '25.720')] +[2023-10-09 05:44:27,091][60143] Updated weights for policy 0, policy_version 38602 (0.0007) +[2023-10-09 05:44:27,473][60143] Updated weights for policy 0, policy_version 38612 (0.0007) +[2023-10-09 05:44:27,721][60144] Updated weights for policy 1, policy_version 39042 (0.0007) +[2023-10-09 05:44:27,837][60143] Updated weights for policy 0, policy_version 38622 (0.0008) +[2023-10-09 05:44:28,088][60144] Updated weights for policy 1, policy_version 39052 (0.0009) +[2023-10-09 05:44:28,455][60144] Updated weights for policy 1, policy_version 39062 (0.0008) +[2023-10-09 05:44:28,826][60144] Updated weights for policy 1, policy_version 39072 (0.0008) +[2023-10-09 05:44:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 79560704. Throughput: 0: 1729.3, 1: 1728.0. Samples: 19902776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:44:31,053][59242] Avg episode reward: [(0, '28.310'), (1, '25.950')] +[2023-10-09 05:44:31,877][60143] Updated weights for policy 0, policy_version 38632 (0.0008) +[2023-10-09 05:44:32,250][60143] Updated weights for policy 0, policy_version 38642 (0.0007) +[2023-10-09 05:44:32,623][60143] Updated weights for policy 0, policy_version 38652 (0.0008) +[2023-10-09 05:44:32,723][60144] Updated weights for policy 1, policy_version 39082 (0.0007) +[2023-10-09 05:44:33,083][60144] Updated weights for policy 1, policy_version 39092 (0.0010) +[2023-10-09 05:44:33,457][60144] Updated weights for policy 1, policy_version 39102 (0.0007) +[2023-10-09 05:44:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 79626240. Throughput: 0: 1696.4, 1: 1717.4. Samples: 19912202. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:44:36,053][59242] Avg episode reward: [(0, '27.160'), (1, '26.530')] +[2023-10-09 05:44:36,637][60143] Updated weights for policy 0, policy_version 38662 (0.0008) +[2023-10-09 05:44:37,007][60143] Updated weights for policy 0, policy_version 38672 (0.0010) +[2023-10-09 05:44:37,382][60143] Updated weights for policy 0, policy_version 38682 (0.0008) +[2023-10-09 05:44:37,434][60144] Updated weights for policy 1, policy_version 39112 (0.0008) +[2023-10-09 05:44:37,799][60144] Updated weights for policy 1, policy_version 39122 (0.0008) +[2023-10-09 05:44:38,173][60144] Updated weights for policy 1, policy_version 39132 (0.0007) +[2023-10-09 05:44:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 79691776. Throughput: 0: 1720.8, 1: 1717.2. Samples: 19933446. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:44:41,053][59242] Avg episode reward: [(0, '28.540'), (1, '26.650')] +[2023-10-09 05:44:41,319][60143] Updated weights for policy 0, policy_version 38692 (0.0009) +[2023-10-09 05:44:41,696][60143] Updated weights for policy 0, policy_version 38702 (0.0009) +[2023-10-09 05:44:42,068][60143] Updated weights for policy 0, policy_version 38712 (0.0010) +[2023-10-09 05:44:42,250][60144] Updated weights for policy 1, policy_version 39142 (0.0008) +[2023-10-09 05:44:42,614][60144] Updated weights for policy 1, policy_version 39152 (0.0008) +[2023-10-09 05:44:42,978][60144] Updated weights for policy 1, policy_version 39162 (0.0007) +[2023-10-09 05:44:45,987][60143] Updated weights for policy 0, policy_version 38722 (0.0009) +[2023-10-09 05:44:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 79757312. Throughput: 0: 1714.2, 1: 1740.8. Samples: 19954692. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:44:46,053][59242] Avg episode reward: [(0, '28.860'), (1, '28.160')] +[2023-10-09 05:44:46,359][60143] Updated weights for policy 0, policy_version 38732 (0.0008) +[2023-10-09 05:44:46,731][60143] Updated weights for policy 0, policy_version 38742 (0.0008) +[2023-10-09 05:44:46,892][60144] Updated weights for policy 1, policy_version 39172 (0.0007) +[2023-10-09 05:44:47,106][60143] Updated weights for policy 0, policy_version 38752 (0.0008) +[2023-10-09 05:44:47,259][60144] Updated weights for policy 1, policy_version 39182 (0.0009) +[2023-10-09 05:44:47,630][60144] Updated weights for policy 1, policy_version 39192 (0.0009) +[2023-10-09 05:44:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 79822848. Throughput: 0: 1702.5, 1: 1711.3. Samples: 19963966. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:44:51,053][59242] Avg episode reward: [(0, '27.290'), (1, '27.020')] +[2023-10-09 05:44:51,204][60143] Updated weights for policy 0, policy_version 38762 (0.0011) +[2023-10-09 05:44:51,460][60144] Updated weights for policy 1, policy_version 39202 (0.0010) +[2023-10-09 05:44:51,579][60143] Updated weights for policy 0, policy_version 38772 (0.0008) +[2023-10-09 05:44:51,830][60144] Updated weights for policy 1, policy_version 39212 (0.0009) +[2023-10-09 05:44:51,958][60143] Updated weights for policy 0, policy_version 38782 (0.0008) +[2023-10-09 05:44:52,194][60144] Updated weights for policy 1, policy_version 39222 (0.0009) +[2023-10-09 05:44:52,550][60144] Updated weights for policy 1, policy_version 39232 (0.0007) +[2023-10-09 05:44:56,029][60143] Updated weights for policy 0, policy_version 38792 (0.0009) +[2023-10-09 05:44:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 79888384. Throughput: 0: 1703.4, 1: 1738.1. Samples: 19985158. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:44:56,053][59242] Avg episode reward: [(0, '26.630'), (1, '28.850')] +[2023-10-09 05:44:56,395][60143] Updated weights for policy 0, policy_version 38802 (0.0009) +[2023-10-09 05:44:56,510][60144] Updated weights for policy 1, policy_version 39242 (0.0008) +[2023-10-09 05:44:56,767][60143] Updated weights for policy 0, policy_version 38812 (0.0008) +[2023-10-09 05:44:56,872][60144] Updated weights for policy 1, policy_version 39252 (0.0008) +[2023-10-09 05:44:57,238][60144] Updated weights for policy 1, policy_version 39262 (0.0010) +[2023-10-09 05:45:00,869][60143] Updated weights for policy 0, policy_version 38822 (0.0008) +[2023-10-09 05:45:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 79953920. Throughput: 0: 1704.0, 1: 1738.1. Samples: 20006084. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) +[2023-10-09 05:45:01,053][59242] Avg episode reward: [(0, '26.520'), (1, '28.510')] +[2023-10-09 05:45:01,255][60143] Updated weights for policy 0, policy_version 38832 (0.0008) +[2023-10-09 05:45:01,337][60144] Updated weights for policy 1, policy_version 39272 (0.0007) +[2023-10-09 05:45:01,619][60143] Updated weights for policy 0, policy_version 38842 (0.0009) +[2023-10-09 05:45:01,701][60144] Updated weights for policy 1, policy_version 39282 (0.0007) +[2023-10-09 05:45:02,070][60144] Updated weights for policy 1, policy_version 39292 (0.0007) +[2023-10-09 05:45:05,631][60143] Updated weights for policy 0, policy_version 38852 (0.0009) +[2023-10-09 05:45:05,907][60144] Updated weights for policy 1, policy_version 39302 (0.0009) +[2023-10-09 05:45:06,026][60143] Updated weights for policy 0, policy_version 38862 (0.0007) +[2023-10-09 05:45:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 80019456. Throughput: 0: 1707.6, 1: 1718.4. Samples: 20015342. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:06,053][59242] Avg episode reward: [(0, '27.000'), (1, '30.020')] +[2023-10-09 05:45:06,264][60144] Updated weights for policy 1, policy_version 39312 (0.0007) +[2023-10-09 05:45:06,396][60143] Updated weights for policy 0, policy_version 38872 (0.0008) +[2023-10-09 05:45:06,634][60144] Updated weights for policy 1, policy_version 39322 (0.0010) +[2023-10-09 05:45:10,526][60143] Updated weights for policy 0, policy_version 38882 (0.0007) +[2023-10-09 05:45:10,624][60144] Updated weights for policy 1, policy_version 39332 (0.0007) +[2023-10-09 05:45:10,893][60143] Updated weights for policy 0, policy_version 38892 (0.0008) +[2023-10-09 05:45:10,981][60144] Updated weights for policy 1, policy_version 39342 (0.0009) +[2023-10-09 05:45:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 80084992. 
Throughput: 0: 1702.0, 1: 1736.0. Samples: 20036288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:11,052][59242] Avg episode reward: [(0, '27.580'), (1, '28.580')] +[2023-10-09 05:45:11,261][60143] Updated weights for policy 0, policy_version 38902 (0.0008) +[2023-10-09 05:45:11,346][60144] Updated weights for policy 1, policy_version 39352 (0.0007) +[2023-10-09 05:45:11,627][60143] Updated weights for policy 0, policy_version 38912 (0.0007) +[2023-10-09 05:45:15,251][60144] Updated weights for policy 1, policy_version 39362 (0.0007) +[2023-10-09 05:45:15,623][60144] Updated weights for policy 1, policy_version 39372 (0.0009) +[2023-10-09 05:45:15,753][60143] Updated weights for policy 0, policy_version 38922 (0.0010) +[2023-10-09 05:45:15,988][60144] Updated weights for policy 1, policy_version 39382 (0.0007) +[2023-10-09 05:45:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 80150528. Throughput: 0: 1694.5, 1: 1726.3. Samples: 20056714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:16,052][59242] Avg episode reward: [(0, '28.410'), (1, '28.490')] +[2023-10-09 05:45:16,128][60143] Updated weights for policy 0, policy_version 38932 (0.0009) +[2023-10-09 05:45:16,352][60144] Updated weights for policy 1, policy_version 39392 (0.0008) +[2023-10-09 05:45:16,489][60143] Updated weights for policy 0, policy_version 38942 (0.0008) +[2023-10-09 05:45:20,405][60143] Updated weights for policy 0, policy_version 38952 (0.0009) +[2023-10-09 05:45:20,488][60144] Updated weights for policy 1, policy_version 39402 (0.0007) +[2023-10-09 05:45:20,763][60143] Updated weights for policy 0, policy_version 38962 (0.0009) +[2023-10-09 05:45:20,857][60144] Updated weights for policy 1, policy_version 39412 (0.0008) +[2023-10-09 05:45:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 80216064. Throughput: 0: 1697.1, 1: 1732.0. Samples: 20066514. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:21,053][59242] Avg episode reward: [(0, '28.780'), (1, '28.560')] +[2023-10-09 05:45:21,131][60143] Updated weights for policy 0, policy_version 38972 (0.0008) +[2023-10-09 05:45:21,222][60144] Updated weights for policy 1, policy_version 39422 (0.0007) +[2023-10-09 05:45:25,017][60144] Updated weights for policy 1, policy_version 39432 (0.0009) +[2023-10-09 05:45:25,238][60143] Updated weights for policy 0, policy_version 38982 (0.0008) +[2023-10-09 05:45:25,389][60144] Updated weights for policy 1, policy_version 39442 (0.0008) +[2023-10-09 05:45:25,604][60143] Updated weights for policy 0, policy_version 38992 (0.0007) +[2023-10-09 05:45:25,750][60144] Updated weights for policy 1, policy_version 39452 (0.0009) +[2023-10-09 05:45:25,972][60143] Updated weights for policy 0, policy_version 39002 (0.0007) +[2023-10-09 05:45:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 80314368. Throughput: 0: 1692.8, 1: 1738.5. Samples: 20087854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:26,052][59242] Avg episode reward: [(0, '29.530'), (1, '28.990')] +[2023-10-09 05:45:29,708][60144] Updated weights for policy 1, policy_version 39462 (0.0008) +[2023-10-09 05:45:29,966][60143] Updated weights for policy 0, policy_version 39012 (0.0007) +[2023-10-09 05:45:30,082][60144] Updated weights for policy 1, policy_version 39472 (0.0009) +[2023-10-09 05:45:30,334][60143] Updated weights for policy 0, policy_version 39022 (0.0007) +[2023-10-09 05:45:30,439][60144] Updated weights for policy 1, policy_version 39482 (0.0007) +[2023-10-09 05:45:30,701][60143] Updated weights for policy 0, policy_version 39032 (0.0008) +[2023-10-09 05:45:31,052][59242] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 80412672. Throughput: 0: 1676.4, 1: 1709.8. Samples: 20107072. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:31,053][59242] Avg episode reward: [(0, '28.470'), (1, '28.500')] +[2023-10-09 05:45:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000039488_40435712.pth... +[2023-10-09 05:45:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000039040_39976960.pth... +[2023-10-09 05:45:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000037856_38764544.pth +[2023-10-09 05:45:31,105][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000037440_38338560.pth +[2023-10-09 05:45:34,377][60144] Updated weights for policy 1, policy_version 39492 (0.0008) +[2023-10-09 05:45:34,632][60143] Updated weights for policy 0, policy_version 39042 (0.0008) +[2023-10-09 05:45:34,755][60144] Updated weights for policy 1, policy_version 39502 (0.0008) +[2023-10-09 05:45:34,999][60143] Updated weights for policy 0, policy_version 39052 (0.0007) +[2023-10-09 05:45:35,127][60144] Updated weights for policy 1, policy_version 39512 (0.0007) +[2023-10-09 05:45:35,364][60143] Updated weights for policy 0, policy_version 39062 (0.0009) +[2023-10-09 05:45:35,742][60143] Updated weights for policy 0, policy_version 39072 (0.0008) +[2023-10-09 05:45:36,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 80478208. Throughput: 0: 1697.1, 1: 1735.3. Samples: 20118424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:36,053][59242] Avg episode reward: [(0, '29.010'), (1, '28.020')] +[2023-10-09 05:45:39,071][60144] Updated weights for policy 1, policy_version 39522 (0.0007) +[2023-10-09 05:45:39,436][60144] Updated weights for policy 1, policy_version 39532 (0.0008) +[2023-10-09 05:45:39,754][60143] Updated weights for policy 0, policy_version 39082 (0.0007) +[2023-10-09 05:45:39,813][60144] Updated weights for policy 1, policy_version 39542 (0.0007) +[2023-10-09 05:45:40,118][60143] Updated weights for policy 0, policy_version 39092 (0.0008) +[2023-10-09 05:45:40,178][60144] Updated weights for policy 1, policy_version 39552 (0.0008) +[2023-10-09 05:45:40,486][60143] Updated weights for policy 0, policy_version 39102 (0.0009) +[2023-10-09 05:45:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 80543744. Throughput: 0: 1698.0, 1: 1717.1. Samples: 20138836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:41,053][59242] Avg episode reward: [(0, '29.770'), (1, '28.740')] +[2023-10-09 05:45:43,991][60144] Updated weights for policy 1, policy_version 39562 (0.0009) +[2023-10-09 05:45:44,373][60144] Updated weights for policy 1, policy_version 39572 (0.0007) +[2023-10-09 05:45:44,591][60143] Updated weights for policy 0, policy_version 39112 (0.0008) +[2023-10-09 05:45:44,736][60144] Updated weights for policy 1, policy_version 39582 (0.0007) +[2023-10-09 05:45:44,957][60143] Updated weights for policy 0, policy_version 39122 (0.0007) +[2023-10-09 05:45:45,335][60143] Updated weights for policy 0, policy_version 39132 (0.0007) +[2023-10-09 05:45:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 80609280. Throughput: 0: 1669.6, 1: 1706.0. Samples: 20157986. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:46,053][59242] Avg episode reward: [(0, '30.280'), (1, '29.730')] +[2023-10-09 05:45:48,600][60144] Updated weights for policy 1, policy_version 39592 (0.0011) +[2023-10-09 05:45:48,974][60144] Updated weights for policy 1, policy_version 39602 (0.0007) +[2023-10-09 05:45:49,262][60143] Updated weights for policy 0, policy_version 39142 (0.0009) +[2023-10-09 05:45:49,336][60144] Updated weights for policy 1, policy_version 39612 (0.0009) +[2023-10-09 05:45:49,622][60143] Updated weights for policy 0, policy_version 39152 (0.0009) +[2023-10-09 05:45:49,993][60143] Updated weights for policy 0, policy_version 39162 (0.0008) +[2023-10-09 05:45:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 80674816. Throughput: 0: 1697.0, 1: 1731.9. Samples: 20169640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:51,053][59242] Avg episode reward: [(0, '30.500'), (1, '29.580')] +[2023-10-09 05:45:53,269][60144] Updated weights for policy 1, policy_version 39622 (0.0007) +[2023-10-09 05:45:53,637][60144] Updated weights for policy 1, policy_version 39632 (0.0009) +[2023-10-09 05:45:54,009][60144] Updated weights for policy 1, policy_version 39642 (0.0007) +[2023-10-09 05:45:54,197][60143] Updated weights for policy 0, policy_version 39172 (0.0008) +[2023-10-09 05:45:54,587][60143] Updated weights for policy 0, policy_version 39182 (0.0010) +[2023-10-09 05:45:54,963][60143] Updated weights for policy 0, policy_version 39192 (0.0008) +[2023-10-09 05:45:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 80740352. Throughput: 0: 1685.9, 1: 1707.2. Samples: 20188978. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:45:56,052][59242] Avg episode reward: [(0, '30.250'), (1, '30.220')] +[2023-10-09 05:45:57,917][60144] Updated weights for policy 1, policy_version 39652 (0.0008) +[2023-10-09 05:45:58,293][60144] Updated weights for policy 1, policy_version 39662 (0.0007) +[2023-10-09 05:45:58,654][60144] Updated weights for policy 1, policy_version 39672 (0.0007) +[2023-10-09 05:45:58,804][60143] Updated weights for policy 0, policy_version 39202 (0.0009) +[2023-10-09 05:45:59,174][60143] Updated weights for policy 0, policy_version 39212 (0.0007) +[2023-10-09 05:45:59,547][60143] Updated weights for policy 0, policy_version 39222 (0.0007) +[2023-10-09 05:45:59,914][60143] Updated weights for policy 0, policy_version 39232 (0.0008) +[2023-10-09 05:46:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 80805888. Throughput: 0: 1675.1, 1: 1718.8. Samples: 20209442. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:01,053][59242] Avg episode reward: [(0, '30.980'), (1, '29.600')] +[2023-10-09 05:46:02,728][60144] Updated weights for policy 1, policy_version 39682 (0.0008) +[2023-10-09 05:46:03,091][60144] Updated weights for policy 1, policy_version 39692 (0.0011) +[2023-10-09 05:46:03,454][60144] Updated weights for policy 1, policy_version 39702 (0.0009) +[2023-10-09 05:46:03,815][60143] Updated weights for policy 0, policy_version 39242 (0.0008) +[2023-10-09 05:46:03,821][60144] Updated weights for policy 1, policy_version 39712 (0.0009) +[2023-10-09 05:46:04,190][60143] Updated weights for policy 0, policy_version 39252 (0.0010) +[2023-10-09 05:46:04,560][60143] Updated weights for policy 0, policy_version 39262 (0.0009) +[2023-10-09 05:46:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 80871424. Throughput: 0: 1700.2, 1: 1716.8. Samples: 20220278. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:06,053][59242] Avg episode reward: [(0, '31.920'), (1, '27.060')] +[2023-10-09 05:46:07,860][60144] Updated weights for policy 1, policy_version 39722 (0.0007) +[2023-10-09 05:46:08,219][60144] Updated weights for policy 1, policy_version 39732 (0.0008) +[2023-10-09 05:46:08,588][60144] Updated weights for policy 1, policy_version 39742 (0.0008) +[2023-10-09 05:46:08,673][60143] Updated weights for policy 0, policy_version 39272 (0.0009) +[2023-10-09 05:46:09,051][60143] Updated weights for policy 0, policy_version 39282 (0.0009) +[2023-10-09 05:46:09,412][60143] Updated weights for policy 0, policy_version 39292 (0.0009) +[2023-10-09 05:46:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 80936960. Throughput: 0: 1676.2, 1: 1705.1. Samples: 20240016. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:11,053][59242] Avg episode reward: [(0, '31.550'), (1, '26.580')] +[2023-10-09 05:46:12,683][60144] Updated weights for policy 1, policy_version 39752 (0.0007) +[2023-10-09 05:46:13,048][60144] Updated weights for policy 1, policy_version 39762 (0.0009) +[2023-10-09 05:46:13,414][60144] Updated weights for policy 1, policy_version 39772 (0.0007) +[2023-10-09 05:46:13,622][60143] Updated weights for policy 0, policy_version 39302 (0.0008) +[2023-10-09 05:46:13,989][60143] Updated weights for policy 0, policy_version 39312 (0.0011) +[2023-10-09 05:46:14,360][60143] Updated weights for policy 0, policy_version 39322 (0.0011) +[2023-10-09 05:46:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 81002496. Throughput: 0: 1689.2, 1: 1731.2. Samples: 20260988. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:16,052][59242] Avg episode reward: [(0, '33.610'), (1, '26.660')] +[2023-10-09 05:46:16,060][59934] Saving new best policy, reward=33.610! 
+[2023-10-09 05:46:17,300][60144] Updated weights for policy 1, policy_version 39782 (0.0008) +[2023-10-09 05:46:17,667][60144] Updated weights for policy 1, policy_version 39792 (0.0008) +[2023-10-09 05:46:18,029][60144] Updated weights for policy 1, policy_version 39802 (0.0009) +[2023-10-09 05:46:18,208][60143] Updated weights for policy 0, policy_version 39332 (0.0008) +[2023-10-09 05:46:18,575][60143] Updated weights for policy 0, policy_version 39342 (0.0007) +[2023-10-09 05:46:18,940][60143] Updated weights for policy 0, policy_version 39352 (0.0010) +[2023-10-09 05:46:21,053][59242] Fps is (10 sec: 13106.5, 60 sec: 14199.3, 300 sec: 13662.6). Total num frames: 81068032. Throughput: 0: 1691.7, 1: 1704.8. Samples: 20271268. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:21,053][59242] Avg episode reward: [(0, '32.850'), (1, '26.450')] +[2023-10-09 05:46:21,909][60144] Updated weights for policy 1, policy_version 39812 (0.0010) +[2023-10-09 05:46:22,283][60144] Updated weights for policy 1, policy_version 39822 (0.0009) +[2023-10-09 05:46:22,650][60144] Updated weights for policy 1, policy_version 39832 (0.0008) +[2023-10-09 05:46:23,026][60143] Updated weights for policy 0, policy_version 39362 (0.0008) +[2023-10-09 05:46:23,395][60143] Updated weights for policy 0, policy_version 39372 (0.0009) +[2023-10-09 05:46:23,752][60143] Updated weights for policy 0, policy_version 39382 (0.0009) +[2023-10-09 05:46:24,120][60143] Updated weights for policy 0, policy_version 39392 (0.0009) +[2023-10-09 05:46:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 81133568. Throughput: 0: 1672.4, 1: 1723.9. Samples: 20291670. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 05:46:26,052][59242] Avg episode reward: [(0, '34.350'), (1, '26.910')] +[2023-10-09 05:46:26,053][59934] Saving new best policy, reward=34.350! 
+[2023-10-09 05:46:26,430][60144] Updated weights for policy 1, policy_version 39842 (0.0007) +[2023-10-09 05:46:26,792][60144] Updated weights for policy 1, policy_version 39852 (0.0007) +[2023-10-09 05:46:27,168][60144] Updated weights for policy 1, policy_version 39862 (0.0007) +[2023-10-09 05:46:27,535][60144] Updated weights for policy 1, policy_version 39872 (0.0009) +[2023-10-09 05:46:27,952][60143] Updated weights for policy 0, policy_version 39402 (0.0009) +[2023-10-09 05:46:28,320][60143] Updated weights for policy 0, policy_version 39412 (0.0009) +[2023-10-09 05:46:28,686][60143] Updated weights for policy 0, policy_version 39422 (0.0007) +[2023-10-09 05:46:31,052][59242] Fps is (10 sec: 13108.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 81199104. Throughput: 0: 1705.4, 1: 1744.3. Samples: 20313220. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:31,053][59242] Avg episode reward: [(0, '32.700'), (1, '27.840')] +[2023-10-09 05:46:31,555][60144] Updated weights for policy 1, policy_version 39882 (0.0007) +[2023-10-09 05:46:31,918][60144] Updated weights for policy 1, policy_version 39892 (0.0009) +[2023-10-09 05:46:32,286][60144] Updated weights for policy 1, policy_version 39902 (0.0007) +[2023-10-09 05:46:32,857][60143] Updated weights for policy 0, policy_version 39432 (0.0008) +[2023-10-09 05:46:33,229][60143] Updated weights for policy 0, policy_version 39442 (0.0009) +[2023-10-09 05:46:33,595][60143] Updated weights for policy 0, policy_version 39452 (0.0007) +[2023-10-09 05:46:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 81264640. Throughput: 0: 1684.5, 1: 1717.7. Samples: 20322740. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:36,052][59242] Avg episode reward: [(0, '31.870'), (1, '28.420')] +[2023-10-09 05:46:36,266][60144] Updated weights for policy 1, policy_version 39912 (0.0007) +[2023-10-09 05:46:36,643][60144] Updated weights for policy 1, policy_version 39922 (0.0008) +[2023-10-09 05:46:37,009][60144] Updated weights for policy 1, policy_version 39932 (0.0009) +[2023-10-09 05:46:37,642][60143] Updated weights for policy 0, policy_version 39462 (0.0009) +[2023-10-09 05:46:38,008][60143] Updated weights for policy 0, policy_version 39472 (0.0008) +[2023-10-09 05:46:38,374][60143] Updated weights for policy 0, policy_version 39482 (0.0007) +[2023-10-09 05:46:40,848][60144] Updated weights for policy 1, policy_version 39942 (0.0009) +[2023-10-09 05:46:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 81330176. Throughput: 0: 1694.7, 1: 1745.3. Samples: 20343778. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:41,054][59242] Avg episode reward: [(0, '31.970'), (1, '27.670')] +[2023-10-09 05:46:41,221][60144] Updated weights for policy 1, policy_version 39952 (0.0010) +[2023-10-09 05:46:41,584][60144] Updated weights for policy 1, policy_version 39962 (0.0010) +[2023-10-09 05:46:42,372][60143] Updated weights for policy 0, policy_version 39492 (0.0009) +[2023-10-09 05:46:42,772][60143] Updated weights for policy 0, policy_version 39502 (0.0007) +[2023-10-09 05:46:43,144][60143] Updated weights for policy 0, policy_version 39512 (0.0007) +[2023-10-09 05:46:45,588][60144] Updated weights for policy 1, policy_version 39972 (0.0010) +[2023-10-09 05:46:45,947][60144] Updated weights for policy 1, policy_version 39982 (0.0010) +[2023-10-09 05:46:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 81395712. Throughput: 0: 1709.6, 1: 1739.5. Samples: 20364650. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:46,053][59242] Avg episode reward: [(0, '31.770'), (1, '28.650')] +[2023-10-09 05:46:46,319][60144] Updated weights for policy 1, policy_version 39992 (0.0010) +[2023-10-09 05:46:47,098][60143] Updated weights for policy 0, policy_version 39522 (0.0008) +[2023-10-09 05:46:47,467][60143] Updated weights for policy 0, policy_version 39532 (0.0007) +[2023-10-09 05:46:47,832][60143] Updated weights for policy 0, policy_version 39542 (0.0008) +[2023-10-09 05:46:48,199][60143] Updated weights for policy 0, policy_version 39552 (0.0008) +[2023-10-09 05:46:50,275][60144] Updated weights for policy 1, policy_version 40002 (0.0007) +[2023-10-09 05:46:50,651][60144] Updated weights for policy 1, policy_version 40012 (0.0007) +[2023-10-09 05:46:51,019][60144] Updated weights for policy 1, policy_version 40022 (0.0008) +[2023-10-09 05:46:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 81461248. Throughput: 0: 1684.9, 1: 1736.4. Samples: 20374234. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:51,053][59242] Avg episode reward: [(0, '31.380'), (1, '27.980')] +[2023-10-09 05:46:51,383][60144] Updated weights for policy 1, policy_version 40032 (0.0010) +[2023-10-09 05:46:52,233][60143] Updated weights for policy 0, policy_version 39562 (0.0009) +[2023-10-09 05:46:52,601][60143] Updated weights for policy 0, policy_version 39572 (0.0007) +[2023-10-09 05:46:52,970][60143] Updated weights for policy 0, policy_version 39582 (0.0010) +[2023-10-09 05:46:55,321][60144] Updated weights for policy 1, policy_version 40042 (0.0008) +[2023-10-09 05:46:55,683][60144] Updated weights for policy 1, policy_version 40052 (0.0009) +[2023-10-09 05:46:56,051][60144] Updated weights for policy 1, policy_version 40062 (0.0008) +[2023-10-09 05:46:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 81526784. 
Throughput: 0: 1706.5, 1: 1742.4. Samples: 20395218. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-09 05:46:56,053][59242] Avg episode reward: [(0, '32.440'), (1, '28.510')] +[2023-10-09 05:46:57,027][60143] Updated weights for policy 0, policy_version 39592 (0.0010) +[2023-10-09 05:46:57,403][60143] Updated weights for policy 0, policy_version 39602 (0.0009) +[2023-10-09 05:46:57,770][60143] Updated weights for policy 0, policy_version 39612 (0.0008) +[2023-10-09 05:46:59,950][60144] Updated weights for policy 1, policy_version 40072 (0.0008) +[2023-10-09 05:47:00,327][60144] Updated weights for policy 1, policy_version 40082 (0.0008) +[2023-10-09 05:47:00,687][60144] Updated weights for policy 1, policy_version 40092 (0.0008) +[2023-10-09 05:47:01,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 81625088. Throughput: 0: 1714.2, 1: 1719.1. Samples: 20415486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:01,052][59242] Avg episode reward: [(0, '32.990'), (1, '29.200')] +[2023-10-09 05:47:01,625][60143] Updated weights for policy 0, policy_version 39622 (0.0007) +[2023-10-09 05:47:02,000][60143] Updated weights for policy 0, policy_version 39632 (0.0008) +[2023-10-09 05:47:02,364][60143] Updated weights for policy 0, policy_version 39642 (0.0007) +[2023-10-09 05:47:04,667][60144] Updated weights for policy 1, policy_version 40102 (0.0008) +[2023-10-09 05:47:05,031][60144] Updated weights for policy 1, policy_version 40112 (0.0008) +[2023-10-09 05:47:05,395][60144] Updated weights for policy 1, policy_version 40122 (0.0010) +[2023-10-09 05:47:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 81690624. Throughput: 0: 1690.4, 1: 1742.8. Samples: 20425762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:06,052][59242] Avg episode reward: [(0, '32.180'), (1, '28.540')] +[2023-10-09 05:47:06,297][60143] Updated weights for policy 0, policy_version 39652 (0.0007) +[2023-10-09 05:47:06,675][60143] Updated weights for policy 0, policy_version 39662 (0.0007) +[2023-10-09 05:47:07,040][60143] Updated weights for policy 0, policy_version 39672 (0.0009) +[2023-10-09 05:47:09,321][60144] Updated weights for policy 1, policy_version 40132 (0.0009) +[2023-10-09 05:47:09,688][60144] Updated weights for policy 1, policy_version 40142 (0.0008) +[2023-10-09 05:47:10,054][60144] Updated weights for policy 1, policy_version 40152 (0.0007) +[2023-10-09 05:47:10,982][60143] Updated weights for policy 0, policy_version 39682 (0.0008) +[2023-10-09 05:47:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 81756160. Throughput: 0: 1714.6, 1: 1730.1. Samples: 20446682. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:11,053][59242] Avg episode reward: [(0, '31.840'), (1, '28.900')] +[2023-10-09 05:47:11,355][60143] Updated weights for policy 0, policy_version 39692 (0.0008) +[2023-10-09 05:47:11,727][60143] Updated weights for policy 0, policy_version 39702 (0.0009) +[2023-10-09 05:47:12,082][60143] Updated weights for policy 0, policy_version 39712 (0.0007) +[2023-10-09 05:47:13,954][60144] Updated weights for policy 1, policy_version 40162 (0.0009) +[2023-10-09 05:47:14,326][60144] Updated weights for policy 1, policy_version 40172 (0.0011) +[2023-10-09 05:47:14,690][60144] Updated weights for policy 1, policy_version 40182 (0.0007) +[2023-10-09 05:47:15,060][60144] Updated weights for policy 1, policy_version 40192 (0.0007) +[2023-10-09 05:47:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 81821696. Throughput: 0: 1711.7, 1: 1706.0. Samples: 20467016. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:16,053][59242] Avg episode reward: [(0, '31.260'), (1, '29.670')] +[2023-10-09 05:47:16,167][60143] Updated weights for policy 0, policy_version 39722 (0.0008) +[2023-10-09 05:47:16,533][60143] Updated weights for policy 0, policy_version 39732 (0.0010) +[2023-10-09 05:47:16,906][60143] Updated weights for policy 0, policy_version 39742 (0.0008) +[2023-10-09 05:47:18,997][60144] Updated weights for policy 1, policy_version 40202 (0.0007) +[2023-10-09 05:47:19,374][60144] Updated weights for policy 1, policy_version 40212 (0.0007) +[2023-10-09 05:47:19,739][60144] Updated weights for policy 1, policy_version 40222 (0.0007) +[2023-10-09 05:47:20,970][60143] Updated weights for policy 0, policy_version 39752 (0.0008) +[2023-10-09 05:47:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.5, 300 sec: 13662.6). Total num frames: 81887232. Throughput: 0: 1703.4, 1: 1739.3. Samples: 20477660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:21,052][59242] Avg episode reward: [(0, '30.830'), (1, '29.340')] +[2023-10-09 05:47:21,342][60143] Updated weights for policy 0, policy_version 39762 (0.0010) +[2023-10-09 05:47:21,708][60143] Updated weights for policy 0, policy_version 39772 (0.0010) +[2023-10-09 05:47:23,750][60144] Updated weights for policy 1, policy_version 40232 (0.0009) +[2023-10-09 05:47:24,120][60144] Updated weights for policy 1, policy_version 40242 (0.0009) +[2023-10-09 05:47:24,479][60144] Updated weights for policy 1, policy_version 40252 (0.0009) +[2023-10-09 05:47:25,741][60143] Updated weights for policy 0, policy_version 39782 (0.0008) +[2023-10-09 05:47:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 81952768. Throughput: 0: 1713.4, 1: 1709.7. Samples: 20497818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:26,053][59242] Avg episode reward: [(0, '31.140'), (1, '28.520')] +[2023-10-09 05:47:26,117][60143] Updated weights for policy 0, policy_version 39792 (0.0007) +[2023-10-09 05:47:26,487][60143] Updated weights for policy 0, policy_version 39802 (0.0008) +[2023-10-09 05:47:28,403][60144] Updated weights for policy 1, policy_version 40262 (0.0009) +[2023-10-09 05:47:28,778][60144] Updated weights for policy 1, policy_version 40272 (0.0009) +[2023-10-09 05:47:29,143][60144] Updated weights for policy 1, policy_version 40282 (0.0007) +[2023-10-09 05:47:30,583][60143] Updated weights for policy 0, policy_version 39812 (0.0009) +[2023-10-09 05:47:30,984][60143] Updated weights for policy 0, policy_version 39822 (0.0009) +[2023-10-09 05:47:31,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 82018304. Throughput: 0: 1714.2, 1: 1710.7. Samples: 20518768. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:31,053][59242] Avg episode reward: [(0, '32.170'), (1, '28.410')] +[2023-10-09 05:47:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000040288_41254912.pth... +[2023-10-09 05:47:31,091][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000038688_39616512.pth +[2023-10-09 05:47:31,095][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000040288_41254912.pth +[2023-10-09 05:47:31,351][60143] Updated weights for policy 0, policy_version 39832 (0.0007) +[2023-10-09 05:47:31,646][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000039840_40796160.pth... 
+[2023-10-09 05:47:31,679][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000038240_39157760.pth +[2023-10-09 05:47:31,683][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000039840_40796160.pth +[2023-10-09 05:47:33,054][60144] Updated weights for policy 1, policy_version 40292 (0.0008) +[2023-10-09 05:47:33,430][60144] Updated weights for policy 1, policy_version 40302 (0.0007) +[2023-10-09 05:47:33,791][60144] Updated weights for policy 1, policy_version 40312 (0.0009) +[2023-10-09 05:47:35,398][60143] Updated weights for policy 0, policy_version 39842 (0.0008) +[2023-10-09 05:47:35,760][60143] Updated weights for policy 0, policy_version 39852 (0.0011) +[2023-10-09 05:47:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 82083840. Throughput: 0: 1710.9, 1: 1721.4. Samples: 20528688. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:36,053][59242] Avg episode reward: [(0, '31.390'), (1, '28.780')] +[2023-10-09 05:47:36,127][60143] Updated weights for policy 0, policy_version 39862 (0.0010) +[2023-10-09 05:47:36,494][60143] Updated weights for policy 0, policy_version 39872 (0.0010) +[2023-10-09 05:47:37,792][60144] Updated weights for policy 1, policy_version 40322 (0.0008) +[2023-10-09 05:47:38,162][60144] Updated weights for policy 1, policy_version 40332 (0.0010) +[2023-10-09 05:47:38,532][60144] Updated weights for policy 1, policy_version 40342 (0.0009) +[2023-10-09 05:47:38,890][60144] Updated weights for policy 1, policy_version 40352 (0.0010) +[2023-10-09 05:47:40,491][60143] Updated weights for policy 0, policy_version 39882 (0.0008) +[2023-10-09 05:47:40,870][60143] Updated weights for policy 0, policy_version 39892 (0.0009) +[2023-10-09 05:47:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 82149376. Throughput: 0: 1717.3, 1: 1706.8. Samples: 20549302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:41,053][59242] Avg episode reward: [(0, '31.710'), (1, '28.650')] +[2023-10-09 05:47:41,240][60143] Updated weights for policy 0, policy_version 39902 (0.0009) +[2023-10-09 05:47:42,786][60144] Updated weights for policy 1, policy_version 40362 (0.0009) +[2023-10-09 05:47:43,157][60144] Updated weights for policy 1, policy_version 40372 (0.0008) +[2023-10-09 05:47:43,522][60144] Updated weights for policy 1, policy_version 40382 (0.0011) +[2023-10-09 05:47:45,215][60143] Updated weights for policy 0, policy_version 39912 (0.0009) +[2023-10-09 05:47:45,586][60143] Updated weights for policy 0, policy_version 39922 (0.0009) +[2023-10-09 05:47:45,952][60143] Updated weights for policy 0, policy_version 39932 (0.0008) +[2023-10-09 05:47:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 82214912. Throughput: 0: 1698.0, 1: 1735.2. Samples: 20569978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:46,053][59242] Avg episode reward: [(0, '29.880'), (1, '29.020')] +[2023-10-09 05:47:47,473][60144] Updated weights for policy 1, policy_version 40392 (0.0008) +[2023-10-09 05:47:47,836][60144] Updated weights for policy 1, policy_version 40402 (0.0009) +[2023-10-09 05:47:48,200][60144] Updated weights for policy 1, policy_version 40412 (0.0011) +[2023-10-09 05:47:49,799][60143] Updated weights for policy 0, policy_version 39942 (0.0010) +[2023-10-09 05:47:50,173][60143] Updated weights for policy 0, policy_version 39952 (0.0008) +[2023-10-09 05:47:50,540][60143] Updated weights for policy 0, policy_version 39962 (0.0009) +[2023-10-09 05:47:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 82313216. Throughput: 0: 1717.3, 1: 1710.8. Samples: 20580030. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:51,053][59242] Avg episode reward: [(0, '29.500'), (1, '28.060')] +[2023-10-09 05:47:52,244][60144] Updated weights for policy 1, policy_version 40422 (0.0009) +[2023-10-09 05:47:52,614][60144] Updated weights for policy 1, policy_version 40432 (0.0009) +[2023-10-09 05:47:52,978][60144] Updated weights for policy 1, policy_version 40442 (0.0007) +[2023-10-09 05:47:54,336][60143] Updated weights for policy 0, policy_version 39972 (0.0008) +[2023-10-09 05:47:54,703][60143] Updated weights for policy 0, policy_version 39982 (0.0010) +[2023-10-09 05:47:55,076][60143] Updated weights for policy 0, policy_version 39992 (0.0011) +[2023-10-09 05:47:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 82378752. Throughput: 0: 1711.9, 1: 1720.9. Samples: 20601158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:47:56,052][59242] Avg episode reward: [(0, '30.310'), (1, '27.750')] +[2023-10-09 05:47:56,761][60144] Updated weights for policy 1, policy_version 40452 (0.0007) +[2023-10-09 05:47:57,139][60144] Updated weights for policy 1, policy_version 40462 (0.0007) +[2023-10-09 05:47:57,500][60144] Updated weights for policy 1, policy_version 40472 (0.0009) +[2023-10-09 05:47:58,867][60143] Updated weights for policy 0, policy_version 40002 (0.0008) +[2023-10-09 05:47:59,239][60143] Updated weights for policy 0, policy_version 40012 (0.0008) +[2023-10-09 05:47:59,607][60143] Updated weights for policy 0, policy_version 40022 (0.0008) +[2023-10-09 05:47:59,979][60143] Updated weights for policy 0, policy_version 40032 (0.0009) +[2023-10-09 05:48:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 82444288. Throughput: 0: 1691.4, 1: 1744.2. Samples: 20621618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:48:01,053][59242] Avg episode reward: [(0, '30.670'), (1, '29.660')] +[2023-10-09 05:48:01,487][60144] Updated weights for policy 1, policy_version 40482 (0.0009) +[2023-10-09 05:48:01,855][60144] Updated weights for policy 1, policy_version 40492 (0.0008) +[2023-10-09 05:48:02,220][60144] Updated weights for policy 1, policy_version 40502 (0.0009) +[2023-10-09 05:48:02,582][60144] Updated weights for policy 1, policy_version 40512 (0.0008) +[2023-10-09 05:48:04,016][60143] Updated weights for policy 0, policy_version 40042 (0.0011) +[2023-10-09 05:48:04,383][60143] Updated weights for policy 0, policy_version 40052 (0.0010) +[2023-10-09 05:48:04,751][60143] Updated weights for policy 0, policy_version 40062 (0.0010) +[2023-10-09 05:48:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 82509824. Throughput: 0: 1719.6, 1: 1714.3. Samples: 20632188. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:48:06,053][59242] Avg episode reward: [(0, '32.170'), (1, '29.410')] +[2023-10-09 05:48:06,343][60144] Updated weights for policy 1, policy_version 40522 (0.0007) +[2023-10-09 05:48:06,711][60144] Updated weights for policy 1, policy_version 40532 (0.0009) +[2023-10-09 05:48:07,086][60144] Updated weights for policy 1, policy_version 40542 (0.0009) +[2023-10-09 05:48:08,875][60143] Updated weights for policy 0, policy_version 40072 (0.0008) +[2023-10-09 05:48:09,248][60143] Updated weights for policy 0, policy_version 40082 (0.0008) +[2023-10-09 05:48:09,617][60143] Updated weights for policy 0, policy_version 40092 (0.0008) +[2023-10-09 05:48:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 82575360. Throughput: 0: 1693.3, 1: 1743.3. Samples: 20652464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:48:11,052][59242] Avg episode reward: [(0, '31.280'), (1, '27.540')] +[2023-10-09 05:48:11,073][60144] Updated weights for policy 1, policy_version 40552 (0.0010) +[2023-10-09 05:48:11,439][60144] Updated weights for policy 1, policy_version 40562 (0.0010) +[2023-10-09 05:48:11,799][60144] Updated weights for policy 1, policy_version 40572 (0.0009) +[2023-10-09 05:48:13,540][60143] Updated weights for policy 0, policy_version 40102 (0.0008) +[2023-10-09 05:48:13,912][60143] Updated weights for policy 0, policy_version 40112 (0.0010) +[2023-10-09 05:48:14,289][60143] Updated weights for policy 0, policy_version 40122 (0.0008) +[2023-10-09 05:48:15,763][60144] Updated weights for policy 1, policy_version 40582 (0.0008) +[2023-10-09 05:48:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 82640896. Throughput: 0: 1690.5, 1: 1743.1. Samples: 20673282. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:48:16,052][59242] Avg episode reward: [(0, '31.580'), (1, '28.980')] +[2023-10-09 05:48:16,135][60144] Updated weights for policy 1, policy_version 40592 (0.0007) +[2023-10-09 05:48:16,510][60144] Updated weights for policy 1, policy_version 40602 (0.0009) +[2023-10-09 05:48:18,715][60143] Updated weights for policy 0, policy_version 40132 (0.0009) +[2023-10-09 05:48:19,097][60143] Updated weights for policy 0, policy_version 40142 (0.0008) +[2023-10-09 05:48:19,466][60143] Updated weights for policy 0, policy_version 40152 (0.0009) +[2023-10-09 05:48:20,320][60144] Updated weights for policy 1, policy_version 40612 (0.0009) +[2023-10-09 05:48:20,685][60144] Updated weights for policy 1, policy_version 40622 (0.0009) +[2023-10-09 05:48:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 82706432. Throughput: 0: 1719.5, 1: 1730.9. Samples: 20683954. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:48:21,052][59242] Avg episode reward: [(0, '30.210'), (1, '28.930')] +[2023-10-09 05:48:21,063][60144] Updated weights for policy 1, policy_version 40632 (0.0007) +[2023-10-09 05:48:23,393][60143] Updated weights for policy 0, policy_version 40162 (0.0007) +[2023-10-09 05:48:23,773][60143] Updated weights for policy 0, policy_version 40172 (0.0010) +[2023-10-09 05:48:24,139][60143] Updated weights for policy 0, policy_version 40182 (0.0007) +[2023-10-09 05:48:24,504][60143] Updated weights for policy 0, policy_version 40192 (0.0007) +[2023-10-09 05:48:25,077][60144] Updated weights for policy 1, policy_version 40642 (0.0008) +[2023-10-09 05:48:25,440][60144] Updated weights for policy 1, policy_version 40652 (0.0011) +[2023-10-09 05:48:25,808][60144] Updated weights for policy 1, policy_version 40662 (0.0009) +[2023-10-09 05:48:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 82771968. Throughput: 0: 1688.5, 1: 1750.9. Samples: 20704076. Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:26,053][59242] Avg episode reward: [(0, '29.600'), (1, '28.770')] +[2023-10-09 05:48:26,172][60144] Updated weights for policy 1, policy_version 40672 (0.0007) +[2023-10-09 05:48:28,371][60143] Updated weights for policy 0, policy_version 40202 (0.0007) +[2023-10-09 05:48:28,737][60143] Updated weights for policy 0, policy_version 40212 (0.0009) +[2023-10-09 05:48:29,105][60143] Updated weights for policy 0, policy_version 40222 (0.0008) +[2023-10-09 05:48:30,106][60144] Updated weights for policy 1, policy_version 40682 (0.0008) +[2023-10-09 05:48:30,475][60144] Updated weights for policy 1, policy_version 40692 (0.0007) +[2023-10-09 05:48:30,843][60144] Updated weights for policy 1, policy_version 40702 (0.0008) +[2023-10-09 05:48:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.6, 300 sec: 13773.7). Total num frames: 82870272. 
Throughput: 0: 1705.1, 1: 1722.6. Samples: 20724222. Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:31,052][59242] Avg episode reward: [(0, '28.610'), (1, '29.750')] +[2023-10-09 05:48:33,037][60143] Updated weights for policy 0, policy_version 40232 (0.0007) +[2023-10-09 05:48:33,413][60143] Updated weights for policy 0, policy_version 40242 (0.0007) +[2023-10-09 05:48:33,783][60143] Updated weights for policy 0, policy_version 40252 (0.0008) +[2023-10-09 05:48:34,698][60144] Updated weights for policy 1, policy_version 40712 (0.0010) +[2023-10-09 05:48:35,070][60144] Updated weights for policy 1, policy_version 40722 (0.0009) +[2023-10-09 05:48:35,438][60144] Updated weights for policy 1, policy_version 40732 (0.0008) +[2023-10-09 05:48:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 82935808. Throughput: 0: 1699.8, 1: 1746.6. Samples: 20735118. Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:36,053][59242] Avg episode reward: [(0, '29.410'), (1, '28.890')] +[2023-10-09 05:48:37,864][60143] Updated weights for policy 0, policy_version 40262 (0.0009) +[2023-10-09 05:48:38,229][60143] Updated weights for policy 0, policy_version 40272 (0.0007) +[2023-10-09 05:48:38,607][60143] Updated weights for policy 0, policy_version 40282 (0.0010) +[2023-10-09 05:48:39,262][60144] Updated weights for policy 1, policy_version 40742 (0.0008) +[2023-10-09 05:48:39,627][60144] Updated weights for policy 1, policy_version 40752 (0.0008) +[2023-10-09 05:48:40,003][60144] Updated weights for policy 1, policy_version 40762 (0.0008) +[2023-10-09 05:48:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 83001344. Throughput: 0: 1692.5, 1: 1733.1. Samples: 20755310. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:41,053][59242] Avg episode reward: [(0, '30.200'), (1, '29.710')] +[2023-10-09 05:48:42,562][60143] Updated weights for policy 0, policy_version 40292 (0.0008) +[2023-10-09 05:48:42,935][60143] Updated weights for policy 0, policy_version 40302 (0.0011) +[2023-10-09 05:48:43,302][60143] Updated weights for policy 0, policy_version 40312 (0.0010) +[2023-10-09 05:48:43,869][60144] Updated weights for policy 1, policy_version 40772 (0.0009) +[2023-10-09 05:48:44,230][60144] Updated weights for policy 1, policy_version 40782 (0.0008) +[2023-10-09 05:48:44,603][60144] Updated weights for policy 1, policy_version 40792 (0.0007) +[2023-10-09 05:48:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 83066880. Throughput: 0: 1710.1, 1: 1718.7. Samples: 20775916. Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:46,053][59242] Avg episode reward: [(0, '29.770'), (1, '30.990')] +[2023-10-09 05:48:47,380][60143] Updated weights for policy 0, policy_version 40322 (0.0008) +[2023-10-09 05:48:47,753][60143] Updated weights for policy 0, policy_version 40332 (0.0007) +[2023-10-09 05:48:48,119][60143] Updated weights for policy 0, policy_version 40342 (0.0007) +[2023-10-09 05:48:48,439][60144] Updated weights for policy 1, policy_version 40802 (0.0007) +[2023-10-09 05:48:48,484][60143] Updated weights for policy 0, policy_version 40352 (0.0007) +[2023-10-09 05:48:48,802][60144] Updated weights for policy 1, policy_version 40812 (0.0008) +[2023-10-09 05:48:49,173][60144] Updated weights for policy 1, policy_version 40822 (0.0010) +[2023-10-09 05:48:49,539][60144] Updated weights for policy 1, policy_version 40832 (0.0009) +[2023-10-09 05:48:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 83132416. Throughput: 0: 1681.0, 1: 1744.5. Samples: 20786336. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 55.0) +[2023-10-09 05:48:51,053][59242] Avg episode reward: [(0, '30.010'), (1, '31.210')] +[2023-10-09 05:48:52,388][60143] Updated weights for policy 0, policy_version 40362 (0.0010) +[2023-10-09 05:48:52,763][60143] Updated weights for policy 0, policy_version 40372 (0.0010) +[2023-10-09 05:48:53,133][60143] Updated weights for policy 0, policy_version 40382 (0.0007) +[2023-10-09 05:48:53,574][60144] Updated weights for policy 1, policy_version 40842 (0.0007) +[2023-10-09 05:48:53,941][60144] Updated weights for policy 1, policy_version 40852 (0.0008) +[2023-10-09 05:48:54,311][60144] Updated weights for policy 1, policy_version 40862 (0.0007) +[2023-10-09 05:48:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83197952. Throughput: 0: 1706.2, 1: 1717.8. Samples: 20806542. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:48:56,053][59242] Avg episode reward: [(0, '29.730'), (1, '31.550')] +[2023-10-09 05:48:56,055][60003] Saving new best policy, reward=31.550! +[2023-10-09 05:48:57,175][60143] Updated weights for policy 0, policy_version 40392 (0.0008) +[2023-10-09 05:48:57,541][60143] Updated weights for policy 0, policy_version 40402 (0.0011) +[2023-10-09 05:48:57,916][60143] Updated weights for policy 0, policy_version 40412 (0.0009) +[2023-10-09 05:48:58,138][60144] Updated weights for policy 1, policy_version 40872 (0.0009) +[2023-10-09 05:48:58,501][60144] Updated weights for policy 1, policy_version 40882 (0.0008) +[2023-10-09 05:48:58,865][60144] Updated weights for policy 1, policy_version 40892 (0.0008) +[2023-10-09 05:49:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83263488. Throughput: 0: 1708.1, 1: 1722.5. Samples: 20827662. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:49:01,053][59242] Avg episode reward: [(0, '30.160'), (1, '30.450')] +[2023-10-09 05:49:01,967][60143] Updated weights for policy 0, policy_version 40422 (0.0009) +[2023-10-09 05:49:02,339][60143] Updated weights for policy 0, policy_version 40432 (0.0008) +[2023-10-09 05:49:02,715][60143] Updated weights for policy 0, policy_version 40442 (0.0009) +[2023-10-09 05:49:02,916][60144] Updated weights for policy 1, policy_version 40902 (0.0008) +[2023-10-09 05:49:03,272][60144] Updated weights for policy 1, policy_version 40912 (0.0010) +[2023-10-09 05:49:03,641][60144] Updated weights for policy 1, policy_version 40922 (0.0008) +[2023-10-09 05:49:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83329024. Throughput: 0: 1676.1, 1: 1729.3. Samples: 20837196. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:49:06,053][59242] Avg episode reward: [(0, '29.220'), (1, '30.240')] +[2023-10-09 05:49:06,595][60143] Updated weights for policy 0, policy_version 40452 (0.0009) +[2023-10-09 05:49:06,967][60143] Updated weights for policy 0, policy_version 40462 (0.0008) +[2023-10-09 05:49:07,344][60143] Updated weights for policy 0, policy_version 40472 (0.0007) +[2023-10-09 05:49:07,598][60144] Updated weights for policy 1, policy_version 40932 (0.0009) +[2023-10-09 05:49:07,961][60144] Updated weights for policy 1, policy_version 40942 (0.0007) +[2023-10-09 05:49:08,327][60144] Updated weights for policy 1, policy_version 40952 (0.0011) +[2023-10-09 05:49:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83394560. Throughput: 0: 1709.1, 1: 1714.5. Samples: 20858140. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:49:11,052][59242] Avg episode reward: [(0, '28.960'), (1, '30.560')] +[2023-10-09 05:49:11,431][60143] Updated weights for policy 0, policy_version 40482 (0.0009) +[2023-10-09 05:49:11,858][60143] Updated weights for policy 0, policy_version 40492 (0.0007) +[2023-10-09 05:49:12,237][60143] Updated weights for policy 0, policy_version 40502 (0.0008) +[2023-10-09 05:49:12,279][60144] Updated weights for policy 1, policy_version 40962 (0.0010) +[2023-10-09 05:49:12,609][60143] Updated weights for policy 0, policy_version 40512 (0.0009) +[2023-10-09 05:49:12,659][60144] Updated weights for policy 1, policy_version 40972 (0.0007) +[2023-10-09 05:49:13,024][60144] Updated weights for policy 1, policy_version 40982 (0.0007) +[2023-10-09 05:49:13,394][60144] Updated weights for policy 1, policy_version 40992 (0.0007) +[2023-10-09 05:49:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83460096. Throughput: 0: 1706.8, 1: 1738.3. Samples: 20879248. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:49:16,053][59242] Avg episode reward: [(0, '28.220'), (1, '30.870')] +[2023-10-09 05:49:16,394][60143] Updated weights for policy 0, policy_version 40522 (0.0008) +[2023-10-09 05:49:16,766][60143] Updated weights for policy 0, policy_version 40532 (0.0009) +[2023-10-09 05:49:17,145][60143] Updated weights for policy 0, policy_version 40542 (0.0008) +[2023-10-09 05:49:17,334][60144] Updated weights for policy 1, policy_version 41002 (0.0009) +[2023-10-09 05:49:17,703][60144] Updated weights for policy 1, policy_version 41012 (0.0008) +[2023-10-09 05:49:18,061][60144] Updated weights for policy 1, policy_version 41022 (0.0007) +[2023-10-09 05:49:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83525632. Throughput: 0: 1694.5, 1: 1717.2. Samples: 20888644. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 05:49:21,053][59242] Avg episode reward: [(0, '27.780'), (1, '31.450')] +[2023-10-09 05:49:21,057][60143] Updated weights for policy 0, policy_version 40552 (0.0009) +[2023-10-09 05:49:21,432][60143] Updated weights for policy 0, policy_version 40562 (0.0009) +[2023-10-09 05:49:21,794][60143] Updated weights for policy 0, policy_version 40572 (0.0009) +[2023-10-09 05:49:22,110][60144] Updated weights for policy 1, policy_version 41032 (0.0007) +[2023-10-09 05:49:22,471][60144] Updated weights for policy 1, policy_version 41042 (0.0009) +[2023-10-09 05:49:22,846][60144] Updated weights for policy 1, policy_version 41052 (0.0007) +[2023-10-09 05:49:25,732][60143] Updated weights for policy 0, policy_version 40582 (0.0008) +[2023-10-09 05:49:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 83591168. Throughput: 0: 1710.4, 1: 1727.0. Samples: 20909996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:49:26,054][59242] Avg episode reward: [(0, '26.430'), (1, '32.170')] +[2023-10-09 05:49:26,055][60003] Saving new best policy, reward=32.170! +[2023-10-09 05:49:26,096][60143] Updated weights for policy 0, policy_version 40592 (0.0008) +[2023-10-09 05:49:26,474][60143] Updated weights for policy 0, policy_version 40602 (0.0008) +[2023-10-09 05:49:26,789][60144] Updated weights for policy 1, policy_version 41062 (0.0008) +[2023-10-09 05:49:27,146][60144] Updated weights for policy 1, policy_version 41072 (0.0009) +[2023-10-09 05:49:27,531][60144] Updated weights for policy 1, policy_version 41082 (0.0008) +[2023-10-09 05:49:30,440][60143] Updated weights for policy 0, policy_version 40612 (0.0009) +[2023-10-09 05:49:30,810][60143] Updated weights for policy 0, policy_version 40622 (0.0010) +[2023-10-09 05:49:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 83656704. Throughput: 0: 1708.0, 1: 1745.0. 
Samples: 20931300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:49:31,053][59242] Avg episode reward: [(0, '26.050'), (1, '31.560')] +[2023-10-09 05:49:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000041088_42074112.pth... +[2023-10-09 05:49:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000039488_40435712.pth +[2023-10-09 05:49:31,183][60143] Updated weights for policy 0, policy_version 40632 (0.0007) +[2023-10-09 05:49:31,409][60144] Updated weights for policy 1, policy_version 41092 (0.0009) +[2023-10-09 05:49:31,474][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000040640_41615360.pth... +[2023-10-09 05:49:31,507][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000039040_39976960.pth +[2023-10-09 05:49:31,771][60144] Updated weights for policy 1, policy_version 41102 (0.0009) +[2023-10-09 05:49:32,147][60144] Updated weights for policy 1, policy_version 41112 (0.0009) +[2023-10-09 05:49:35,100][60143] Updated weights for policy 0, policy_version 40642 (0.0007) +[2023-10-09 05:49:35,468][60143] Updated weights for policy 0, policy_version 40652 (0.0009) +[2023-10-09 05:49:35,846][60143] Updated weights for policy 0, policy_version 40662 (0.0009) +[2023-10-09 05:49:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 83722240. Throughput: 0: 1716.7, 1: 1716.0. Samples: 20940808. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:49:36,053][59242] Avg episode reward: [(0, '26.550'), (1, '31.330')] +[2023-10-09 05:49:36,143][60144] Updated weights for policy 1, policy_version 41122 (0.0008) +[2023-10-09 05:49:36,207][60143] Updated weights for policy 0, policy_version 40672 (0.0009) +[2023-10-09 05:49:36,515][60144] Updated weights for policy 1, policy_version 41132 (0.0010) +[2023-10-09 05:49:36,882][60144] Updated weights for policy 1, policy_version 41142 (0.0008) +[2023-10-09 05:49:37,260][60144] Updated weights for policy 1, policy_version 41152 (0.0010) +[2023-10-09 05:49:40,338][60143] Updated weights for policy 0, policy_version 40682 (0.0008) +[2023-10-09 05:49:40,706][60143] Updated weights for policy 0, policy_version 40692 (0.0008) +[2023-10-09 05:49:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 83787776. Throughput: 0: 1721.5, 1: 1738.0. Samples: 20962218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:49:41,052][59242] Avg episode reward: [(0, '26.550'), (1, '30.640')] +[2023-10-09 05:49:41,072][60143] Updated weights for policy 0, policy_version 40702 (0.0007) +[2023-10-09 05:49:41,219][60144] Updated weights for policy 1, policy_version 41162 (0.0007) +[2023-10-09 05:49:41,595][60144] Updated weights for policy 1, policy_version 41172 (0.0010) +[2023-10-09 05:49:41,978][60144] Updated weights for policy 1, policy_version 41182 (0.0011) +[2023-10-09 05:49:44,996][60143] Updated weights for policy 0, policy_version 40712 (0.0007) +[2023-10-09 05:49:45,378][60143] Updated weights for policy 0, policy_version 40722 (0.0007) +[2023-10-09 05:49:45,741][60143] Updated weights for policy 0, policy_version 40732 (0.0008) +[2023-10-09 05:49:45,757][60144] Updated weights for policy 1, policy_version 41192 (0.0008) +[2023-10-09 05:49:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 83886080. 
Throughput: 0: 1708.9, 1: 1731.9. Samples: 20982500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:49:46,053][59242] Avg episode reward: [(0, '27.840'), (1, '31.780')] +[2023-10-09 05:49:46,125][60144] Updated weights for policy 1, policy_version 41202 (0.0010) +[2023-10-09 05:49:46,491][60144] Updated weights for policy 1, policy_version 41212 (0.0008) +[2023-10-09 05:49:49,757][60143] Updated weights for policy 0, policy_version 40742 (0.0008) +[2023-10-09 05:49:50,124][60143] Updated weights for policy 0, policy_version 40752 (0.0007) +[2023-10-09 05:49:50,492][60143] Updated weights for policy 0, policy_version 40762 (0.0007) +[2023-10-09 05:49:50,619][60144] Updated weights for policy 1, policy_version 41222 (0.0008) +[2023-10-09 05:49:50,977][60144] Updated weights for policy 1, policy_version 41232 (0.0008) +[2023-10-09 05:49:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 83951616. Throughput: 0: 1732.9, 1: 1726.1. Samples: 20992846. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:49:51,052][59242] Avg episode reward: [(0, '28.380'), (1, '31.780')] +[2023-10-09 05:49:51,351][60144] Updated weights for policy 1, policy_version 41242 (0.0009) +[2023-10-09 05:49:54,417][60143] Updated weights for policy 0, policy_version 40772 (0.0008) +[2023-10-09 05:49:54,786][60143] Updated weights for policy 0, policy_version 40782 (0.0007) +[2023-10-09 05:49:55,162][60143] Updated weights for policy 0, policy_version 40792 (0.0007) +[2023-10-09 05:49:55,203][60144] Updated weights for policy 1, policy_version 41252 (0.0009) +[2023-10-09 05:49:55,563][60144] Updated weights for policy 1, policy_version 41262 (0.0009) +[2023-10-09 05:49:55,925][60144] Updated weights for policy 1, policy_version 41272 (0.0010) +[2023-10-09 05:49:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 84017152. Throughput: 0: 1720.7, 1: 1736.4. Samples: 21013708. 
Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:49:56,053][59242] Avg episode reward: [(0, '28.620'), (1, '32.250')] +[2023-10-09 05:49:56,218][60003] Saving new best policy, reward=32.250! +[2023-10-09 05:49:59,192][60143] Updated weights for policy 0, policy_version 40802 (0.0008) +[2023-10-09 05:49:59,606][60143] Updated weights for policy 0, policy_version 40812 (0.0012) +[2023-10-09 05:49:59,972][60143] Updated weights for policy 0, policy_version 40822 (0.0010) +[2023-10-09 05:50:00,068][60144] Updated weights for policy 1, policy_version 41282 (0.0008) +[2023-10-09 05:50:00,340][60143] Updated weights for policy 0, policy_version 40832 (0.0008) +[2023-10-09 05:50:00,433][60144] Updated weights for policy 1, policy_version 41292 (0.0007) +[2023-10-09 05:50:00,803][60144] Updated weights for policy 1, policy_version 41302 (0.0007) +[2023-10-09 05:50:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 84082688. Throughput: 0: 1693.6, 1: 1722.7. Samples: 21032984. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:50:01,053][59242] Avg episode reward: [(0, '27.920'), (1, '31.130')] +[2023-10-09 05:50:01,167][60144] Updated weights for policy 1, policy_version 41312 (0.0009) +[2023-10-09 05:50:04,238][60143] Updated weights for policy 0, policy_version 40842 (0.0008) +[2023-10-09 05:50:04,611][60143] Updated weights for policy 0, policy_version 40852 (0.0009) +[2023-10-09 05:50:04,976][60143] Updated weights for policy 0, policy_version 40862 (0.0008) +[2023-10-09 05:50:05,279][60144] Updated weights for policy 1, policy_version 41322 (0.0007) +[2023-10-09 05:50:05,640][60144] Updated weights for policy 1, policy_version 41332 (0.0010) +[2023-10-09 05:50:06,018][60144] Updated weights for policy 1, policy_version 41342 (0.0010) +[2023-10-09 05:50:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 84148224. Throughput: 0: 1724.7, 1: 1731.0. 
Samples: 21044148. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:50:06,052][59242] Avg episode reward: [(0, '27.980'), (1, '30.780')] +[2023-10-09 05:50:09,106][60143] Updated weights for policy 0, policy_version 40872 (0.0008) +[2023-10-09 05:50:09,464][60143] Updated weights for policy 0, policy_version 40882 (0.0009) +[2023-10-09 05:50:09,837][60143] Updated weights for policy 0, policy_version 40892 (0.0007) +[2023-10-09 05:50:09,955][60144] Updated weights for policy 1, policy_version 41352 (0.0007) +[2023-10-09 05:50:10,315][60144] Updated weights for policy 1, policy_version 41362 (0.0008) +[2023-10-09 05:50:10,686][60144] Updated weights for policy 1, policy_version 41372 (0.0009) +[2023-10-09 05:50:11,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 84246528. Throughput: 0: 1702.9, 1: 1731.0. Samples: 21064522. Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:50:11,052][59242] Avg episode reward: [(0, '29.250'), (1, '28.800')] +[2023-10-09 05:50:13,768][60143] Updated weights for policy 0, policy_version 40902 (0.0008) +[2023-10-09 05:50:14,128][60143] Updated weights for policy 0, policy_version 40912 (0.0009) +[2023-10-09 05:50:14,484][60144] Updated weights for policy 1, policy_version 41382 (0.0008) +[2023-10-09 05:50:14,496][60143] Updated weights for policy 0, policy_version 40922 (0.0008) +[2023-10-09 05:50:14,847][60144] Updated weights for policy 1, policy_version 41392 (0.0007) +[2023-10-09 05:50:15,221][60144] Updated weights for policy 1, policy_version 41402 (0.0009) +[2023-10-09 05:50:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 84312064. Throughput: 0: 1693.7, 1: 1694.6. Samples: 21083774. 
Policy #0 lag: (min: 31.0, avg: 33.5, max: 63.0) +[2023-10-09 05:50:16,053][59242] Avg episode reward: [(0, '27.300'), (1, '30.450')] +[2023-10-09 05:50:18,447][60143] Updated weights for policy 0, policy_version 40932 (0.0008) +[2023-10-09 05:50:18,831][60143] Updated weights for policy 0, policy_version 40942 (0.0008) +[2023-10-09 05:50:19,192][60143] Updated weights for policy 0, policy_version 40952 (0.0010) +[2023-10-09 05:50:19,263][60144] Updated weights for policy 1, policy_version 41412 (0.0011) +[2023-10-09 05:50:19,627][60144] Updated weights for policy 1, policy_version 41422 (0.0008) +[2023-10-09 05:50:19,987][60144] Updated weights for policy 1, policy_version 41432 (0.0010) +[2023-10-09 05:50:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 84377600. Throughput: 0: 1712.2, 1: 1721.8. Samples: 21095338. Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:21,052][59242] Avg episode reward: [(0, '27.340'), (1, '30.430')] +[2023-10-09 05:50:23,142][60143] Updated weights for policy 0, policy_version 40962 (0.0007) +[2023-10-09 05:50:23,508][60143] Updated weights for policy 0, policy_version 40972 (0.0008) +[2023-10-09 05:50:23,879][60143] Updated weights for policy 0, policy_version 40982 (0.0007) +[2023-10-09 05:50:23,942][60144] Updated weights for policy 1, policy_version 41442 (0.0009) +[2023-10-09 05:50:24,244][60143] Updated weights for policy 0, policy_version 40992 (0.0007) +[2023-10-09 05:50:24,307][60144] Updated weights for policy 1, policy_version 41452 (0.0007) +[2023-10-09 05:50:24,684][60144] Updated weights for policy 1, policy_version 41462 (0.0007) +[2023-10-09 05:50:25,044][60144] Updated weights for policy 1, policy_version 41472 (0.0007) +[2023-10-09 05:50:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 84443136. Throughput: 0: 1680.7, 1: 1708.7. Samples: 21114740. 
Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:26,053][59242] Avg episode reward: [(0, '28.920'), (1, '31.290')] +[2023-10-09 05:50:28,279][60143] Updated weights for policy 0, policy_version 41002 (0.0008) +[2023-10-09 05:50:28,653][60143] Updated weights for policy 0, policy_version 41012 (0.0010) +[2023-10-09 05:50:29,018][60143] Updated weights for policy 0, policy_version 41022 (0.0009) +[2023-10-09 05:50:29,091][60144] Updated weights for policy 1, policy_version 41482 (0.0009) +[2023-10-09 05:50:29,477][60144] Updated weights for policy 1, policy_version 41492 (0.0008) +[2023-10-09 05:50:29,848][60144] Updated weights for policy 1, policy_version 41502 (0.0009) +[2023-10-09 05:50:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 84508672. Throughput: 0: 1697.0, 1: 1698.8. Samples: 21135312. Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:31,053][59242] Avg episode reward: [(0, '30.040'), (1, '32.640')] +[2023-10-09 05:50:31,065][60003] Saving new best policy, reward=32.640! +[2023-10-09 05:50:33,116][60143] Updated weights for policy 0, policy_version 41032 (0.0008) +[2023-10-09 05:50:33,482][60143] Updated weights for policy 0, policy_version 41042 (0.0007) +[2023-10-09 05:50:33,720][60144] Updated weights for policy 1, policy_version 41512 (0.0008) +[2023-10-09 05:50:33,856][60143] Updated weights for policy 0, policy_version 41052 (0.0009) +[2023-10-09 05:50:34,102][60144] Updated weights for policy 1, policy_version 41522 (0.0009) +[2023-10-09 05:50:34,471][60144] Updated weights for policy 1, policy_version 41532 (0.0007) +[2023-10-09 05:50:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 84574208. Throughput: 0: 1687.0, 1: 1722.7. Samples: 21146280. 
Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:36,053][59242] Avg episode reward: [(0, '29.600'), (1, '32.810')] +[2023-10-09 05:50:36,054][60003] Saving new best policy, reward=32.810! +[2023-10-09 05:50:37,767][60143] Updated weights for policy 0, policy_version 41062 (0.0007) +[2023-10-09 05:50:38,140][60143] Updated weights for policy 0, policy_version 41072 (0.0008) +[2023-10-09 05:50:38,410][60144] Updated weights for policy 1, policy_version 41542 (0.0007) +[2023-10-09 05:50:38,513][60143] Updated weights for policy 0, policy_version 41082 (0.0010) +[2023-10-09 05:50:38,772][60144] Updated weights for policy 1, policy_version 41552 (0.0007) +[2023-10-09 05:50:39,146][60144] Updated weights for policy 1, policy_version 41562 (0.0009) +[2023-10-09 05:50:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 84639744. Throughput: 0: 1684.5, 1: 1698.8. Samples: 21165960. Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:41,053][59242] Avg episode reward: [(0, '30.340'), (1, '32.090')] +[2023-10-09 05:50:42,508][60143] Updated weights for policy 0, policy_version 41092 (0.0010) +[2023-10-09 05:50:42,865][60143] Updated weights for policy 0, policy_version 41102 (0.0010) +[2023-10-09 05:50:43,020][60144] Updated weights for policy 1, policy_version 41572 (0.0007) +[2023-10-09 05:50:43,232][60143] Updated weights for policy 0, policy_version 41112 (0.0009) +[2023-10-09 05:50:43,395][60144] Updated weights for policy 1, policy_version 41582 (0.0009) +[2023-10-09 05:50:43,757][60144] Updated weights for policy 1, policy_version 41592 (0.0008) +[2023-10-09 05:50:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 84705280. Throughput: 0: 1714.7, 1: 1711.3. Samples: 21187152. 
Policy #0 lag: (min: 18.0, avg: 20.1, max: 49.0) +[2023-10-09 05:50:46,052][59242] Avg episode reward: [(0, '30.360'), (1, '31.490')] +[2023-10-09 05:50:47,291][60143] Updated weights for policy 0, policy_version 41122 (0.0008) +[2023-10-09 05:50:47,639][60144] Updated weights for policy 1, policy_version 41602 (0.0008) +[2023-10-09 05:50:47,699][60143] Updated weights for policy 0, policy_version 41132 (0.0009) +[2023-10-09 05:50:48,008][60144] Updated weights for policy 1, policy_version 41612 (0.0008) +[2023-10-09 05:50:48,078][60143] Updated weights for policy 0, policy_version 41142 (0.0008) +[2023-10-09 05:50:48,368][60144] Updated weights for policy 1, policy_version 41622 (0.0008) +[2023-10-09 05:50:48,449][60143] Updated weights for policy 0, policy_version 41152 (0.0009) +[2023-10-09 05:50:48,737][60144] Updated weights for policy 1, policy_version 41632 (0.0007) +[2023-10-09 05:50:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 84770816. Throughput: 0: 1680.5, 1: 1705.5. Samples: 21196518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:50:51,053][59242] Avg episode reward: [(0, '30.660'), (1, '30.150')] +[2023-10-09 05:50:52,454][60143] Updated weights for policy 0, policy_version 41162 (0.0009) +[2023-10-09 05:50:52,651][60144] Updated weights for policy 1, policy_version 41642 (0.0008) +[2023-10-09 05:50:52,819][60143] Updated weights for policy 0, policy_version 41172 (0.0007) +[2023-10-09 05:50:53,008][60144] Updated weights for policy 1, policy_version 41652 (0.0008) +[2023-10-09 05:50:53,179][60143] Updated weights for policy 0, policy_version 41182 (0.0009) +[2023-10-09 05:50:53,370][60144] Updated weights for policy 1, policy_version 41662 (0.0008) +[2023-10-09 05:50:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 84836352. Throughput: 0: 1695.3, 1: 1703.0. Samples: 21217448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:50:56,052][59242] Avg episode reward: [(0, '30.920'), (1, '30.330')] +[2023-10-09 05:50:57,273][60143] Updated weights for policy 0, policy_version 41192 (0.0008) +[2023-10-09 05:50:57,361][60144] Updated weights for policy 1, policy_version 41672 (0.0008) +[2023-10-09 05:50:57,643][60143] Updated weights for policy 0, policy_version 41202 (0.0007) +[2023-10-09 05:50:57,716][60144] Updated weights for policy 1, policy_version 41682 (0.0009) +[2023-10-09 05:50:58,018][60143] Updated weights for policy 0, policy_version 41212 (0.0007) +[2023-10-09 05:50:58,079][60144] Updated weights for policy 1, policy_version 41692 (0.0009) +[2023-10-09 05:51:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 84901888. Throughput: 0: 1705.3, 1: 1733.8. Samples: 21238532. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:51:01,052][59242] Avg episode reward: [(0, '30.870'), (1, '30.530')] +[2023-10-09 05:51:01,987][60144] Updated weights for policy 1, policy_version 41702 (0.0009) +[2023-10-09 05:51:02,142][60143] Updated weights for policy 0, policy_version 41222 (0.0009) +[2023-10-09 05:51:02,355][60144] Updated weights for policy 1, policy_version 41712 (0.0010) +[2023-10-09 05:51:02,505][60143] Updated weights for policy 0, policy_version 41232 (0.0009) +[2023-10-09 05:51:02,723][60144] Updated weights for policy 1, policy_version 41722 (0.0008) +[2023-10-09 05:51:02,883][60143] Updated weights for policy 0, policy_version 41242 (0.0009) +[2023-10-09 05:51:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 84967424. Throughput: 0: 1680.3, 1: 1707.5. Samples: 21247786. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:51:06,053][59242] Avg episode reward: [(0, '29.860'), (1, '30.350')] +[2023-10-09 05:51:06,803][60144] Updated weights for policy 1, policy_version 41732 (0.0008) +[2023-10-09 05:51:06,832][60143] Updated weights for policy 0, policy_version 41252 (0.0009) +[2023-10-09 05:51:07,171][60144] Updated weights for policy 1, policy_version 41742 (0.0008) +[2023-10-09 05:51:07,192][60143] Updated weights for policy 0, policy_version 41262 (0.0007) +[2023-10-09 05:51:07,524][60144] Updated weights for policy 1, policy_version 41752 (0.0008) +[2023-10-09 05:51:07,568][60143] Updated weights for policy 0, policy_version 41272 (0.0007) +[2023-10-09 05:51:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85032960. Throughput: 0: 1702.0, 1: 1720.2. Samples: 21268740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:51:11,053][59242] Avg episode reward: [(0, '28.680'), (1, '30.490')] +[2023-10-09 05:51:11,506][60144] Updated weights for policy 1, policy_version 41762 (0.0008) +[2023-10-09 05:51:11,586][60143] Updated weights for policy 0, policy_version 41282 (0.0007) +[2023-10-09 05:51:11,874][60144] Updated weights for policy 1, policy_version 41772 (0.0008) +[2023-10-09 05:51:11,946][60143] Updated weights for policy 0, policy_version 41292 (0.0009) +[2023-10-09 05:51:12,244][60144] Updated weights for policy 1, policy_version 41782 (0.0009) +[2023-10-09 05:51:12,323][60143] Updated weights for policy 0, policy_version 41302 (0.0009) +[2023-10-09 05:51:12,611][60144] Updated weights for policy 1, policy_version 41792 (0.0007) +[2023-10-09 05:51:12,699][60143] Updated weights for policy 0, policy_version 41312 (0.0009) +[2023-10-09 05:51:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85098496. Throughput: 0: 1695.1, 1: 1733.1. Samples: 21289580. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:51:16,053][59242] Avg episode reward: [(0, '28.760'), (1, '29.630')] +[2023-10-09 05:51:16,713][60144] Updated weights for policy 1, policy_version 41802 (0.0008) +[2023-10-09 05:51:16,762][60143] Updated weights for policy 0, policy_version 41322 (0.0008) +[2023-10-09 05:51:17,089][60144] Updated weights for policy 1, policy_version 41812 (0.0007) +[2023-10-09 05:51:17,141][60143] Updated weights for policy 0, policy_version 41332 (0.0007) +[2023-10-09 05:51:17,461][60144] Updated weights for policy 1, policy_version 41822 (0.0008) +[2023-10-09 05:51:17,499][60143] Updated weights for policy 0, policy_version 41342 (0.0007) +[2023-10-09 05:51:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85164032. Throughput: 0: 1681.9, 1: 1702.6. Samples: 21298584. Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:21,053][59242] Avg episode reward: [(0, '30.460'), (1, '31.050')] +[2023-10-09 05:51:21,528][60144] Updated weights for policy 1, policy_version 41832 (0.0009) +[2023-10-09 05:51:21,672][60143] Updated weights for policy 0, policy_version 41352 (0.0008) +[2023-10-09 05:51:21,889][60144] Updated weights for policy 1, policy_version 41842 (0.0007) +[2023-10-09 05:51:22,047][60143] Updated weights for policy 0, policy_version 41362 (0.0009) +[2023-10-09 05:51:22,268][60144] Updated weights for policy 1, policy_version 41852 (0.0009) +[2023-10-09 05:51:22,419][60143] Updated weights for policy 0, policy_version 41372 (0.0008) +[2023-10-09 05:51:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85229568. Throughput: 0: 1690.2, 1: 1722.5. Samples: 21319530. 
Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:26,053][59242] Avg episode reward: [(0, '29.250'), (1, '30.190')] +[2023-10-09 05:51:26,160][60144] Updated weights for policy 1, policy_version 41862 (0.0010) +[2023-10-09 05:51:26,517][60144] Updated weights for policy 1, policy_version 41872 (0.0010) +[2023-10-09 05:51:26,665][60143] Updated weights for policy 0, policy_version 41382 (0.0008) +[2023-10-09 05:51:26,882][60144] Updated weights for policy 1, policy_version 41882 (0.0007) +[2023-10-09 05:51:27,031][60143] Updated weights for policy 0, policy_version 41392 (0.0007) +[2023-10-09 05:51:27,401][60143] Updated weights for policy 0, policy_version 41402 (0.0007) +[2023-10-09 05:51:30,718][60144] Updated weights for policy 1, policy_version 41892 (0.0008) +[2023-10-09 05:51:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85295104. Throughput: 0: 1684.1, 1: 1729.3. Samples: 21340758. Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:31,053][59242] Avg episode reward: [(0, '29.490'), (1, '30.970')] +[2023-10-09 05:51:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000041408_42401792.pth... +[2023-10-09 05:51:31,089][60144] Updated weights for policy 1, policy_version 41902 (0.0008) +[2023-10-09 05:51:31,092][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000039840_40796160.pth +[2023-10-09 05:51:31,350][60143] Updated weights for policy 0, policy_version 41412 (0.0007) +[2023-10-09 05:51:31,463][60144] Updated weights for policy 1, policy_version 41912 (0.0009) +[2023-10-09 05:51:31,714][60143] Updated weights for policy 0, policy_version 41422 (0.0007) +[2023-10-09 05:51:31,750][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000041920_42926080.pth... 
+[2023-10-09 05:51:31,790][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000040288_41254912.pth +[2023-10-09 05:51:32,077][60143] Updated weights for policy 0, policy_version 41432 (0.0010) +[2023-10-09 05:51:35,364][60144] Updated weights for policy 1, policy_version 41922 (0.0008) +[2023-10-09 05:51:35,737][60144] Updated weights for policy 1, policy_version 41932 (0.0007) +[2023-10-09 05:51:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85360640. Throughput: 0: 1687.3, 1: 1724.5. Samples: 21350048. Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:36,052][59242] Avg episode reward: [(0, '28.370'), (1, '30.200')] +[2023-10-09 05:51:36,104][60144] Updated weights for policy 1, policy_version 41942 (0.0007) +[2023-10-09 05:51:36,214][60143] Updated weights for policy 0, policy_version 41442 (0.0008) +[2023-10-09 05:51:36,467][60144] Updated weights for policy 1, policy_version 41952 (0.0009) +[2023-10-09 05:51:36,602][60143] Updated weights for policy 0, policy_version 41452 (0.0009) +[2023-10-09 05:51:36,964][60143] Updated weights for policy 0, policy_version 41462 (0.0009) +[2023-10-09 05:51:37,334][60143] Updated weights for policy 0, policy_version 41472 (0.0010) +[2023-10-09 05:51:40,376][60144] Updated weights for policy 1, policy_version 41962 (0.0009) +[2023-10-09 05:51:40,738][60144] Updated weights for policy 1, policy_version 41972 (0.0008) +[2023-10-09 05:51:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 85426176. Throughput: 0: 1684.8, 1: 1731.8. Samples: 21371194. 
Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:41,053][59242] Avg episode reward: [(0, '29.240'), (1, '29.300')] +[2023-10-09 05:51:41,107][60144] Updated weights for policy 1, policy_version 41982 (0.0009) +[2023-10-09 05:51:41,338][60143] Updated weights for policy 0, policy_version 41482 (0.0008) +[2023-10-09 05:51:41,710][60143] Updated weights for policy 0, policy_version 41492 (0.0008) +[2023-10-09 05:51:42,078][60143] Updated weights for policy 0, policy_version 41502 (0.0009) +[2023-10-09 05:51:44,893][60144] Updated weights for policy 1, policy_version 41992 (0.0009) +[2023-10-09 05:51:45,252][60144] Updated weights for policy 1, policy_version 42002 (0.0009) +[2023-10-09 05:51:45,625][60144] Updated weights for policy 1, policy_version 42012 (0.0009) +[2023-10-09 05:51:45,913][60143] Updated weights for policy 0, policy_version 41512 (0.0007) +[2023-10-09 05:51:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 85524480. Throughput: 0: 1691.2, 1: 1712.1. Samples: 21391682. Policy #0 lag: (min: 11.0, avg: 14.0, max: 43.0) +[2023-10-09 05:51:46,053][59242] Avg episode reward: [(0, '29.300'), (1, '28.600')] +[2023-10-09 05:51:46,289][60143] Updated weights for policy 0, policy_version 41522 (0.0010) +[2023-10-09 05:51:46,668][60143] Updated weights for policy 0, policy_version 41532 (0.0009) +[2023-10-09 05:51:49,593][60144] Updated weights for policy 1, policy_version 42022 (0.0010) +[2023-10-09 05:51:49,952][60144] Updated weights for policy 1, policy_version 42032 (0.0008) +[2023-10-09 05:51:50,318][60144] Updated weights for policy 1, policy_version 42042 (0.0007) +[2023-10-09 05:51:50,669][60143] Updated weights for policy 0, policy_version 41542 (0.0007) +[2023-10-09 05:51:51,042][60143] Updated weights for policy 0, policy_version 41552 (0.0008) +[2023-10-09 05:51:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 85590016. 
Throughput: 0: 1688.5, 1: 1737.4. Samples: 21401952. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) +[2023-10-09 05:51:51,053][59242] Avg episode reward: [(0, '28.840'), (1, '27.400')] +[2023-10-09 05:51:51,408][60143] Updated weights for policy 0, policy_version 41562 (0.0009) +[2023-10-09 05:51:54,397][60144] Updated weights for policy 1, policy_version 42052 (0.0009) +[2023-10-09 05:51:54,750][60144] Updated weights for policy 1, policy_version 42062 (0.0009) +[2023-10-09 05:51:55,123][60144] Updated weights for policy 1, policy_version 42072 (0.0010) +[2023-10-09 05:51:55,304][60143] Updated weights for policy 0, policy_version 41572 (0.0009) +[2023-10-09 05:51:55,673][60143] Updated weights for policy 0, policy_version 41582 (0.0009) +[2023-10-09 05:51:56,035][60143] Updated weights for policy 0, policy_version 41592 (0.0009) +[2023-10-09 05:51:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 85655552. Throughput: 0: 1693.9, 1: 1726.8. Samples: 21422668. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) +[2023-10-09 05:51:56,053][59242] Avg episode reward: [(0, '28.190'), (1, '27.650')] +[2023-10-09 05:51:59,161][60144] Updated weights for policy 1, policy_version 42082 (0.0010) +[2023-10-09 05:51:59,543][60144] Updated weights for policy 1, policy_version 42092 (0.0011) +[2023-10-09 05:51:59,913][60144] Updated weights for policy 1, policy_version 42102 (0.0008) +[2023-10-09 05:52:00,082][60143] Updated weights for policy 0, policy_version 41602 (0.0008) +[2023-10-09 05:52:00,275][60144] Updated weights for policy 1, policy_version 42112 (0.0008) +[2023-10-09 05:52:00,453][60143] Updated weights for policy 0, policy_version 41612 (0.0009) +[2023-10-09 05:52:00,823][60143] Updated weights for policy 0, policy_version 41622 (0.0010) +[2023-10-09 05:52:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 85721088. Throughput: 0: 1692.9, 1: 1707.3. Samples: 21442588. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) +[2023-10-09 05:52:01,052][59242] Avg episode reward: [(0, '26.990'), (1, '26.890')] +[2023-10-09 05:52:01,192][60143] Updated weights for policy 0, policy_version 41632 (0.0010) +[2023-10-09 05:52:04,262][60144] Updated weights for policy 1, policy_version 42122 (0.0008) +[2023-10-09 05:52:04,636][60144] Updated weights for policy 1, policy_version 42132 (0.0008) +[2023-10-09 05:52:05,016][60144] Updated weights for policy 1, policy_version 42142 (0.0010) +[2023-10-09 05:52:05,283][60143] Updated weights for policy 0, policy_version 41642 (0.0010) +[2023-10-09 05:52:05,649][60143] Updated weights for policy 0, policy_version 41652 (0.0009) +[2023-10-09 05:52:06,024][60143] Updated weights for policy 0, policy_version 41662 (0.0012) +[2023-10-09 05:52:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 85786624. Throughput: 0: 1705.0, 1: 1743.0. Samples: 21453742. Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) +[2023-10-09 05:52:06,053][59242] Avg episode reward: [(0, '27.400'), (1, '27.280')] +[2023-10-09 05:52:08,867][60144] Updated weights for policy 1, policy_version 42152 (0.0009) +[2023-10-09 05:52:09,234][60144] Updated weights for policy 1, policy_version 42162 (0.0011) +[2023-10-09 05:52:09,612][60144] Updated weights for policy 1, policy_version 42172 (0.0010) +[2023-10-09 05:52:10,125][60143] Updated weights for policy 0, policy_version 41672 (0.0008) +[2023-10-09 05:52:10,495][60143] Updated weights for policy 0, policy_version 41682 (0.0007) +[2023-10-09 05:52:10,876][60143] Updated weights for policy 0, policy_version 41692 (0.0008) +[2023-10-09 05:52:11,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 85884928. Throughput: 0: 1705.9, 1: 1714.1. Samples: 21473428. 
Policy #0 lag: (min: 31.0, avg: 33.1, max: 62.0) +[2023-10-09 05:52:11,053][59242] Avg episode reward: [(0, '28.090'), (1, '26.920')] +[2023-10-09 05:52:13,745][60144] Updated weights for policy 1, policy_version 42182 (0.0011) +[2023-10-09 05:52:14,118][60144] Updated weights for policy 1, policy_version 42192 (0.0008) +[2023-10-09 05:52:14,490][60144] Updated weights for policy 1, policy_version 42202 (0.0008) +[2023-10-09 05:52:14,858][60143] Updated weights for policy 0, policy_version 41702 (0.0009) +[2023-10-09 05:52:15,220][60143] Updated weights for policy 0, policy_version 41712 (0.0009) +[2023-10-09 05:52:15,589][60143] Updated weights for policy 0, policy_version 41722 (0.0009) +[2023-10-09 05:52:16,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 85950464. Throughput: 0: 1695.1, 1: 1700.2. Samples: 21493544. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:16,053][59242] Avg episode reward: [(0, '28.820'), (1, '27.470')] +[2023-10-09 05:52:18,202][60144] Updated weights for policy 1, policy_version 42212 (0.0008) +[2023-10-09 05:52:18,583][60144] Updated weights for policy 1, policy_version 42222 (0.0008) +[2023-10-09 05:52:18,947][60144] Updated weights for policy 1, policy_version 42232 (0.0008) +[2023-10-09 05:52:19,463][60143] Updated weights for policy 0, policy_version 41732 (0.0008) +[2023-10-09 05:52:19,838][60143] Updated weights for policy 0, policy_version 41742 (0.0008) +[2023-10-09 05:52:20,210][60143] Updated weights for policy 0, policy_version 41752 (0.0008) +[2023-10-09 05:52:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 86016000. Throughput: 0: 1716.4, 1: 1722.3. Samples: 21504792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:21,053][59242] Avg episode reward: [(0, '28.700'), (1, '27.730')] +[2023-10-09 05:52:22,911][60144] Updated weights for policy 1, policy_version 42242 (0.0008) +[2023-10-09 05:52:23,284][60144] Updated weights for policy 1, policy_version 42252 (0.0008) +[2023-10-09 05:52:23,645][60144] Updated weights for policy 1, policy_version 42262 (0.0009) +[2023-10-09 05:52:24,008][60144] Updated weights for policy 1, policy_version 42272 (0.0008) +[2023-10-09 05:52:24,296][60143] Updated weights for policy 0, policy_version 41762 (0.0007) +[2023-10-09 05:52:24,710][60143] Updated weights for policy 0, policy_version 41772 (0.0007) +[2023-10-09 05:52:25,076][60143] Updated weights for policy 0, policy_version 41782 (0.0009) +[2023-10-09 05:52:25,444][60143] Updated weights for policy 0, policy_version 41792 (0.0008) +[2023-10-09 05:52:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 86081536. Throughput: 0: 1711.9, 1: 1701.1. Samples: 21524776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:26,053][59242] Avg episode reward: [(0, '29.830'), (1, '27.820')] +[2023-10-09 05:52:27,987][60144] Updated weights for policy 1, policy_version 42282 (0.0008) +[2023-10-09 05:52:28,355][60144] Updated weights for policy 1, policy_version 42292 (0.0008) +[2023-10-09 05:52:28,719][60144] Updated weights for policy 1, policy_version 42302 (0.0009) +[2023-10-09 05:52:29,275][60143] Updated weights for policy 0, policy_version 41802 (0.0011) +[2023-10-09 05:52:29,644][60143] Updated weights for policy 0, policy_version 41812 (0.0009) +[2023-10-09 05:52:30,023][60143] Updated weights for policy 0, policy_version 41822 (0.0007) +[2023-10-09 05:52:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 86147072. Throughput: 0: 1686.2, 1: 1721.6. Samples: 21545032. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:31,053][59242] Avg episode reward: [(0, '29.370'), (1, '28.540')] +[2023-10-09 05:52:32,647][60144] Updated weights for policy 1, policy_version 42312 (0.0009) +[2023-10-09 05:52:33,025][60144] Updated weights for policy 1, policy_version 42322 (0.0007) +[2023-10-09 05:52:33,400][60144] Updated weights for policy 1, policy_version 42332 (0.0008) +[2023-10-09 05:52:33,905][60143] Updated weights for policy 0, policy_version 41832 (0.0007) +[2023-10-09 05:52:34,275][60143] Updated weights for policy 0, policy_version 41842 (0.0008) +[2023-10-09 05:52:34,653][60143] Updated weights for policy 0, policy_version 41852 (0.0007) +[2023-10-09 05:52:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 86212608. Throughput: 0: 1719.9, 1: 1693.8. Samples: 21555572. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:36,053][59242] Avg episode reward: [(0, '30.300'), (1, '26.730')] +[2023-10-09 05:52:37,230][60144] Updated weights for policy 1, policy_version 42342 (0.0007) +[2023-10-09 05:52:37,596][60144] Updated weights for policy 1, policy_version 42352 (0.0007) +[2023-10-09 05:52:37,960][60144] Updated weights for policy 1, policy_version 42362 (0.0009) +[2023-10-09 05:52:38,631][60143] Updated weights for policy 0, policy_version 41862 (0.0008) +[2023-10-09 05:52:38,995][60143] Updated weights for policy 0, policy_version 41872 (0.0008) +[2023-10-09 05:52:39,368][60143] Updated weights for policy 0, policy_version 41882 (0.0007) +[2023-10-09 05:52:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 86278144. Throughput: 0: 1691.0, 1: 1714.3. Samples: 21575904. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:52:41,053][59242] Avg episode reward: [(0, '29.750'), (1, '27.760')] +[2023-10-09 05:52:41,754][60144] Updated weights for policy 1, policy_version 42372 (0.0008) +[2023-10-09 05:52:42,128][60144] Updated weights for policy 1, policy_version 42382 (0.0009) +[2023-10-09 05:52:42,502][60144] Updated weights for policy 1, policy_version 42392 (0.0008) +[2023-10-09 05:52:43,466][60143] Updated weights for policy 0, policy_version 41892 (0.0008) +[2023-10-09 05:52:43,841][60143] Updated weights for policy 0, policy_version 41902 (0.0007) +[2023-10-09 05:52:44,209][60143] Updated weights for policy 0, policy_version 41912 (0.0008) +[2023-10-09 05:52:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 86343680. Throughput: 0: 1693.2, 1: 1738.4. Samples: 21597010. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:52:46,052][59242] Avg episode reward: [(0, '29.790'), (1, '29.340')] +[2023-10-09 05:52:46,612][60144] Updated weights for policy 1, policy_version 42402 (0.0009) +[2023-10-09 05:52:46,978][60144] Updated weights for policy 1, policy_version 42412 (0.0008) +[2023-10-09 05:52:47,345][60144] Updated weights for policy 1, policy_version 42422 (0.0008) +[2023-10-09 05:52:47,715][60144] Updated weights for policy 1, policy_version 42432 (0.0007) +[2023-10-09 05:52:48,166][60143] Updated weights for policy 0, policy_version 41922 (0.0009) +[2023-10-09 05:52:48,534][60143] Updated weights for policy 0, policy_version 41932 (0.0008) +[2023-10-09 05:52:48,909][60143] Updated weights for policy 0, policy_version 41942 (0.0009) +[2023-10-09 05:52:49,282][60143] Updated weights for policy 0, policy_version 41952 (0.0008) +[2023-10-09 05:52:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 86409216. Throughput: 0: 1707.2, 1: 1705.9. Samples: 21607330. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:52:51,053][59242] Avg episode reward: [(0, '28.780'), (1, '27.970')] +[2023-10-09 05:52:51,803][60144] Updated weights for policy 1, policy_version 42442 (0.0009) +[2023-10-09 05:52:52,166][60144] Updated weights for policy 1, policy_version 42452 (0.0009) +[2023-10-09 05:52:52,533][60144] Updated weights for policy 1, policy_version 42462 (0.0008) +[2023-10-09 05:52:53,329][60143] Updated weights for policy 0, policy_version 41962 (0.0010) +[2023-10-09 05:52:53,701][60143] Updated weights for policy 0, policy_version 41972 (0.0011) +[2023-10-09 05:52:54,081][60143] Updated weights for policy 0, policy_version 41982 (0.0009) +[2023-10-09 05:52:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 86474752. Throughput: 0: 1689.2, 1: 1734.1. Samples: 21627474. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:52:56,053][59242] Avg episode reward: [(0, '31.220'), (1, '27.800')] +[2023-10-09 05:52:56,467][60144] Updated weights for policy 1, policy_version 42472 (0.0008) +[2023-10-09 05:52:56,825][60144] Updated weights for policy 1, policy_version 42482 (0.0008) +[2023-10-09 05:52:57,195][60144] Updated weights for policy 1, policy_version 42492 (0.0008) +[2023-10-09 05:52:57,885][60143] Updated weights for policy 0, policy_version 41992 (0.0011) +[2023-10-09 05:52:58,258][60143] Updated weights for policy 0, policy_version 42002 (0.0007) +[2023-10-09 05:52:58,630][60143] Updated weights for policy 0, policy_version 42012 (0.0008) +[2023-10-09 05:53:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 86540288. Throughput: 0: 1710.7, 1: 1747.2. Samples: 21649148. 
Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:53:01,053][59242] Avg episode reward: [(0, '30.070'), (1, '28.000')] +[2023-10-09 05:53:01,211][60144] Updated weights for policy 1, policy_version 42502 (0.0007) +[2023-10-09 05:53:01,575][60144] Updated weights for policy 1, policy_version 42512 (0.0007) +[2023-10-09 05:53:01,941][60144] Updated weights for policy 1, policy_version 42522 (0.0009) +[2023-10-09 05:53:02,785][60143] Updated weights for policy 0, policy_version 42022 (0.0008) +[2023-10-09 05:53:03,149][60143] Updated weights for policy 0, policy_version 42032 (0.0009) +[2023-10-09 05:53:03,518][60143] Updated weights for policy 0, policy_version 42042 (0.0009) +[2023-10-09 05:53:05,767][60144] Updated weights for policy 1, policy_version 42532 (0.0007) +[2023-10-09 05:53:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 86605824. Throughput: 0: 1695.7, 1: 1726.9. Samples: 21658804. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:53:06,053][59242] Avg episode reward: [(0, '29.440'), (1, '28.260')] +[2023-10-09 05:53:06,132][60144] Updated weights for policy 1, policy_version 42542 (0.0008) +[2023-10-09 05:53:06,505][60144] Updated weights for policy 1, policy_version 42552 (0.0007) +[2023-10-09 05:53:07,652][60143] Updated weights for policy 0, policy_version 42052 (0.0008) +[2023-10-09 05:53:08,025][60143] Updated weights for policy 0, policy_version 42062 (0.0009) +[2023-10-09 05:53:08,395][60143] Updated weights for policy 0, policy_version 42072 (0.0008) +[2023-10-09 05:53:10,320][60144] Updated weights for policy 1, policy_version 42562 (0.0008) +[2023-10-09 05:53:10,677][60144] Updated weights for policy 1, policy_version 42572 (0.0010) +[2023-10-09 05:53:11,052][60144] Updated weights for policy 1, policy_version 42582 (0.0008) +[2023-10-09 05:53:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 86671360. 
Throughput: 0: 1694.5, 1: 1746.1. Samples: 21679604. Policy #0 lag: (min: 6.0, avg: 8.8, max: 38.0) +[2023-10-09 05:53:11,053][59242] Avg episode reward: [(0, '30.090'), (1, '27.490')] +[2023-10-09 05:53:11,421][60144] Updated weights for policy 1, policy_version 42592 (0.0009) +[2023-10-09 05:53:12,241][60143] Updated weights for policy 0, policy_version 42082 (0.0009) +[2023-10-09 05:53:12,651][60143] Updated weights for policy 0, policy_version 42092 (0.0011) +[2023-10-09 05:53:13,024][60143] Updated weights for policy 0, policy_version 42102 (0.0009) +[2023-10-09 05:53:13,405][60143] Updated weights for policy 0, policy_version 42112 (0.0009) +[2023-10-09 05:53:15,357][60144] Updated weights for policy 1, policy_version 42602 (0.0007) +[2023-10-09 05:53:15,722][60144] Updated weights for policy 1, policy_version 42612 (0.0009) +[2023-10-09 05:53:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 86736896. Throughput: 0: 1717.8, 1: 1732.0. Samples: 21700270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:16,053][59242] Avg episode reward: [(0, '30.360'), (1, '28.310')] +[2023-10-09 05:53:16,082][60144] Updated weights for policy 1, policy_version 42622 (0.0008) +[2023-10-09 05:53:17,330][60143] Updated weights for policy 0, policy_version 42122 (0.0011) +[2023-10-09 05:53:17,695][60143] Updated weights for policy 0, policy_version 42132 (0.0008) +[2023-10-09 05:53:18,066][60143] Updated weights for policy 0, policy_version 42142 (0.0007) +[2023-10-09 05:53:19,949][60144] Updated weights for policy 1, policy_version 42632 (0.0008) +[2023-10-09 05:53:20,311][60144] Updated weights for policy 1, policy_version 42642 (0.0007) +[2023-10-09 05:53:20,681][60144] Updated weights for policy 1, policy_version 42652 (0.0009) +[2023-10-09 05:53:21,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 86835200. Throughput: 0: 1686.5, 1: 1753.3. Samples: 21710362. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:21,053][59242] Avg episode reward: [(0, '29.770'), (1, '30.050')] +[2023-10-09 05:53:22,123][60143] Updated weights for policy 0, policy_version 42152 (0.0010) +[2023-10-09 05:53:22,494][60143] Updated weights for policy 0, policy_version 42162 (0.0010) +[2023-10-09 05:53:22,867][60143] Updated weights for policy 0, policy_version 42172 (0.0010) +[2023-10-09 05:53:24,639][60144] Updated weights for policy 1, policy_version 42662 (0.0008) +[2023-10-09 05:53:25,004][60144] Updated weights for policy 1, policy_version 42672 (0.0008) +[2023-10-09 05:53:25,373][60144] Updated weights for policy 1, policy_version 42682 (0.0012) +[2023-10-09 05:53:26,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 86900736. Throughput: 0: 1711.1, 1: 1742.7. Samples: 21731322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:26,052][59242] Avg episode reward: [(0, '30.290'), (1, '30.630')] +[2023-10-09 05:53:26,916][60143] Updated weights for policy 0, policy_version 42182 (0.0009) +[2023-10-09 05:53:27,282][60143] Updated weights for policy 0, policy_version 42192 (0.0009) +[2023-10-09 05:53:27,654][60143] Updated weights for policy 0, policy_version 42202 (0.0009) +[2023-10-09 05:53:29,132][60144] Updated weights for policy 1, policy_version 42692 (0.0009) +[2023-10-09 05:53:29,498][60144] Updated weights for policy 1, policy_version 42702 (0.0008) +[2023-10-09 05:53:29,872][60144] Updated weights for policy 1, policy_version 42712 (0.0007) +[2023-10-09 05:53:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 86966272. Throughput: 0: 1718.0, 1: 1717.4. Samples: 21751604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:31,053][59242] Avg episode reward: [(0, '29.820'), (1, '30.910')] +[2023-10-09 05:53:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000042720_43745280.pth... +[2023-10-09 05:53:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000042208_43220992.pth... +[2023-10-09 05:53:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000040640_41615360.pth +[2023-10-09 05:53:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000041088_42074112.pth +[2023-10-09 05:53:31,403][60143] Updated weights for policy 0, policy_version 42212 (0.0009) +[2023-10-09 05:53:31,770][60143] Updated weights for policy 0, policy_version 42222 (0.0008) +[2023-10-09 05:53:32,142][60143] Updated weights for policy 0, policy_version 42232 (0.0010) +[2023-10-09 05:53:33,875][60144] Updated weights for policy 1, policy_version 42722 (0.0010) +[2023-10-09 05:53:34,240][60144] Updated weights for policy 1, policy_version 42732 (0.0008) +[2023-10-09 05:53:34,613][60144] Updated weights for policy 1, policy_version 42742 (0.0010) +[2023-10-09 05:53:34,984][60144] Updated weights for policy 1, policy_version 42752 (0.0010) +[2023-10-09 05:53:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 87031808. Throughput: 0: 1692.8, 1: 1748.0. Samples: 21762166. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:36,053][59242] Avg episode reward: [(0, '31.390'), (1, '31.270')] +[2023-10-09 05:53:36,164][60143] Updated weights for policy 0, policy_version 42242 (0.0009) +[2023-10-09 05:53:36,531][60143] Updated weights for policy 0, policy_version 42252 (0.0008) +[2023-10-09 05:53:36,898][60143] Updated weights for policy 0, policy_version 42262 (0.0008) +[2023-10-09 05:53:37,261][60143] Updated weights for policy 0, policy_version 42272 (0.0007) +[2023-10-09 05:53:39,009][60144] Updated weights for policy 1, policy_version 42762 (0.0009) +[2023-10-09 05:53:39,374][60144] Updated weights for policy 1, policy_version 42772 (0.0011) +[2023-10-09 05:53:39,740][60144] Updated weights for policy 1, policy_version 42782 (0.0010) +[2023-10-09 05:53:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 87097344. Throughput: 0: 1715.1, 1: 1724.7. Samples: 21782264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:53:41,053][59242] Avg episode reward: [(0, '28.770'), (1, '31.920')] +[2023-10-09 05:53:41,236][60143] Updated weights for policy 0, policy_version 42282 (0.0009) +[2023-10-09 05:53:41,604][60143] Updated weights for policy 0, policy_version 42292 (0.0008) +[2023-10-09 05:53:41,973][60143] Updated weights for policy 0, policy_version 42302 (0.0011) +[2023-10-09 05:53:43,677][60144] Updated weights for policy 1, policy_version 42792 (0.0009) +[2023-10-09 05:53:44,046][60144] Updated weights for policy 1, policy_version 42802 (0.0009) +[2023-10-09 05:53:44,413][60144] Updated weights for policy 1, policy_version 42812 (0.0009) +[2023-10-09 05:53:45,931][60143] Updated weights for policy 0, policy_version 42312 (0.0009) +[2023-10-09 05:53:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 87162880. Throughput: 0: 1707.2, 1: 1712.2. Samples: 21803018. 
Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:53:46,053][59242] Avg episode reward: [(0, '28.550'), (1, '31.740')] +[2023-10-09 05:53:46,298][60143] Updated weights for policy 0, policy_version 42322 (0.0008) +[2023-10-09 05:53:46,674][60143] Updated weights for policy 0, policy_version 42332 (0.0009) +[2023-10-09 05:53:48,480][60144] Updated weights for policy 1, policy_version 42822 (0.0009) +[2023-10-09 05:53:48,843][60144] Updated weights for policy 1, policy_version 42832 (0.0008) +[2023-10-09 05:53:49,211][60144] Updated weights for policy 1, policy_version 42842 (0.0009) +[2023-10-09 05:53:50,700][60143] Updated weights for policy 0, policy_version 42342 (0.0007) +[2023-10-09 05:53:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 87228416. Throughput: 0: 1700.3, 1: 1731.1. Samples: 21813220. Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:53:51,053][59242] Avg episode reward: [(0, '28.050'), (1, '31.920')] +[2023-10-09 05:53:51,072][60143] Updated weights for policy 0, policy_version 42352 (0.0009) +[2023-10-09 05:53:51,444][60143] Updated weights for policy 0, policy_version 42362 (0.0010) +[2023-10-09 05:53:53,159][60144] Updated weights for policy 1, policy_version 42852 (0.0010) +[2023-10-09 05:53:53,536][60144] Updated weights for policy 1, policy_version 42862 (0.0011) +[2023-10-09 05:53:53,896][60144] Updated weights for policy 1, policy_version 42872 (0.0010) +[2023-10-09 05:53:55,793][60143] Updated weights for policy 0, policy_version 42372 (0.0008) +[2023-10-09 05:53:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 87293952. Throughput: 0: 1716.2, 1: 1704.7. Samples: 21833544. 
Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:53:56,052][59242] Avg episode reward: [(0, '29.040'), (1, '31.410')] +[2023-10-09 05:53:56,171][60143] Updated weights for policy 0, policy_version 42382 (0.0011) +[2023-10-09 05:53:56,542][60143] Updated weights for policy 0, policy_version 42392 (0.0009) +[2023-10-09 05:53:57,941][60144] Updated weights for policy 1, policy_version 42882 (0.0008) +[2023-10-09 05:53:58,307][60144] Updated weights for policy 1, policy_version 42892 (0.0007) +[2023-10-09 05:53:58,684][60144] Updated weights for policy 1, policy_version 42902 (0.0007) +[2023-10-09 05:53:59,048][60144] Updated weights for policy 1, policy_version 42912 (0.0007) +[2023-10-09 05:54:00,564][60143] Updated weights for policy 0, policy_version 42402 (0.0009) +[2023-10-09 05:54:00,973][60143] Updated weights for policy 0, policy_version 42412 (0.0010) +[2023-10-09 05:54:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 87359488. Throughput: 0: 1710.5, 1: 1719.4. Samples: 21854614. Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:54:01,052][59242] Avg episode reward: [(0, '29.100'), (1, '31.880')] +[2023-10-09 05:54:01,351][60143] Updated weights for policy 0, policy_version 42422 (0.0009) +[2023-10-09 05:54:01,715][60143] Updated weights for policy 0, policy_version 42432 (0.0008) +[2023-10-09 05:54:02,913][60144] Updated weights for policy 1, policy_version 42922 (0.0007) +[2023-10-09 05:54:03,279][60144] Updated weights for policy 1, policy_version 42932 (0.0007) +[2023-10-09 05:54:03,644][60144] Updated weights for policy 1, policy_version 42942 (0.0008) +[2023-10-09 05:54:05,480][60143] Updated weights for policy 0, policy_version 42442 (0.0008) +[2023-10-09 05:54:05,849][60143] Updated weights for policy 0, policy_version 42452 (0.0007) +[2023-10-09 05:54:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 87425024. 
Throughput: 0: 1712.5, 1: 1705.9. Samples: 21864192. Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:54:06,053][59242] Avg episode reward: [(0, '29.200'), (1, '32.330')] +[2023-10-09 05:54:06,225][60143] Updated weights for policy 0, policy_version 42462 (0.0009) +[2023-10-09 05:54:07,610][60144] Updated weights for policy 1, policy_version 42952 (0.0007) +[2023-10-09 05:54:07,970][60144] Updated weights for policy 1, policy_version 42962 (0.0009) +[2023-10-09 05:54:08,348][60144] Updated weights for policy 1, policy_version 42972 (0.0008) +[2023-10-09 05:54:10,190][60143] Updated weights for policy 0, policy_version 42472 (0.0008) +[2023-10-09 05:54:10,555][60143] Updated weights for policy 0, policy_version 42482 (0.0008) +[2023-10-09 05:54:10,936][60143] Updated weights for policy 0, policy_version 42492 (0.0008) +[2023-10-09 05:54:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 87490560. Throughput: 0: 1714.7, 1: 1703.6. Samples: 21885146. Policy #0 lag: (min: 9.0, avg: 20.3, max: 41.0) +[2023-10-09 05:54:11,053][59242] Avg episode reward: [(0, '29.260'), (1, '31.710')] +[2023-10-09 05:54:12,529][60144] Updated weights for policy 1, policy_version 42982 (0.0010) +[2023-10-09 05:54:12,897][60144] Updated weights for policy 1, policy_version 42992 (0.0008) +[2023-10-09 05:54:13,275][60144] Updated weights for policy 1, policy_version 43002 (0.0009) +[2023-10-09 05:54:14,721][60143] Updated weights for policy 0, policy_version 42502 (0.0008) +[2023-10-09 05:54:15,088][60143] Updated weights for policy 0, policy_version 42512 (0.0009) +[2023-10-09 05:54:15,456][60143] Updated weights for policy 0, policy_version 42522 (0.0009) +[2023-10-09 05:54:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 87588864. Throughput: 0: 1697.9, 1: 1724.2. Samples: 21905598. 
Policy #0 lag: (min: 22.0, avg: 27.1, max: 54.0) +[2023-10-09 05:54:16,053][59242] Avg episode reward: [(0, '28.310'), (1, '31.430')] +[2023-10-09 05:54:17,290][60144] Updated weights for policy 1, policy_version 43012 (0.0007) +[2023-10-09 05:54:17,663][60144] Updated weights for policy 1, policy_version 43022 (0.0007) +[2023-10-09 05:54:18,031][60144] Updated weights for policy 1, policy_version 43032 (0.0008) +[2023-10-09 05:54:19,436][60143] Updated weights for policy 0, policy_version 42532 (0.0009) +[2023-10-09 05:54:19,803][60143] Updated weights for policy 0, policy_version 42542 (0.0007) +[2023-10-09 05:54:20,160][60143] Updated weights for policy 0, policy_version 42552 (0.0007) +[2023-10-09 05:54:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 87654400. Throughput: 0: 1724.3, 1: 1693.0. Samples: 21915942. Policy #0 lag: (min: 22.0, avg: 27.1, max: 54.0) +[2023-10-09 05:54:21,053][59242] Avg episode reward: [(0, '27.470'), (1, '32.400')] +[2023-10-09 05:54:21,903][60144] Updated weights for policy 1, policy_version 43042 (0.0010) +[2023-10-09 05:54:22,283][60144] Updated weights for policy 1, policy_version 43052 (0.0007) +[2023-10-09 05:54:22,643][60144] Updated weights for policy 1, policy_version 43062 (0.0009) +[2023-10-09 05:54:23,015][60144] Updated weights for policy 1, policy_version 43072 (0.0008) +[2023-10-09 05:54:24,320][60143] Updated weights for policy 0, policy_version 42562 (0.0008) +[2023-10-09 05:54:24,692][60143] Updated weights for policy 0, policy_version 42572 (0.0008) +[2023-10-09 05:54:25,064][60143] Updated weights for policy 0, policy_version 42582 (0.0009) +[2023-10-09 05:54:25,425][60143] Updated weights for policy 0, policy_version 42592 (0.0008) +[2023-10-09 05:54:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 87719936. Throughput: 0: 1710.8, 1: 1723.9. Samples: 21936824. 
Policy #0 lag: (min: 22.0, avg: 27.1, max: 54.0) +[2023-10-09 05:54:26,053][59242] Avg episode reward: [(0, '28.190'), (1, '31.610')] +[2023-10-09 05:54:27,038][60144] Updated weights for policy 1, policy_version 43082 (0.0008) +[2023-10-09 05:54:27,409][60144] Updated weights for policy 1, policy_version 43092 (0.0007) +[2023-10-09 05:54:27,774][60144] Updated weights for policy 1, policy_version 43102 (0.0010) +[2023-10-09 05:54:29,352][60143] Updated weights for policy 0, policy_version 42602 (0.0008) +[2023-10-09 05:54:29,723][60143] Updated weights for policy 0, policy_version 42612 (0.0007) +[2023-10-09 05:54:30,090][60143] Updated weights for policy 0, policy_version 42622 (0.0012) +[2023-10-09 05:54:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 87785472. Throughput: 0: 1689.1, 1: 1737.2. Samples: 21957202. Policy #0 lag: (min: 22.0, avg: 27.1, max: 54.0) +[2023-10-09 05:54:31,053][59242] Avg episode reward: [(0, '29.370'), (1, '31.200')] +[2023-10-09 05:54:31,460][60144] Updated weights for policy 1, policy_version 43112 (0.0007) +[2023-10-09 05:54:31,836][60144] Updated weights for policy 1, policy_version 43122 (0.0008) +[2023-10-09 05:54:32,205][60144] Updated weights for policy 1, policy_version 43132 (0.0008) +[2023-10-09 05:54:33,907][60143] Updated weights for policy 0, policy_version 42632 (0.0009) +[2023-10-09 05:54:34,276][60143] Updated weights for policy 0, policy_version 42642 (0.0010) +[2023-10-09 05:54:34,649][60143] Updated weights for policy 0, policy_version 42652 (0.0008) +[2023-10-09 05:54:36,050][60144] Updated weights for policy 1, policy_version 43142 (0.0007) +[2023-10-09 05:54:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 87851008. Throughput: 0: 1724.8, 1: 1715.2. Samples: 21968018. 
Policy #0 lag: (min: 22.0, avg: 27.1, max: 54.0) +[2023-10-09 05:54:36,052][59242] Avg episode reward: [(0, '30.810'), (1, '32.570')] +[2023-10-09 05:54:36,426][60144] Updated weights for policy 1, policy_version 43152 (0.0008) +[2023-10-09 05:54:36,795][60144] Updated weights for policy 1, policy_version 43162 (0.0007) +[2023-10-09 05:54:38,786][60143] Updated weights for policy 0, policy_version 42662 (0.0008) +[2023-10-09 05:54:39,154][60143] Updated weights for policy 0, policy_version 42672 (0.0007) +[2023-10-09 05:54:39,522][60143] Updated weights for policy 0, policy_version 42682 (0.0007) +[2023-10-09 05:54:40,654][60144] Updated weights for policy 1, policy_version 43172 (0.0009) +[2023-10-09 05:54:41,020][60144] Updated weights for policy 1, policy_version 43182 (0.0007) +[2023-10-09 05:54:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 87916544. Throughput: 0: 1700.3, 1: 1739.0. Samples: 21988312. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:54:41,053][59242] Avg episode reward: [(0, '29.450'), (1, '33.140')] +[2023-10-09 05:54:41,381][60144] Updated weights for policy 1, policy_version 43192 (0.0007) +[2023-10-09 05:54:41,674][60003] Saving new best policy, reward=33.140! +[2023-10-09 05:54:43,427][60143] Updated weights for policy 0, policy_version 42692 (0.0008) +[2023-10-09 05:54:43,795][60143] Updated weights for policy 0, policy_version 42702 (0.0010) +[2023-10-09 05:54:44,169][60143] Updated weights for policy 0, policy_version 42712 (0.0010) +[2023-10-09 05:54:45,356][60144] Updated weights for policy 1, policy_version 43202 (0.0008) +[2023-10-09 05:54:45,729][60144] Updated weights for policy 1, policy_version 43212 (0.0010) +[2023-10-09 05:54:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 87982080. Throughput: 0: 1695.9, 1: 1731.1. Samples: 22008828. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:54:46,053][59242] Avg episode reward: [(0, '30.340'), (1, '32.720')] +[2023-10-09 05:54:46,096][60144] Updated weights for policy 1, policy_version 43222 (0.0010) +[2023-10-09 05:54:46,462][60144] Updated weights for policy 1, policy_version 43232 (0.0010) +[2023-10-09 05:54:48,163][60143] Updated weights for policy 0, policy_version 42722 (0.0008) +[2023-10-09 05:54:48,571][60143] Updated weights for policy 0, policy_version 42732 (0.0009) +[2023-10-09 05:54:48,944][60143] Updated weights for policy 0, policy_version 42742 (0.0009) +[2023-10-09 05:54:49,311][60143] Updated weights for policy 0, policy_version 42752 (0.0010) +[2023-10-09 05:54:50,538][60144] Updated weights for policy 1, policy_version 43242 (0.0007) +[2023-10-09 05:54:50,905][60144] Updated weights for policy 1, policy_version 43252 (0.0009) +[2023-10-09 05:54:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 88047616. Throughput: 0: 1715.2, 1: 1729.2. Samples: 22019188. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:54:51,053][59242] Avg episode reward: [(0, '31.230'), (1, '31.230')] +[2023-10-09 05:54:51,281][60144] Updated weights for policy 1, policy_version 43262 (0.0007) +[2023-10-09 05:54:53,395][60143] Updated weights for policy 0, policy_version 42762 (0.0007) +[2023-10-09 05:54:53,774][60143] Updated weights for policy 0, policy_version 42772 (0.0007) +[2023-10-09 05:54:54,137][60143] Updated weights for policy 0, policy_version 42782 (0.0009) +[2023-10-09 05:54:55,210][60144] Updated weights for policy 1, policy_version 43272 (0.0011) +[2023-10-09 05:54:55,575][60144] Updated weights for policy 1, policy_version 43282 (0.0009) +[2023-10-09 05:54:55,953][60144] Updated weights for policy 1, policy_version 43292 (0.0010) +[2023-10-09 05:54:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13662.6). Total num frames: 88113152. 
Throughput: 0: 1693.5, 1: 1737.5. Samples: 22039546. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:54:56,054][59242] Avg episode reward: [(0, '31.150'), (1, '31.390')] +[2023-10-09 05:54:58,096][60143] Updated weights for policy 0, policy_version 42792 (0.0010) +[2023-10-09 05:54:58,466][60143] Updated weights for policy 0, policy_version 42802 (0.0010) +[2023-10-09 05:54:58,838][60143] Updated weights for policy 0, policy_version 42812 (0.0009) +[2023-10-09 05:54:59,866][60144] Updated weights for policy 1, policy_version 43302 (0.0007) +[2023-10-09 05:55:00,233][60144] Updated weights for policy 1, policy_version 43312 (0.0008) +[2023-10-09 05:55:00,609][60144] Updated weights for policy 1, policy_version 43322 (0.0008) +[2023-10-09 05:55:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 88211456. Throughput: 0: 1708.9, 1: 1721.0. Samples: 22059942. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:55:01,053][59242] Avg episode reward: [(0, '29.760'), (1, '31.810')] +[2023-10-09 05:55:02,850][60143] Updated weights for policy 0, policy_version 42822 (0.0010) +[2023-10-09 05:55:03,218][60143] Updated weights for policy 0, policy_version 42832 (0.0009) +[2023-10-09 05:55:03,586][60143] Updated weights for policy 0, policy_version 42842 (0.0008) +[2023-10-09 05:55:04,420][60144] Updated weights for policy 1, policy_version 43332 (0.0009) +[2023-10-09 05:55:04,794][60144] Updated weights for policy 1, policy_version 43342 (0.0008) +[2023-10-09 05:55:05,163][60144] Updated weights for policy 1, policy_version 43352 (0.0008) +[2023-10-09 05:55:06,052][59242] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 88276992. Throughput: 0: 1690.6, 1: 1749.6. Samples: 22070752. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 05:55:06,053][59242] Avg episode reward: [(0, '30.090'), (1, '32.700')] +[2023-10-09 05:55:07,634][60143] Updated weights for policy 0, policy_version 42852 (0.0009) +[2023-10-09 05:55:08,001][60143] Updated weights for policy 0, policy_version 42862 (0.0010) +[2023-10-09 05:55:08,368][60143] Updated weights for policy 0, policy_version 42872 (0.0011) +[2023-10-09 05:55:09,062][60144] Updated weights for policy 1, policy_version 43362 (0.0008) +[2023-10-09 05:55:09,421][60144] Updated weights for policy 1, policy_version 43372 (0.0007) +[2023-10-09 05:55:09,787][60144] Updated weights for policy 1, policy_version 43382 (0.0008) +[2023-10-09 05:55:10,155][60144] Updated weights for policy 1, policy_version 43392 (0.0007) +[2023-10-09 05:55:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 88342528. Throughput: 0: 1691.7, 1: 1729.5. Samples: 22090782. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:11,053][59242] Avg episode reward: [(0, '30.650'), (1, '32.350')] +[2023-10-09 05:55:12,685][60143] Updated weights for policy 0, policy_version 42882 (0.0010) +[2023-10-09 05:55:13,057][60143] Updated weights for policy 0, policy_version 42892 (0.0008) +[2023-10-09 05:55:13,425][60143] Updated weights for policy 0, policy_version 42902 (0.0007) +[2023-10-09 05:55:13,796][60143] Updated weights for policy 0, policy_version 42912 (0.0008) +[2023-10-09 05:55:14,220][60144] Updated weights for policy 1, policy_version 43402 (0.0007) +[2023-10-09 05:55:14,593][60144] Updated weights for policy 1, policy_version 43412 (0.0007) +[2023-10-09 05:55:14,966][60144] Updated weights for policy 1, policy_version 43422 (0.0009) +[2023-10-09 05:55:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88408064. Throughput: 0: 1715.9, 1: 1704.9. Samples: 22111138. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:16,053][59242] Avg episode reward: [(0, '31.930'), (1, '31.430')] +[2023-10-09 05:55:17,688][60143] Updated weights for policy 0, policy_version 42922 (0.0007) +[2023-10-09 05:55:18,049][60143] Updated weights for policy 0, policy_version 42932 (0.0008) +[2023-10-09 05:55:18,415][60143] Updated weights for policy 0, policy_version 42942 (0.0010) +[2023-10-09 05:55:18,804][60144] Updated weights for policy 1, policy_version 43432 (0.0007) +[2023-10-09 05:55:19,170][60144] Updated weights for policy 1, policy_version 43442 (0.0007) +[2023-10-09 05:55:19,531][60144] Updated weights for policy 1, policy_version 43452 (0.0011) +[2023-10-09 05:55:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 88473600. Throughput: 0: 1681.2, 1: 1733.1. Samples: 22121660. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:21,052][59242] Avg episode reward: [(0, '31.420'), (1, '33.910')] +[2023-10-09 05:55:21,053][60003] Saving new best policy, reward=33.910! +[2023-10-09 05:55:22,364][60143] Updated weights for policy 0, policy_version 42952 (0.0009) +[2023-10-09 05:55:22,732][60143] Updated weights for policy 0, policy_version 42962 (0.0011) +[2023-10-09 05:55:23,098][60143] Updated weights for policy 0, policy_version 42972 (0.0010) +[2023-10-09 05:55:23,463][60144] Updated weights for policy 1, policy_version 43462 (0.0008) +[2023-10-09 05:55:23,835][60144] Updated weights for policy 1, policy_version 43472 (0.0008) +[2023-10-09 05:55:24,194][60144] Updated weights for policy 1, policy_version 43482 (0.0009) +[2023-10-09 05:55:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88539136. Throughput: 0: 1699.4, 1: 1707.2. Samples: 22141608. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:26,053][59242] Avg episode reward: [(0, '30.880'), (1, '33.870')] +[2023-10-09 05:55:27,140][60143] Updated weights for policy 0, policy_version 42982 (0.0009) +[2023-10-09 05:55:27,525][60143] Updated weights for policy 0, policy_version 42992 (0.0010) +[2023-10-09 05:55:27,895][60143] Updated weights for policy 0, policy_version 43002 (0.0007) +[2023-10-09 05:55:28,198][60144] Updated weights for policy 1, policy_version 43492 (0.0008) +[2023-10-09 05:55:28,558][60144] Updated weights for policy 1, policy_version 43502 (0.0009) +[2023-10-09 05:55:28,924][60144] Updated weights for policy 1, policy_version 43512 (0.0009) +[2023-10-09 05:55:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88604672. Throughput: 0: 1706.7, 1: 1715.2. Samples: 22162810. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:31,053][59242] Avg episode reward: [(0, '31.740'), (1, '32.380')] +[2023-10-09 05:55:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000043520_44564480.pth... +[2023-10-09 05:55:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000043008_44040192.pth... 
+[2023-10-09 05:55:31,093][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000041920_42926080.pth +[2023-10-09 05:55:31,108][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000041408_42401792.pth +[2023-10-09 05:55:31,916][60143] Updated weights for policy 0, policy_version 43012 (0.0010) +[2023-10-09 05:55:32,289][60143] Updated weights for policy 0, policy_version 43022 (0.0008) +[2023-10-09 05:55:32,656][60143] Updated weights for policy 0, policy_version 43032 (0.0011) +[2023-10-09 05:55:32,849][60144] Updated weights for policy 1, policy_version 43522 (0.0009) +[2023-10-09 05:55:33,208][60144] Updated weights for policy 1, policy_version 43532 (0.0008) +[2023-10-09 05:55:33,575][60144] Updated weights for policy 1, policy_version 43542 (0.0007) +[2023-10-09 05:55:33,945][60144] Updated weights for policy 1, policy_version 43552 (0.0009) +[2023-10-09 05:55:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88670208. Throughput: 0: 1686.4, 1: 1725.5. Samples: 22172720. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 05:55:36,053][59242] Avg episode reward: [(0, '32.140'), (1, '32.450')] +[2023-10-09 05:55:36,513][60143] Updated weights for policy 0, policy_version 43042 (0.0008) +[2023-10-09 05:55:36,884][60143] Updated weights for policy 0, policy_version 43052 (0.0009) +[2023-10-09 05:55:37,251][60143] Updated weights for policy 0, policy_version 43062 (0.0007) +[2023-10-09 05:55:37,622][60143] Updated weights for policy 0, policy_version 43072 (0.0008) +[2023-10-09 05:55:38,045][60144] Updated weights for policy 1, policy_version 43562 (0.0008) +[2023-10-09 05:55:38,414][60144] Updated weights for policy 1, policy_version 43572 (0.0008) +[2023-10-09 05:55:38,787][60144] Updated weights for policy 1, policy_version 43582 (0.0010) +[2023-10-09 05:55:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). 
Total num frames: 88735744. Throughput: 0: 1705.5, 1: 1710.8. Samples: 22193278. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:55:41,052][59242] Avg episode reward: [(0, '31.690'), (1, '31.430')] +[2023-10-09 05:55:41,672][60143] Updated weights for policy 0, policy_version 43082 (0.0008) +[2023-10-09 05:55:42,048][60143] Updated weights for policy 0, policy_version 43092 (0.0009) +[2023-10-09 05:55:42,417][60143] Updated weights for policy 0, policy_version 43102 (0.0008) +[2023-10-09 05:55:42,651][60144] Updated weights for policy 1, policy_version 43592 (0.0008) +[2023-10-09 05:55:43,019][60144] Updated weights for policy 1, policy_version 43602 (0.0007) +[2023-10-09 05:55:43,394][60144] Updated weights for policy 1, policy_version 43612 (0.0007) +[2023-10-09 05:55:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88801280. Throughput: 0: 1697.4, 1: 1731.3. Samples: 22214234. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:55:46,053][59242] Avg episode reward: [(0, '30.560'), (1, '30.750')] +[2023-10-09 05:55:46,505][60143] Updated weights for policy 0, policy_version 43112 (0.0010) +[2023-10-09 05:55:46,875][60143] Updated weights for policy 0, policy_version 43122 (0.0010) +[2023-10-09 05:55:47,248][60143] Updated weights for policy 0, policy_version 43132 (0.0009) +[2023-10-09 05:55:47,328][60144] Updated weights for policy 1, policy_version 43622 (0.0009) +[2023-10-09 05:55:47,695][60144] Updated weights for policy 1, policy_version 43632 (0.0009) +[2023-10-09 05:55:48,068][60144] Updated weights for policy 1, policy_version 43642 (0.0008) +[2023-10-09 05:55:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 88866816. Throughput: 0: 1688.5, 1: 1706.1. Samples: 22223512. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:55:51,053][59242] Avg episode reward: [(0, '33.050'), (1, '31.290')] +[2023-10-09 05:55:51,296][60143] Updated weights for policy 0, policy_version 43142 (0.0008) +[2023-10-09 05:55:51,668][60143] Updated weights for policy 0, policy_version 43152 (0.0009) +[2023-10-09 05:55:52,040][60144] Updated weights for policy 1, policy_version 43652 (0.0007) +[2023-10-09 05:55:52,042][60143] Updated weights for policy 0, policy_version 43162 (0.0010) +[2023-10-09 05:55:52,408][60144] Updated weights for policy 1, policy_version 43662 (0.0008) +[2023-10-09 05:55:52,776][60144] Updated weights for policy 1, policy_version 43672 (0.0009) +[2023-10-09 05:55:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 88932352. Throughput: 0: 1697.6, 1: 1720.2. Samples: 22244582. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:55:56,052][59242] Avg episode reward: [(0, '33.000'), (1, '30.570')] +[2023-10-09 05:55:56,117][60143] Updated weights for policy 0, policy_version 43172 (0.0008) +[2023-10-09 05:55:56,484][60143] Updated weights for policy 0, policy_version 43182 (0.0008) +[2023-10-09 05:55:56,726][60144] Updated weights for policy 1, policy_version 43682 (0.0009) +[2023-10-09 05:55:56,854][60143] Updated weights for policy 0, policy_version 43192 (0.0007) +[2023-10-09 05:55:57,089][60144] Updated weights for policy 1, policy_version 43692 (0.0009) +[2023-10-09 05:55:57,464][60144] Updated weights for policy 1, policy_version 43702 (0.0010) +[2023-10-09 05:55:57,831][60144] Updated weights for policy 1, policy_version 43712 (0.0008) +[2023-10-09 05:56:00,993][60143] Updated weights for policy 0, policy_version 43202 (0.0008) +[2023-10-09 05:56:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 88997888. Throughput: 0: 1697.6, 1: 1742.7. Samples: 22265950. 
Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:56:01,053][59242] Avg episode reward: [(0, '30.910'), (1, '30.210')] +[2023-10-09 05:56:01,367][60143] Updated weights for policy 0, policy_version 43212 (0.0009) +[2023-10-09 05:56:01,738][60143] Updated weights for policy 0, policy_version 43222 (0.0008) +[2023-10-09 05:56:01,800][60144] Updated weights for policy 1, policy_version 43722 (0.0009) +[2023-10-09 05:56:02,103][60143] Updated weights for policy 0, policy_version 43232 (0.0008) +[2023-10-09 05:56:02,163][60144] Updated weights for policy 1, policy_version 43732 (0.0010) +[2023-10-09 05:56:02,529][60144] Updated weights for policy 1, policy_version 43742 (0.0010) +[2023-10-09 05:56:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 89063424. Throughput: 0: 1699.1, 1: 1709.6. Samples: 22275052. Policy #0 lag: (min: 31.0, avg: 32.2, max: 55.0) +[2023-10-09 05:56:06,053][59242] Avg episode reward: [(0, '31.590'), (1, '31.090')] +[2023-10-09 05:56:06,074][60143] Updated weights for policy 0, policy_version 43242 (0.0007) +[2023-10-09 05:56:06,453][60143] Updated weights for policy 0, policy_version 43252 (0.0007) +[2023-10-09 05:56:06,703][60144] Updated weights for policy 1, policy_version 43752 (0.0008) +[2023-10-09 05:56:06,820][60143] Updated weights for policy 0, policy_version 43262 (0.0008) +[2023-10-09 05:56:07,066][60144] Updated weights for policy 1, policy_version 43762 (0.0008) +[2023-10-09 05:56:07,427][60144] Updated weights for policy 1, policy_version 43772 (0.0007) +[2023-10-09 05:56:10,900][60143] Updated weights for policy 0, policy_version 43272 (0.0008) +[2023-10-09 05:56:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 89128960. Throughput: 0: 1698.5, 1: 1731.4. Samples: 22295956. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-09 05:56:11,052][59242] Avg episode reward: [(0, '30.460'), (1, '30.670')] +[2023-10-09 05:56:11,269][60143] Updated weights for policy 0, policy_version 43282 (0.0007) +[2023-10-09 05:56:11,356][60144] Updated weights for policy 1, policy_version 43782 (0.0008) +[2023-10-09 05:56:11,643][60143] Updated weights for policy 0, policy_version 43292 (0.0007) +[2023-10-09 05:56:11,728][60144] Updated weights for policy 1, policy_version 43792 (0.0009) +[2023-10-09 05:56:12,086][60144] Updated weights for policy 1, policy_version 43802 (0.0010) +[2023-10-09 05:56:15,632][60143] Updated weights for policy 0, policy_version 43302 (0.0009) +[2023-10-09 05:56:15,825][60144] Updated weights for policy 1, policy_version 43812 (0.0007) +[2023-10-09 05:56:16,002][60143] Updated weights for policy 0, policy_version 43312 (0.0009) +[2023-10-09 05:56:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 89194496. Throughput: 0: 1696.2, 1: 1738.3. Samples: 22317364. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-09 05:56:16,053][59242] Avg episode reward: [(0, '30.650'), (1, '30.970')] +[2023-10-09 05:56:16,182][60144] Updated weights for policy 1, policy_version 43822 (0.0007) +[2023-10-09 05:56:16,370][60143] Updated weights for policy 0, policy_version 43322 (0.0008) +[2023-10-09 05:56:16,557][60144] Updated weights for policy 1, policy_version 43832 (0.0007) +[2023-10-09 05:56:20,322][60143] Updated weights for policy 0, policy_version 43332 (0.0009) +[2023-10-09 05:56:20,347][60144] Updated weights for policy 1, policy_version 43842 (0.0008) +[2023-10-09 05:56:20,682][60143] Updated weights for policy 0, policy_version 43342 (0.0007) +[2023-10-09 05:56:20,717][60144] Updated weights for policy 1, policy_version 43852 (0.0007) +[2023-10-09 05:56:21,050][60143] Updated weights for policy 0, policy_version 43352 (0.0007) +[2023-10-09 05:56:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 89260032. Throughput: 0: 1697.3, 1: 1722.9. Samples: 22326630. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-09 05:56:21,052][59242] Avg episode reward: [(0, '30.250'), (1, '30.330')] +[2023-10-09 05:56:21,081][60144] Updated weights for policy 1, policy_version 43862 (0.0007) +[2023-10-09 05:56:21,448][60144] Updated weights for policy 1, policy_version 43872 (0.0007) +[2023-10-09 05:56:25,068][60143] Updated weights for policy 0, policy_version 43362 (0.0009) +[2023-10-09 05:56:25,359][60144] Updated weights for policy 1, policy_version 43882 (0.0010) +[2023-10-09 05:56:25,436][60143] Updated weights for policy 0, policy_version 43372 (0.0009) +[2023-10-09 05:56:25,723][60144] Updated weights for policy 1, policy_version 43892 (0.0008) +[2023-10-09 05:56:25,807][60143] Updated weights for policy 0, policy_version 43382 (0.0008) +[2023-10-09 05:56:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 89325568. 
Throughput: 0: 1700.3, 1: 1738.7. Samples: 22348038. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-09 05:56:26,053][59242] Avg episode reward: [(0, '30.950'), (1, '29.530')] +[2023-10-09 05:56:26,098][60144] Updated weights for policy 1, policy_version 43902 (0.0009) +[2023-10-09 05:56:26,168][60143] Updated weights for policy 0, policy_version 43392 (0.0009) +[2023-10-09 05:56:30,159][60144] Updated weights for policy 1, policy_version 43912 (0.0008) +[2023-10-09 05:56:30,230][60143] Updated weights for policy 0, policy_version 43402 (0.0009) +[2023-10-09 05:56:30,515][60144] Updated weights for policy 1, policy_version 43922 (0.0009) +[2023-10-09 05:56:30,608][60143] Updated weights for policy 0, policy_version 43412 (0.0010) +[2023-10-09 05:56:30,880][60144] Updated weights for policy 1, policy_version 43932 (0.0009) +[2023-10-09 05:56:30,984][60143] Updated weights for policy 0, policy_version 43422 (0.0008) +[2023-10-09 05:56:31,052][59242] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 89456640. Throughput: 0: 1687.3, 1: 1723.2. Samples: 22367708. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-09 05:56:31,052][59242] Avg episode reward: [(0, '29.390'), (1, '29.790')] +[2023-10-09 05:56:34,855][60144] Updated weights for policy 1, policy_version 43942 (0.0009) +[2023-10-09 05:56:34,893][60143] Updated weights for policy 0, policy_version 43432 (0.0010) +[2023-10-09 05:56:35,208][60144] Updated weights for policy 1, policy_version 43952 (0.0008) +[2023-10-09 05:56:35,260][60143] Updated weights for policy 0, policy_version 43442 (0.0008) +[2023-10-09 05:56:35,581][60144] Updated weights for policy 1, policy_version 43962 (0.0007) +[2023-10-09 05:56:35,636][60143] Updated weights for policy 0, policy_version 43452 (0.0009) +[2023-10-09 05:56:36,052][59242] Fps is (10 sec: 19660.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 89522176. Throughput: 0: 1705.8, 1: 1741.6. Samples: 22378648. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:56:36,053][59242] Avg episode reward: [(0, '30.960'), (1, '30.140')] +[2023-10-09 05:56:39,651][60144] Updated weights for policy 1, policy_version 43972 (0.0009) +[2023-10-09 05:56:39,769][60143] Updated weights for policy 0, policy_version 43462 (0.0007) +[2023-10-09 05:56:40,020][60144] Updated weights for policy 1, policy_version 43982 (0.0008) +[2023-10-09 05:56:40,135][60143] Updated weights for policy 0, policy_version 43472 (0.0008) +[2023-10-09 05:56:40,396][60144] Updated weights for policy 1, policy_version 43992 (0.0009) +[2023-10-09 05:56:40,502][60143] Updated weights for policy 0, policy_version 43482 (0.0009) +[2023-10-09 05:56:41,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 89587712. Throughput: 0: 1700.3, 1: 1740.8. Samples: 22399434. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:56:41,054][59242] Avg episode reward: [(0, '31.810'), (1, '28.400')] +[2023-10-09 05:56:44,282][60144] Updated weights for policy 1, policy_version 44002 (0.0007) +[2023-10-09 05:56:44,422][60143] Updated weights for policy 0, policy_version 43492 (0.0009) +[2023-10-09 05:56:44,649][60144] Updated weights for policy 1, policy_version 44012 (0.0008) +[2023-10-09 05:56:44,789][60143] Updated weights for policy 0, policy_version 43502 (0.0007) +[2023-10-09 05:56:45,016][60144] Updated weights for policy 1, policy_version 44022 (0.0007) +[2023-10-09 05:56:45,161][60143] Updated weights for policy 0, policy_version 43512 (0.0007) +[2023-10-09 05:56:45,379][60144] Updated weights for policy 1, policy_version 44032 (0.0009) +[2023-10-09 05:56:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 89653248. Throughput: 0: 1669.5, 1: 1711.2. Samples: 22418080. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:56:46,053][59242] Avg episode reward: [(0, '32.080'), (1, '27.420')] +[2023-10-09 05:56:49,206][60143] Updated weights for policy 0, policy_version 43522 (0.0008) +[2023-10-09 05:56:49,507][60144] Updated weights for policy 1, policy_version 44042 (0.0008) +[2023-10-09 05:56:49,574][60143] Updated weights for policy 0, policy_version 43532 (0.0008) +[2023-10-09 05:56:49,882][60144] Updated weights for policy 1, policy_version 44052 (0.0008) +[2023-10-09 05:56:49,949][60143] Updated weights for policy 0, policy_version 43542 (0.0007) +[2023-10-09 05:56:50,259][60144] Updated weights for policy 1, policy_version 44062 (0.0007) +[2023-10-09 05:56:50,317][60143] Updated weights for policy 0, policy_version 43552 (0.0008) +[2023-10-09 05:56:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 89718784. Throughput: 0: 1697.5, 1: 1744.0. Samples: 22429918. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:56:51,052][59242] Avg episode reward: [(0, '30.220'), (1, '27.610')] +[2023-10-09 05:56:54,163][60144] Updated weights for policy 1, policy_version 44072 (0.0008) +[2023-10-09 05:56:54,474][60143] Updated weights for policy 0, policy_version 43562 (0.0009) +[2023-10-09 05:56:54,538][60144] Updated weights for policy 1, policy_version 44082 (0.0008) +[2023-10-09 05:56:54,834][60143] Updated weights for policy 0, policy_version 43572 (0.0009) +[2023-10-09 05:56:54,908][60144] Updated weights for policy 1, policy_version 44092 (0.0009) +[2023-10-09 05:56:55,202][60143] Updated weights for policy 0, policy_version 43582 (0.0009) +[2023-10-09 05:56:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 89784320. Throughput: 0: 1693.4, 1: 1725.3. Samples: 22449798. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:56:56,053][59242] Avg episode reward: [(0, '31.420'), (1, '29.670')] +[2023-10-09 05:56:58,845][60144] Updated weights for policy 1, policy_version 44102 (0.0009) +[2023-10-09 05:56:59,005][60143] Updated weights for policy 0, policy_version 43592 (0.0008) +[2023-10-09 05:56:59,219][60144] Updated weights for policy 1, policy_version 44112 (0.0009) +[2023-10-09 05:56:59,375][60143] Updated weights for policy 0, policy_version 43602 (0.0007) +[2023-10-09 05:56:59,591][60144] Updated weights for policy 1, policy_version 44122 (0.0008) +[2023-10-09 05:56:59,740][60143] Updated weights for policy 0, policy_version 43612 (0.0007) +[2023-10-09 05:57:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 89849856. Throughput: 0: 1678.5, 1: 1703.2. Samples: 22469540. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-09 05:57:01,053][59242] Avg episode reward: [(0, '33.220'), (1, '30.060')] +[2023-10-09 05:57:03,634][60144] Updated weights for policy 1, policy_version 44132 (0.0007) +[2023-10-09 05:57:03,777][60143] Updated weights for policy 0, policy_version 43622 (0.0009) +[2023-10-09 05:57:03,998][60144] Updated weights for policy 1, policy_version 44142 (0.0008) +[2023-10-09 05:57:04,139][60143] Updated weights for policy 0, policy_version 43632 (0.0009) +[2023-10-09 05:57:04,365][60144] Updated weights for policy 1, policy_version 44152 (0.0008) +[2023-10-09 05:57:04,502][60143] Updated weights for policy 0, policy_version 43642 (0.0008) +[2023-10-09 05:57:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 89915392. Throughput: 0: 1708.7, 1: 1728.7. Samples: 22481310. 
Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:06,053][59242] Avg episode reward: [(0, '33.530'), (1, '30.810')] +[2023-10-09 05:57:08,416][60144] Updated weights for policy 1, policy_version 44162 (0.0008) +[2023-10-09 05:57:08,488][60143] Updated weights for policy 0, policy_version 43652 (0.0007) +[2023-10-09 05:57:08,791][60144] Updated weights for policy 1, policy_version 44172 (0.0007) +[2023-10-09 05:57:08,861][60143] Updated weights for policy 0, policy_version 43662 (0.0008) +[2023-10-09 05:57:09,156][60144] Updated weights for policy 1, policy_version 44182 (0.0007) +[2023-10-09 05:57:09,229][60143] Updated weights for policy 0, policy_version 43672 (0.0007) +[2023-10-09 05:57:09,524][60144] Updated weights for policy 1, policy_version 44192 (0.0008) +[2023-10-09 05:57:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 89980928. Throughput: 0: 1675.4, 1: 1694.8. Samples: 22499696. Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:11,054][59242] Avg episode reward: [(0, '32.850'), (1, '30.870')] +[2023-10-09 05:57:13,152][60143] Updated weights for policy 0, policy_version 43682 (0.0008) +[2023-10-09 05:57:13,425][60144] Updated weights for policy 1, policy_version 44202 (0.0007) +[2023-10-09 05:57:13,534][60143] Updated weights for policy 0, policy_version 43692 (0.0008) +[2023-10-09 05:57:13,786][60144] Updated weights for policy 1, policy_version 44212 (0.0008) +[2023-10-09 05:57:13,900][60143] Updated weights for policy 0, policy_version 43702 (0.0009) +[2023-10-09 05:57:14,158][60144] Updated weights for policy 1, policy_version 44222 (0.0009) +[2023-10-09 05:57:14,259][60143] Updated weights for policy 0, policy_version 43712 (0.0008) +[2023-10-09 05:57:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 90046464. Throughput: 0: 1698.5, 1: 1706.1. Samples: 22520916. 
Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:16,053][59242] Avg episode reward: [(0, '34.370'), (1, '31.060')] +[2023-10-09 05:57:16,065][59934] Saving new best policy, reward=34.370! +[2023-10-09 05:57:18,202][60144] Updated weights for policy 1, policy_version 44232 (0.0009) +[2023-10-09 05:57:18,454][60143] Updated weights for policy 0, policy_version 43722 (0.0008) +[2023-10-09 05:57:18,565][60144] Updated weights for policy 1, policy_version 44242 (0.0008) +[2023-10-09 05:57:18,830][60143] Updated weights for policy 0, policy_version 43732 (0.0008) +[2023-10-09 05:57:18,928][60144] Updated weights for policy 1, policy_version 44252 (0.0007) +[2023-10-09 05:57:19,189][60143] Updated weights for policy 0, policy_version 43742 (0.0009) +[2023-10-09 05:57:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 90112000. Throughput: 0: 1695.9, 1: 1697.2. Samples: 22531334. Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:21,053][59242] Avg episode reward: [(0, '32.120'), (1, '32.590')] +[2023-10-09 05:57:22,835][60144] Updated weights for policy 1, policy_version 44262 (0.0007) +[2023-10-09 05:57:23,214][60144] Updated weights for policy 1, policy_version 44272 (0.0008) +[2023-10-09 05:57:23,256][60143] Updated weights for policy 0, policy_version 43752 (0.0008) +[2023-10-09 05:57:23,587][60144] Updated weights for policy 1, policy_version 44282 (0.0007) +[2023-10-09 05:57:23,621][60143] Updated weights for policy 0, policy_version 43762 (0.0007) +[2023-10-09 05:57:24,001][60143] Updated weights for policy 0, policy_version 43772 (0.0009) +[2023-10-09 05:57:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 90177536. Throughput: 0: 1682.1, 1: 1684.2. Samples: 22550918. 
Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:26,053][59242] Avg episode reward: [(0, '32.600'), (1, '31.760')] +[2023-10-09 05:57:27,517][60144] Updated weights for policy 1, policy_version 44292 (0.0008) +[2023-10-09 05:57:27,882][60144] Updated weights for policy 1, policy_version 44302 (0.0008) +[2023-10-09 05:57:27,959][60143] Updated weights for policy 0, policy_version 43782 (0.0007) +[2023-10-09 05:57:28,243][60144] Updated weights for policy 1, policy_version 44312 (0.0008) +[2023-10-09 05:57:28,322][60143] Updated weights for policy 0, policy_version 43792 (0.0008) +[2023-10-09 05:57:28,694][60143] Updated weights for policy 0, policy_version 43802 (0.0007) +[2023-10-09 05:57:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90243072. Throughput: 0: 1712.5, 1: 1711.6. Samples: 22572162. Policy #0 lag: (min: 27.0, avg: 27.6, max: 43.0) +[2023-10-09 05:57:31,053][59242] Avg episode reward: [(0, '31.560'), (1, '31.470')] +[2023-10-09 05:57:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000043808_44859392.pth... +[2023-10-09 05:57:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000044320_45383680.pth... 
+[2023-10-09 05:57:31,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000042208_43220992.pth +[2023-10-09 05:57:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000042720_43745280.pth +[2023-10-09 05:57:32,138][60144] Updated weights for policy 1, policy_version 44322 (0.0009) +[2023-10-09 05:57:32,505][60144] Updated weights for policy 1, policy_version 44332 (0.0008) +[2023-10-09 05:57:32,545][60143] Updated weights for policy 0, policy_version 43812 (0.0008) +[2023-10-09 05:57:32,868][60144] Updated weights for policy 1, policy_version 44342 (0.0008) +[2023-10-09 05:57:32,910][60143] Updated weights for policy 0, policy_version 43822 (0.0007) +[2023-10-09 05:57:33,225][60144] Updated weights for policy 1, policy_version 44352 (0.0007) +[2023-10-09 05:57:33,276][60143] Updated weights for policy 0, policy_version 43832 (0.0010) +[2023-10-09 05:57:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90308608. Throughput: 0: 1686.4, 1: 1682.7. Samples: 22581528. Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:57:36,053][59242] Avg episode reward: [(0, '33.130'), (1, '33.370')] +[2023-10-09 05:57:37,133][60144] Updated weights for policy 1, policy_version 44362 (0.0008) +[2023-10-09 05:57:37,397][60143] Updated weights for policy 0, policy_version 43842 (0.0009) +[2023-10-09 05:57:37,495][60144] Updated weights for policy 1, policy_version 44372 (0.0008) +[2023-10-09 05:57:37,766][60143] Updated weights for policy 0, policy_version 43852 (0.0009) +[2023-10-09 05:57:37,866][60144] Updated weights for policy 1, policy_version 44382 (0.0007) +[2023-10-09 05:57:38,124][60143] Updated weights for policy 0, policy_version 43862 (0.0010) +[2023-10-09 05:57:38,494][60143] Updated weights for policy 0, policy_version 43872 (0.0007) +[2023-10-09 05:57:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13662.6). 
Total num frames: 90374144. Throughput: 0: 1682.2, 1: 1710.9. Samples: 22602490. Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:57:41,053][59242] Avg episode reward: [(0, '32.850'), (1, '31.810')] +[2023-10-09 05:57:41,852][60144] Updated weights for policy 1, policy_version 44392 (0.0007) +[2023-10-09 05:57:42,228][60144] Updated weights for policy 1, policy_version 44402 (0.0007) +[2023-10-09 05:57:42,518][60143] Updated weights for policy 0, policy_version 43882 (0.0007) +[2023-10-09 05:57:42,594][60144] Updated weights for policy 1, policy_version 44412 (0.0008) +[2023-10-09 05:57:42,884][60143] Updated weights for policy 0, policy_version 43892 (0.0007) +[2023-10-09 05:57:43,255][60143] Updated weights for policy 0, policy_version 43902 (0.0008) +[2023-10-09 05:57:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90439680. Throughput: 0: 1700.2, 1: 1730.2. Samples: 22623906. Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:57:46,053][59242] Avg episode reward: [(0, '32.790'), (1, '31.490')] +[2023-10-09 05:57:46,519][60144] Updated weights for policy 1, policy_version 44422 (0.0009) +[2023-10-09 05:57:46,894][60144] Updated weights for policy 1, policy_version 44432 (0.0008) +[2023-10-09 05:57:47,254][60144] Updated weights for policy 1, policy_version 44442 (0.0007) +[2023-10-09 05:57:47,282][60143] Updated weights for policy 0, policy_version 43912 (0.0007) +[2023-10-09 05:57:47,656][60143] Updated weights for policy 0, policy_version 43922 (0.0009) +[2023-10-09 05:57:48,026][60143] Updated weights for policy 0, policy_version 43932 (0.0008) +[2023-10-09 05:57:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90505216. Throughput: 0: 1667.9, 1: 1704.5. Samples: 22633068. 
Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:57:51,053][59242] Avg episode reward: [(0, '33.360'), (1, '31.450')] +[2023-10-09 05:57:51,097][60144] Updated weights for policy 1, policy_version 44452 (0.0008) +[2023-10-09 05:57:51,469][60144] Updated weights for policy 1, policy_version 44462 (0.0007) +[2023-10-09 05:57:51,841][60144] Updated weights for policy 1, policy_version 44472 (0.0010) +[2023-10-09 05:57:52,090][60143] Updated weights for policy 0, policy_version 43942 (0.0008) +[2023-10-09 05:57:52,459][60143] Updated weights for policy 0, policy_version 43952 (0.0007) +[2023-10-09 05:57:52,837][60143] Updated weights for policy 0, policy_version 43962 (0.0009) +[2023-10-09 05:57:55,726][60144] Updated weights for policy 1, policy_version 44482 (0.0008) +[2023-10-09 05:57:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90570752. Throughput: 0: 1704.0, 1: 1740.6. Samples: 22654702. Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:57:56,053][59242] Avg episode reward: [(0, '33.260'), (1, '30.280')] +[2023-10-09 05:57:56,092][60144] Updated weights for policy 1, policy_version 44492 (0.0008) +[2023-10-09 05:57:56,468][60144] Updated weights for policy 1, policy_version 44502 (0.0008) +[2023-10-09 05:57:56,798][60143] Updated weights for policy 0, policy_version 43972 (0.0008) +[2023-10-09 05:57:56,829][60144] Updated weights for policy 1, policy_version 44512 (0.0008) +[2023-10-09 05:57:57,170][60143] Updated weights for policy 0, policy_version 43982 (0.0007) +[2023-10-09 05:57:57,542][60143] Updated weights for policy 0, policy_version 43992 (0.0008) +[2023-10-09 05:58:00,707][60144] Updated weights for policy 1, policy_version 44522 (0.0010) +[2023-10-09 05:58:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90636288. Throughput: 0: 1703.5, 1: 1739.7. Samples: 22675862. 
Policy #0 lag: (min: 22.0, avg: 30.0, max: 54.0) +[2023-10-09 05:58:01,053][59242] Avg episode reward: [(0, '31.400'), (1, '30.860')] +[2023-10-09 05:58:01,079][60144] Updated weights for policy 1, policy_version 44532 (0.0007) +[2023-10-09 05:58:01,441][60144] Updated weights for policy 1, policy_version 44542 (0.0009) +[2023-10-09 05:58:01,485][60143] Updated weights for policy 0, policy_version 44002 (0.0007) +[2023-10-09 05:58:01,858][60143] Updated weights for policy 0, policy_version 44012 (0.0008) +[2023-10-09 05:58:02,237][60143] Updated weights for policy 0, policy_version 44022 (0.0008) +[2023-10-09 05:58:02,605][60143] Updated weights for policy 0, policy_version 44032 (0.0008) +[2023-10-09 05:58:05,329][60144] Updated weights for policy 1, policy_version 44552 (0.0008) +[2023-10-09 05:58:05,699][60144] Updated weights for policy 1, policy_version 44562 (0.0009) +[2023-10-09 05:58:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 90701824. Throughput: 0: 1689.1, 1: 1736.4. Samples: 22685482. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:06,053][59242] Avg episode reward: [(0, '30.740'), (1, '31.630')] +[2023-10-09 05:58:06,063][60144] Updated weights for policy 1, policy_version 44572 (0.0012) +[2023-10-09 05:58:06,721][60143] Updated weights for policy 0, policy_version 44042 (0.0009) +[2023-10-09 05:58:07,092][60143] Updated weights for policy 0, policy_version 44052 (0.0011) +[2023-10-09 05:58:07,472][60143] Updated weights for policy 0, policy_version 44062 (0.0011) +[2023-10-09 05:58:10,052][60144] Updated weights for policy 1, policy_version 44582 (0.0011) +[2023-10-09 05:58:10,423][60144] Updated weights for policy 1, policy_version 44592 (0.0009) +[2023-10-09 05:58:10,787][60144] Updated weights for policy 1, policy_version 44602 (0.0010) +[2023-10-09 05:58:11,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 90800128. 
Throughput: 0: 1706.6, 1: 1752.1. Samples: 22706558. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:11,053][59242] Avg episode reward: [(0, '30.930'), (1, '30.490')] +[2023-10-09 05:58:11,411][60143] Updated weights for policy 0, policy_version 44072 (0.0009) +[2023-10-09 05:58:11,777][60143] Updated weights for policy 0, policy_version 44082 (0.0008) +[2023-10-09 05:58:12,153][60143] Updated weights for policy 0, policy_version 44092 (0.0007) +[2023-10-09 05:58:14,632][60144] Updated weights for policy 1, policy_version 44612 (0.0008) +[2023-10-09 05:58:15,004][60144] Updated weights for policy 1, policy_version 44622 (0.0008) +[2023-10-09 05:58:15,366][60144] Updated weights for policy 1, policy_version 44632 (0.0009) +[2023-10-09 05:58:16,033][60143] Updated weights for policy 0, policy_version 44102 (0.0008) +[2023-10-09 05:58:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 90865664. Throughput: 0: 1714.4, 1: 1724.8. Samples: 22726924. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:16,052][59242] Avg episode reward: [(0, '31.670'), (1, '30.480')] +[2023-10-09 05:58:16,398][60143] Updated weights for policy 0, policy_version 44112 (0.0008) +[2023-10-09 05:58:16,771][60143] Updated weights for policy 0, policy_version 44122 (0.0007) +[2023-10-09 05:58:19,364][60144] Updated weights for policy 1, policy_version 44642 (0.0009) +[2023-10-09 05:58:19,733][60144] Updated weights for policy 1, policy_version 44652 (0.0009) +[2023-10-09 05:58:20,098][60144] Updated weights for policy 1, policy_version 44662 (0.0008) +[2023-10-09 05:58:20,469][60144] Updated weights for policy 1, policy_version 44672 (0.0008) +[2023-10-09 05:58:20,605][60143] Updated weights for policy 0, policy_version 44132 (0.0007) +[2023-10-09 05:58:20,979][60143] Updated weights for policy 0, policy_version 44142 (0.0008) +[2023-10-09 05:58:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 90931200. Throughput: 0: 1712.1, 1: 1755.0. Samples: 22737548. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:21,052][59242] Avg episode reward: [(0, '31.290'), (1, '29.200')] +[2023-10-09 05:58:21,346][60143] Updated weights for policy 0, policy_version 44152 (0.0007) +[2023-10-09 05:58:24,445][60144] Updated weights for policy 1, policy_version 44682 (0.0010) +[2023-10-09 05:58:24,812][60144] Updated weights for policy 1, policy_version 44692 (0.0007) +[2023-10-09 05:58:25,183][60144] Updated weights for policy 1, policy_version 44702 (0.0008) +[2023-10-09 05:58:25,218][60143] Updated weights for policy 0, policy_version 44162 (0.0010) +[2023-10-09 05:58:25,581][60143] Updated weights for policy 0, policy_version 44172 (0.0008) +[2023-10-09 05:58:25,958][60143] Updated weights for policy 0, policy_version 44182 (0.0010) +[2023-10-09 05:58:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 90996736. 
Throughput: 0: 1729.5, 1: 1736.9. Samples: 22758476. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:26,053][59242] Avg episode reward: [(0, '30.370'), (1, '28.310')] +[2023-10-09 05:58:26,342][60143] Updated weights for policy 0, policy_version 44192 (0.0012) +[2023-10-09 05:58:29,072][60144] Updated weights for policy 1, policy_version 44712 (0.0010) +[2023-10-09 05:58:29,446][60144] Updated weights for policy 1, policy_version 44722 (0.0010) +[2023-10-09 05:58:29,825][60144] Updated weights for policy 1, policy_version 44732 (0.0008) +[2023-10-09 05:58:30,311][60143] Updated weights for policy 0, policy_version 44202 (0.0009) +[2023-10-09 05:58:30,680][60143] Updated weights for policy 0, policy_version 44212 (0.0011) +[2023-10-09 05:58:31,039][60143] Updated weights for policy 0, policy_version 44222 (0.0011) +[2023-10-09 05:58:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 91062272. Throughput: 0: 1722.1, 1: 1716.1. Samples: 22778624. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) +[2023-10-09 05:58:31,053][59242] Avg episode reward: [(0, '28.980'), (1, '28.800')] +[2023-10-09 05:58:33,783][60144] Updated weights for policy 1, policy_version 44742 (0.0008) +[2023-10-09 05:58:34,144][60144] Updated weights for policy 1, policy_version 44752 (0.0010) +[2023-10-09 05:58:34,525][60144] Updated weights for policy 1, policy_version 44762 (0.0008) +[2023-10-09 05:58:35,104][60143] Updated weights for policy 0, policy_version 44232 (0.0008) +[2023-10-09 05:58:35,476][60143] Updated weights for policy 0, policy_version 44242 (0.0010) +[2023-10-09 05:58:35,847][60143] Updated weights for policy 0, policy_version 44252 (0.0011) +[2023-10-09 05:58:36,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91160576. Throughput: 0: 1737.0, 1: 1746.1. Samples: 22789806. 
Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:58:36,052][59242] Avg episode reward: [(0, '29.270'), (1, '27.940')] +[2023-10-09 05:58:38,471][60144] Updated weights for policy 1, policy_version 44772 (0.0008) +[2023-10-09 05:58:38,839][60144] Updated weights for policy 1, policy_version 44782 (0.0008) +[2023-10-09 05:58:39,211][60144] Updated weights for policy 1, policy_version 44792 (0.0008) +[2023-10-09 05:58:39,930][60143] Updated weights for policy 0, policy_version 44262 (0.0008) +[2023-10-09 05:58:40,304][60143] Updated weights for policy 0, policy_version 44272 (0.0008) +[2023-10-09 05:58:40,666][60143] Updated weights for policy 0, policy_version 44282 (0.0008) +[2023-10-09 05:58:41,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91226112. Throughput: 0: 1730.0, 1: 1713.8. Samples: 22809674. Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:58:41,053][59242] Avg episode reward: [(0, '29.890'), (1, '30.240')] +[2023-10-09 05:58:43,383][60144] Updated weights for policy 1, policy_version 44802 (0.0010) +[2023-10-09 05:58:43,740][60144] Updated weights for policy 1, policy_version 44812 (0.0007) +[2023-10-09 05:58:44,114][60144] Updated weights for policy 1, policy_version 44822 (0.0008) +[2023-10-09 05:58:44,476][60144] Updated weights for policy 1, policy_version 44832 (0.0009) +[2023-10-09 05:58:44,506][60143] Updated weights for policy 0, policy_version 44292 (0.0008) +[2023-10-09 05:58:44,885][60143] Updated weights for policy 0, policy_version 44302 (0.0008) +[2023-10-09 05:58:45,249][60143] Updated weights for policy 0, policy_version 44312 (0.0010) +[2023-10-09 05:58:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91291648. Throughput: 0: 1707.2, 1: 1708.5. Samples: 22829568. 
Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:58:46,053][59242] Avg episode reward: [(0, '29.560'), (1, '29.570')] +[2023-10-09 05:58:48,327][60144] Updated weights for policy 1, policy_version 44842 (0.0008) +[2023-10-09 05:58:48,684][60144] Updated weights for policy 1, policy_version 44852 (0.0009) +[2023-10-09 05:58:49,056][60144] Updated weights for policy 1, policy_version 44862 (0.0009) +[2023-10-09 05:58:49,069][60143] Updated weights for policy 0, policy_version 44322 (0.0010) +[2023-10-09 05:58:49,448][60143] Updated weights for policy 0, policy_version 44332 (0.0009) +[2023-10-09 05:58:49,821][60143] Updated weights for policy 0, policy_version 44342 (0.0008) +[2023-10-09 05:58:50,195][60143] Updated weights for policy 0, policy_version 44352 (0.0010) +[2023-10-09 05:58:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91357184. Throughput: 0: 1733.9, 1: 1716.1. Samples: 22840734. Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:58:51,053][59242] Avg episode reward: [(0, '29.440'), (1, '30.070')] +[2023-10-09 05:58:53,011][60144] Updated weights for policy 1, policy_version 44872 (0.0010) +[2023-10-09 05:58:53,374][60144] Updated weights for policy 1, policy_version 44882 (0.0008) +[2023-10-09 05:58:53,744][60144] Updated weights for policy 1, policy_version 44892 (0.0011) +[2023-10-09 05:58:54,402][60143] Updated weights for policy 0, policy_version 44362 (0.0011) +[2023-10-09 05:58:54,774][60143] Updated weights for policy 0, policy_version 44372 (0.0008) +[2023-10-09 05:58:55,142][60143] Updated weights for policy 0, policy_version 44382 (0.0008) +[2023-10-09 05:58:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91422720. Throughput: 0: 1723.8, 1: 1698.3. Samples: 22860554. 
Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:58:56,053][59242] Avg episode reward: [(0, '28.830'), (1, '28.770')] +[2023-10-09 05:58:57,855][60144] Updated weights for policy 1, policy_version 44902 (0.0009) +[2023-10-09 05:58:58,222][60144] Updated weights for policy 1, policy_version 44912 (0.0008) +[2023-10-09 05:58:58,589][60144] Updated weights for policy 1, policy_version 44922 (0.0008) +[2023-10-09 05:58:59,149][60143] Updated weights for policy 0, policy_version 44392 (0.0011) +[2023-10-09 05:58:59,520][60143] Updated weights for policy 0, policy_version 44402 (0.0009) +[2023-10-09 05:58:59,887][60143] Updated weights for policy 0, policy_version 44412 (0.0010) +[2023-10-09 05:59:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 91488256. Throughput: 0: 1697.7, 1: 1726.8. Samples: 22881028. Policy #0 lag: (min: 8.0, avg: 22.7, max: 40.0) +[2023-10-09 05:59:01,053][59242] Avg episode reward: [(0, '28.230'), (1, '29.680')] +[2023-10-09 05:59:02,385][60144] Updated weights for policy 1, policy_version 44932 (0.0008) +[2023-10-09 05:59:02,750][60144] Updated weights for policy 1, policy_version 44942 (0.0008) +[2023-10-09 05:59:03,112][60144] Updated weights for policy 1, policy_version 44952 (0.0009) +[2023-10-09 05:59:03,917][60143] Updated weights for policy 0, policy_version 44422 (0.0008) +[2023-10-09 05:59:04,283][60143] Updated weights for policy 0, policy_version 44432 (0.0010) +[2023-10-09 05:59:04,643][60143] Updated weights for policy 0, policy_version 44442 (0.0011) +[2023-10-09 05:59:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 91553792. Throughput: 0: 1728.0, 1: 1696.6. Samples: 22891654. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:06,053][59242] Avg episode reward: [(0, '28.650'), (1, '29.920')] +[2023-10-09 05:59:07,019][60144] Updated weights for policy 1, policy_version 44962 (0.0009) +[2023-10-09 05:59:07,389][60144] Updated weights for policy 1, policy_version 44972 (0.0007) +[2023-10-09 05:59:07,755][60144] Updated weights for policy 1, policy_version 44982 (0.0008) +[2023-10-09 05:59:08,128][60144] Updated weights for policy 1, policy_version 44992 (0.0008) +[2023-10-09 05:59:08,506][60143] Updated weights for policy 0, policy_version 44452 (0.0009) +[2023-10-09 05:59:08,877][60143] Updated weights for policy 0, policy_version 44462 (0.0009) +[2023-10-09 05:59:09,252][60143] Updated weights for policy 0, policy_version 44472 (0.0011) +[2023-10-09 05:59:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 91619328. Throughput: 0: 1694.7, 1: 1712.5. Samples: 22911796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:11,053][59242] Avg episode reward: [(0, '29.620'), (1, '30.100')] +[2023-10-09 05:59:12,018][60144] Updated weights for policy 1, policy_version 45002 (0.0010) +[2023-10-09 05:59:12,391][60144] Updated weights for policy 1, policy_version 45012 (0.0010) +[2023-10-09 05:59:12,755][60144] Updated weights for policy 1, policy_version 45022 (0.0007) +[2023-10-09 05:59:13,259][60143] Updated weights for policy 0, policy_version 44482 (0.0010) +[2023-10-09 05:59:13,626][60143] Updated weights for policy 0, policy_version 44492 (0.0010) +[2023-10-09 05:59:14,004][60143] Updated weights for policy 0, policy_version 44502 (0.0011) +[2023-10-09 05:59:14,385][60143] Updated weights for policy 0, policy_version 44512 (0.0010) +[2023-10-09 05:59:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 91684864. Throughput: 0: 1702.1, 1: 1731.2. Samples: 22933122. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:16,053][59242] Avg episode reward: [(0, '29.860'), (1, '30.160')] +[2023-10-09 05:59:16,806][60144] Updated weights for policy 1, policy_version 45032 (0.0009) +[2023-10-09 05:59:17,180][60144] Updated weights for policy 1, policy_version 45042 (0.0007) +[2023-10-09 05:59:17,548][60144] Updated weights for policy 1, policy_version 45052 (0.0008) +[2023-10-09 05:59:18,137][60143] Updated weights for policy 0, policy_version 44522 (0.0009) +[2023-10-09 05:59:18,514][60143] Updated weights for policy 0, policy_version 44532 (0.0008) +[2023-10-09 05:59:18,879][60143] Updated weights for policy 0, policy_version 44542 (0.0007) +[2023-10-09 05:59:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 91750400. Throughput: 0: 1705.8, 1: 1701.9. Samples: 22943152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:21,053][59242] Avg episode reward: [(0, '29.900'), (1, '30.370')] +[2023-10-09 05:59:21,444][60144] Updated weights for policy 1, policy_version 45062 (0.0008) +[2023-10-09 05:59:21,821][60144] Updated weights for policy 1, policy_version 45072 (0.0007) +[2023-10-09 05:59:22,195][60144] Updated weights for policy 1, policy_version 45082 (0.0008) +[2023-10-09 05:59:22,900][60143] Updated weights for policy 0, policy_version 44552 (0.0009) +[2023-10-09 05:59:23,265][60143] Updated weights for policy 0, policy_version 44562 (0.0008) +[2023-10-09 05:59:23,636][60143] Updated weights for policy 0, policy_version 44572 (0.0007) +[2023-10-09 05:59:26,014][60144] Updated weights for policy 1, policy_version 45092 (0.0009) +[2023-10-09 05:59:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 91815936. Throughput: 0: 1697.0, 1: 1728.9. Samples: 22963842. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:26,052][59242] Avg episode reward: [(0, '30.710'), (1, '29.590')] +[2023-10-09 05:59:26,381][60144] Updated weights for policy 1, policy_version 45102 (0.0007) +[2023-10-09 05:59:26,744][60144] Updated weights for policy 1, policy_version 45112 (0.0007) +[2023-10-09 05:59:27,628][60143] Updated weights for policy 0, policy_version 44582 (0.0008) +[2023-10-09 05:59:28,001][60143] Updated weights for policy 0, policy_version 44592 (0.0008) +[2023-10-09 05:59:28,375][60143] Updated weights for policy 0, policy_version 44602 (0.0011) +[2023-10-09 05:59:30,816][60144] Updated weights for policy 1, policy_version 45122 (0.0008) +[2023-10-09 05:59:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 91881472. Throughput: 0: 1722.7, 1: 1734.0. Samples: 22985120. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:31,052][59242] Avg episode reward: [(0, '31.250'), (1, '29.990')] +[2023-10-09 05:59:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000044608_45678592.pth... +[2023-10-09 05:59:31,106][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000043008_44040192.pth +[2023-10-09 05:59:31,181][60144] Updated weights for policy 1, policy_version 45132 (0.0010) +[2023-10-09 05:59:31,541][60144] Updated weights for policy 1, policy_version 45142 (0.0007) +[2023-10-09 05:59:31,904][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000045152_46235648.pth... 
+[2023-10-09 05:59:31,910][60144] Updated weights for policy 1, policy_version 45152 (0.0007) +[2023-10-09 05:59:31,933][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000043520_44564480.pth +[2023-10-09 05:59:32,231][60143] Updated weights for policy 0, policy_version 44612 (0.0010) +[2023-10-09 05:59:32,601][60143] Updated weights for policy 0, policy_version 44622 (0.0009) +[2023-10-09 05:59:32,968][60143] Updated weights for policy 0, policy_version 44632 (0.0010) +[2023-10-09 05:59:35,711][60144] Updated weights for policy 1, policy_version 45162 (0.0009) +[2023-10-09 05:59:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 91947008. Throughput: 0: 1700.1, 1: 1722.1. Samples: 22994736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 05:59:36,053][59242] Avg episode reward: [(0, '31.510'), (1, '31.010')] +[2023-10-09 05:59:36,073][60144] Updated weights for policy 1, policy_version 45172 (0.0009) +[2023-10-09 05:59:36,439][60144] Updated weights for policy 1, policy_version 45182 (0.0011) +[2023-10-09 05:59:36,912][60143] Updated weights for policy 0, policy_version 44642 (0.0008) +[2023-10-09 05:59:37,277][60143] Updated weights for policy 0, policy_version 44652 (0.0008) +[2023-10-09 05:59:37,650][60143] Updated weights for policy 0, policy_version 44662 (0.0008) +[2023-10-09 05:59:38,019][60143] Updated weights for policy 0, policy_version 44672 (0.0009) +[2023-10-09 05:59:40,480][60144] Updated weights for policy 1, policy_version 45192 (0.0009) +[2023-10-09 05:59:40,858][60144] Updated weights for policy 1, policy_version 45202 (0.0009) +[2023-10-09 05:59:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 92012544. Throughput: 0: 1711.7, 1: 1739.6. Samples: 23015866. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:59:41,053][59242] Avg episode reward: [(0, '30.730'), (1, '30.870')] +[2023-10-09 05:59:41,219][60144] Updated weights for policy 1, policy_version 45212 (0.0009) +[2023-10-09 05:59:42,172][60143] Updated weights for policy 0, policy_version 44682 (0.0008) +[2023-10-09 05:59:42,534][60143] Updated weights for policy 0, policy_version 44692 (0.0009) +[2023-10-09 05:59:42,911][60143] Updated weights for policy 0, policy_version 44702 (0.0009) +[2023-10-09 05:59:45,183][60144] Updated weights for policy 1, policy_version 45222 (0.0009) +[2023-10-09 05:59:45,545][60144] Updated weights for policy 1, policy_version 45232 (0.0009) +[2023-10-09 05:59:45,912][60144] Updated weights for policy 1, policy_version 45242 (0.0010) +[2023-10-09 05:59:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 92078080. Throughput: 0: 1729.6, 1: 1724.2. Samples: 23036450. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:59:46,053][59242] Avg episode reward: [(0, '31.140'), (1, '31.650')] +[2023-10-09 05:59:47,008][60143] Updated weights for policy 0, policy_version 44712 (0.0009) +[2023-10-09 05:59:47,376][60143] Updated weights for policy 0, policy_version 44722 (0.0008) +[2023-10-09 05:59:47,748][60143] Updated weights for policy 0, policy_version 44732 (0.0008) +[2023-10-09 05:59:49,985][60144] Updated weights for policy 1, policy_version 45252 (0.0010) +[2023-10-09 05:59:50,348][60144] Updated weights for policy 1, policy_version 45262 (0.0011) +[2023-10-09 05:59:50,712][60144] Updated weights for policy 1, policy_version 45272 (0.0010) +[2023-10-09 05:59:51,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 92176384. Throughput: 0: 1698.0, 1: 1745.6. Samples: 23046616. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:59:51,052][59242] Avg episode reward: [(0, '30.930'), (1, '32.180')] +[2023-10-09 05:59:51,688][60143] Updated weights for policy 0, policy_version 44742 (0.0011) +[2023-10-09 05:59:52,063][60143] Updated weights for policy 0, policy_version 44752 (0.0010) +[2023-10-09 05:59:52,437][60143] Updated weights for policy 0, policy_version 44762 (0.0009) +[2023-10-09 05:59:54,741][60144] Updated weights for policy 1, policy_version 45282 (0.0009) +[2023-10-09 05:59:55,106][60144] Updated weights for policy 1, policy_version 45292 (0.0010) +[2023-10-09 05:59:55,476][60144] Updated weights for policy 1, policy_version 45302 (0.0009) +[2023-10-09 05:59:55,847][60144] Updated weights for policy 1, policy_version 45312 (0.0007) +[2023-10-09 05:59:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92241920. Throughput: 0: 1722.1, 1: 1739.9. Samples: 23067588. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 05:59:56,053][59242] Avg episode reward: [(0, '30.870'), (1, '32.230')] +[2023-10-09 05:59:56,388][60143] Updated weights for policy 0, policy_version 44772 (0.0010) +[2023-10-09 05:59:56,762][60143] Updated weights for policy 0, policy_version 44782 (0.0009) +[2023-10-09 05:59:57,133][60143] Updated weights for policy 0, policy_version 44792 (0.0008) +[2023-10-09 05:59:59,762][60144] Updated weights for policy 1, policy_version 45322 (0.0009) +[2023-10-09 06:00:00,141][60144] Updated weights for policy 1, policy_version 45332 (0.0008) +[2023-10-09 06:00:00,504][60144] Updated weights for policy 1, policy_version 45342 (0.0008) +[2023-10-09 06:00:00,979][60143] Updated weights for policy 0, policy_version 44802 (0.0008) +[2023-10-09 06:00:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92307456. Throughput: 0: 1726.7, 1: 1709.6. Samples: 23087754. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:00:01,053][59242] Avg episode reward: [(0, '30.820'), (1, '32.300')] +[2023-10-09 06:00:01,352][60143] Updated weights for policy 0, policy_version 44812 (0.0009) +[2023-10-09 06:00:01,735][60143] Updated weights for policy 0, policy_version 44822 (0.0008) +[2023-10-09 06:00:02,104][60143] Updated weights for policy 0, policy_version 44832 (0.0009) +[2023-10-09 06:00:04,483][60144] Updated weights for policy 1, policy_version 45352 (0.0009) +[2023-10-09 06:00:04,861][60144] Updated weights for policy 1, policy_version 45362 (0.0008) +[2023-10-09 06:00:05,227][60144] Updated weights for policy 1, policy_version 45372 (0.0009) +[2023-10-09 06:00:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 92372992. Throughput: 0: 1704.6, 1: 1741.1. Samples: 23098208. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:00:06,052][59242] Avg episode reward: [(0, '31.110'), (1, '31.770')] +[2023-10-09 06:00:06,264][60143] Updated weights for policy 0, policy_version 44842 (0.0008) +[2023-10-09 06:00:06,630][60143] Updated weights for policy 0, policy_version 44852 (0.0010) +[2023-10-09 06:00:07,002][60143] Updated weights for policy 0, policy_version 44862 (0.0007) +[2023-10-09 06:00:09,020][60144] Updated weights for policy 1, policy_version 45382 (0.0011) +[2023-10-09 06:00:09,390][60144] Updated weights for policy 1, policy_version 45392 (0.0010) +[2023-10-09 06:00:09,758][60144] Updated weights for policy 1, policy_version 45402 (0.0009) +[2023-10-09 06:00:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92438528. Throughput: 0: 1708.4, 1: 1720.9. Samples: 23118162. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:00:11,053][59242] Avg episode reward: [(0, '32.280'), (1, '30.900')] +[2023-10-09 06:00:11,064][60143] Updated weights for policy 0, policy_version 44872 (0.0009) +[2023-10-09 06:00:11,430][60143] Updated weights for policy 0, policy_version 44882 (0.0007) +[2023-10-09 06:00:11,802][60143] Updated weights for policy 0, policy_version 44892 (0.0009) +[2023-10-09 06:00:13,545][60144] Updated weights for policy 1, policy_version 45412 (0.0009) +[2023-10-09 06:00:13,916][60144] Updated weights for policy 1, policy_version 45422 (0.0008) +[2023-10-09 06:00:14,273][60144] Updated weights for policy 1, policy_version 45432 (0.0009) +[2023-10-09 06:00:15,906][60143] Updated weights for policy 0, policy_version 44902 (0.0009) +[2023-10-09 06:00:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92504064. Throughput: 0: 1704.3, 1: 1714.0. Samples: 23138944. Policy #0 lag: (min: 7.0, avg: 9.7, max: 39.0) +[2023-10-09 06:00:16,053][59242] Avg episode reward: [(0, '31.030'), (1, '31.770')] +[2023-10-09 06:00:16,288][60143] Updated weights for policy 0, policy_version 44912 (0.0009) +[2023-10-09 06:00:16,652][60143] Updated weights for policy 0, policy_version 44922 (0.0007) +[2023-10-09 06:00:18,082][60144] Updated weights for policy 1, policy_version 45442 (0.0008) +[2023-10-09 06:00:18,461][60144] Updated weights for policy 1, policy_version 45452 (0.0008) +[2023-10-09 06:00:18,825][60144] Updated weights for policy 1, policy_version 45462 (0.0010) +[2023-10-09 06:00:19,193][60144] Updated weights for policy 1, policy_version 45472 (0.0009) +[2023-10-09 06:00:20,755][60143] Updated weights for policy 0, policy_version 44932 (0.0007) +[2023-10-09 06:00:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92569600. Throughput: 0: 1701.0, 1: 1731.9. Samples: 23149216. 
Policy #0 lag: (min: 7.0, avg: 9.7, max: 39.0) +[2023-10-09 06:00:21,053][59242] Avg episode reward: [(0, '30.910'), (1, '31.440')] +[2023-10-09 06:00:21,128][60143] Updated weights for policy 0, policy_version 44942 (0.0010) +[2023-10-09 06:00:21,500][60143] Updated weights for policy 0, policy_version 44952 (0.0009) +[2023-10-09 06:00:23,122][60144] Updated weights for policy 1, policy_version 45482 (0.0007) +[2023-10-09 06:00:23,487][60144] Updated weights for policy 1, policy_version 45492 (0.0008) +[2023-10-09 06:00:23,852][60144] Updated weights for policy 1, policy_version 45502 (0.0010) +[2023-10-09 06:00:25,455][60143] Updated weights for policy 0, policy_version 44962 (0.0008) +[2023-10-09 06:00:25,837][60143] Updated weights for policy 0, policy_version 44972 (0.0010) +[2023-10-09 06:00:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92635136. Throughput: 0: 1702.9, 1: 1717.6. Samples: 23169788. Policy #0 lag: (min: 7.0, avg: 9.7, max: 39.0) +[2023-10-09 06:00:26,053][59242] Avg episode reward: [(0, '32.990'), (1, '31.170')] +[2023-10-09 06:00:26,208][60143] Updated weights for policy 0, policy_version 44982 (0.0009) +[2023-10-09 06:00:26,582][60143] Updated weights for policy 0, policy_version 44992 (0.0008) +[2023-10-09 06:00:27,810][60144] Updated weights for policy 1, policy_version 45512 (0.0011) +[2023-10-09 06:00:28,175][60144] Updated weights for policy 1, policy_version 45522 (0.0010) +[2023-10-09 06:00:28,536][60144] Updated weights for policy 1, policy_version 45532 (0.0011) +[2023-10-09 06:00:30,624][60143] Updated weights for policy 0, policy_version 45002 (0.0007) +[2023-10-09 06:00:31,002][60143] Updated weights for policy 0, policy_version 45012 (0.0008) +[2023-10-09 06:00:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 92700672. Throughput: 0: 1694.0, 1: 1729.1. Samples: 23190492. 
Policy #0 lag: (min: 7.0, avg: 9.7, max: 39.0) +[2023-10-09 06:00:31,053][59242] Avg episode reward: [(0, '32.190'), (1, '30.040')] +[2023-10-09 06:00:31,371][60143] Updated weights for policy 0, policy_version 45022 (0.0008) +[2023-10-09 06:00:32,582][60144] Updated weights for policy 1, policy_version 45542 (0.0008) +[2023-10-09 06:00:32,950][60144] Updated weights for policy 1, policy_version 45552 (0.0009) +[2023-10-09 06:00:33,314][60144] Updated weights for policy 1, policy_version 45562 (0.0009) +[2023-10-09 06:00:35,300][60143] Updated weights for policy 0, policy_version 45032 (0.0009) +[2023-10-09 06:00:35,662][60143] Updated weights for policy 0, policy_version 45042 (0.0010) +[2023-10-09 06:00:36,029][60143] Updated weights for policy 0, policy_version 45052 (0.0009) +[2023-10-09 06:00:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 92766208. Throughput: 0: 1698.7, 1: 1709.3. Samples: 23199978. Policy #0 lag: (min: 7.0, avg: 9.7, max: 39.0) +[2023-10-09 06:00:36,052][59242] Avg episode reward: [(0, '30.610'), (1, '30.400')] +[2023-10-09 06:00:37,380][60144] Updated weights for policy 1, policy_version 45572 (0.0008) +[2023-10-09 06:00:37,743][60144] Updated weights for policy 1, policy_version 45582 (0.0009) +[2023-10-09 06:00:38,116][60144] Updated weights for policy 1, policy_version 45592 (0.0009) +[2023-10-09 06:00:39,986][60143] Updated weights for policy 0, policy_version 45062 (0.0007) +[2023-10-09 06:00:40,363][60143] Updated weights for policy 0, policy_version 45072 (0.0009) +[2023-10-09 06:00:40,741][60143] Updated weights for policy 0, policy_version 45082 (0.0011) +[2023-10-09 06:00:41,052][59242] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 92864512. Throughput: 0: 1703.9, 1: 1715.2. Samples: 23221448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:00:41,052][59242] Avg episode reward: [(0, '29.720'), (1, '30.710')] +[2023-10-09 06:00:42,090][60144] Updated weights for policy 1, policy_version 45602 (0.0010) +[2023-10-09 06:00:42,457][60144] Updated weights for policy 1, policy_version 45612 (0.0008) +[2023-10-09 06:00:42,827][60144] Updated weights for policy 1, policy_version 45622 (0.0009) +[2023-10-09 06:00:43,191][60144] Updated weights for policy 1, policy_version 45632 (0.0009) +[2023-10-09 06:00:44,771][60143] Updated weights for policy 0, policy_version 45092 (0.0010) +[2023-10-09 06:00:45,137][60143] Updated weights for policy 0, policy_version 45102 (0.0010) +[2023-10-09 06:00:45,506][60143] Updated weights for policy 0, policy_version 45112 (0.0009) +[2023-10-09 06:00:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 92930048. Throughput: 0: 1676.1, 1: 1739.2. Samples: 23241444. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:00:46,053][59242] Avg episode reward: [(0, '28.950'), (1, '31.740')] +[2023-10-09 06:00:47,132][60144] Updated weights for policy 1, policy_version 45642 (0.0008) +[2023-10-09 06:00:47,501][60144] Updated weights for policy 1, policy_version 45652 (0.0007) +[2023-10-09 06:00:47,860][60144] Updated weights for policy 1, policy_version 45662 (0.0008) +[2023-10-09 06:00:49,552][60143] Updated weights for policy 0, policy_version 45122 (0.0009) +[2023-10-09 06:00:49,919][60143] Updated weights for policy 0, policy_version 45132 (0.0008) +[2023-10-09 06:00:50,286][60143] Updated weights for policy 0, policy_version 45142 (0.0007) +[2023-10-09 06:00:50,654][60143] Updated weights for policy 0, policy_version 45152 (0.0007) +[2023-10-09 06:00:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 92995584. Throughput: 0: 1700.9, 1: 1709.2. Samples: 23251662. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:00:51,052][59242] Avg episode reward: [(0, '28.520'), (1, '31.670')] +[2023-10-09 06:00:51,834][60144] Updated weights for policy 1, policy_version 45672 (0.0008) +[2023-10-09 06:00:52,220][60144] Updated weights for policy 1, policy_version 45682 (0.0007) +[2023-10-09 06:00:52,585][60144] Updated weights for policy 1, policy_version 45692 (0.0007) +[2023-10-09 06:00:54,409][60143] Updated weights for policy 0, policy_version 45162 (0.0009) +[2023-10-09 06:00:54,777][60143] Updated weights for policy 0, policy_version 45172 (0.0010) +[2023-10-09 06:00:55,157][60143] Updated weights for policy 0, policy_version 45182 (0.0008) +[2023-10-09 06:00:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 93061120. Throughput: 0: 1698.7, 1: 1724.4. Samples: 23272202. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:00:56,053][59242] Avg episode reward: [(0, '27.980'), (1, '32.450')] +[2023-10-09 06:00:56,442][60144] Updated weights for policy 1, policy_version 45702 (0.0007) +[2023-10-09 06:00:56,820][60144] Updated weights for policy 1, policy_version 45712 (0.0008) +[2023-10-09 06:00:57,186][60144] Updated weights for policy 1, policy_version 45722 (0.0011) +[2023-10-09 06:00:58,983][60143] Updated weights for policy 0, policy_version 45192 (0.0008) +[2023-10-09 06:00:59,344][60143] Updated weights for policy 0, policy_version 45202 (0.0007) +[2023-10-09 06:00:59,718][60143] Updated weights for policy 0, policy_version 45212 (0.0007) +[2023-10-09 06:01:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 93126656. Throughput: 0: 1687.3, 1: 1736.1. Samples: 23293000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:01:01,052][59242] Avg episode reward: [(0, '28.750'), (1, '34.920')] +[2023-10-09 06:01:01,164][60144] Updated weights for policy 1, policy_version 45732 (0.0010) +[2023-10-09 06:01:01,522][60144] Updated weights for policy 1, policy_version 45742 (0.0009) +[2023-10-09 06:01:01,892][60144] Updated weights for policy 1, policy_version 45752 (0.0010) +[2023-10-09 06:01:02,191][60003] Saving new best policy, reward=34.920! +[2023-10-09 06:01:03,688][60143] Updated weights for policy 0, policy_version 45222 (0.0008) +[2023-10-09 06:01:04,051][60143] Updated weights for policy 0, policy_version 45232 (0.0008) +[2023-10-09 06:01:04,425][60143] Updated weights for policy 0, policy_version 45242 (0.0008) +[2023-10-09 06:01:05,882][60144] Updated weights for policy 1, policy_version 45762 (0.0008) +[2023-10-09 06:01:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 93192192. Throughput: 0: 1717.6, 1: 1712.7. Samples: 23303580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:01:06,052][59242] Avg episode reward: [(0, '28.710'), (1, '34.650')] +[2023-10-09 06:01:06,258][60144] Updated weights for policy 1, policy_version 45772 (0.0007) +[2023-10-09 06:01:06,622][60144] Updated weights for policy 1, policy_version 45782 (0.0007) +[2023-10-09 06:01:06,989][60144] Updated weights for policy 1, policy_version 45792 (0.0008) +[2023-10-09 06:01:08,474][60143] Updated weights for policy 0, policy_version 45252 (0.0008) +[2023-10-09 06:01:08,835][60143] Updated weights for policy 0, policy_version 45262 (0.0010) +[2023-10-09 06:01:09,202][60143] Updated weights for policy 0, policy_version 45272 (0.0008) +[2023-10-09 06:01:10,975][60144] Updated weights for policy 1, policy_version 45802 (0.0007) +[2023-10-09 06:01:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 93257728. Throughput: 0: 1691.1, 1: 1726.7. 
Samples: 23323590. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:01:11,053][59242] Avg episode reward: [(0, '29.700'), (1, '35.010')] +[2023-10-09 06:01:11,334][60144] Updated weights for policy 1, policy_version 45812 (0.0008) +[2023-10-09 06:01:11,700][60144] Updated weights for policy 1, policy_version 45822 (0.0008) +[2023-10-09 06:01:11,773][60003] Saving new best policy, reward=35.010! +[2023-10-09 06:01:13,229][60143] Updated weights for policy 0, policy_version 45282 (0.0009) +[2023-10-09 06:01:13,598][60143] Updated weights for policy 0, policy_version 45292 (0.0008) +[2023-10-09 06:01:13,959][60143] Updated weights for policy 0, policy_version 45302 (0.0008) +[2023-10-09 06:01:14,328][60143] Updated weights for policy 0, policy_version 45312 (0.0009) +[2023-10-09 06:01:15,614][60144] Updated weights for policy 1, policy_version 45832 (0.0010) +[2023-10-09 06:01:15,982][60144] Updated weights for policy 1, policy_version 45842 (0.0010) +[2023-10-09 06:01:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 93323264. Throughput: 0: 1698.5, 1: 1727.0. Samples: 23344640. 
Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:16,052][59242] Avg episode reward: [(0, '29.470'), (1, '34.990')] +[2023-10-09 06:01:16,347][60144] Updated weights for policy 1, policy_version 45852 (0.0009) +[2023-10-09 06:01:18,368][60143] Updated weights for policy 0, policy_version 45322 (0.0008) +[2023-10-09 06:01:18,742][60143] Updated weights for policy 0, policy_version 45332 (0.0009) +[2023-10-09 06:01:19,107][60143] Updated weights for policy 0, policy_version 45342 (0.0010) +[2023-10-09 06:01:20,279][60144] Updated weights for policy 1, policy_version 45862 (0.0011) +[2023-10-09 06:01:20,637][60144] Updated weights for policy 1, policy_version 45872 (0.0009) +[2023-10-09 06:01:21,014][60144] Updated weights for policy 1, policy_version 45882 (0.0009) +[2023-10-09 06:01:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 93388800. Throughput: 0: 1711.9, 1: 1731.0. Samples: 23354908. Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:21,052][59242] Avg episode reward: [(0, '28.370'), (1, '33.940')] +[2023-10-09 06:01:23,246][60143] Updated weights for policy 0, policy_version 45352 (0.0008) +[2023-10-09 06:01:23,612][60143] Updated weights for policy 0, policy_version 45362 (0.0008) +[2023-10-09 06:01:23,982][60143] Updated weights for policy 0, policy_version 45372 (0.0010) +[2023-10-09 06:01:24,946][60144] Updated weights for policy 1, policy_version 45892 (0.0010) +[2023-10-09 06:01:25,306][60144] Updated weights for policy 1, policy_version 45902 (0.0011) +[2023-10-09 06:01:25,672][60144] Updated weights for policy 1, policy_version 45912 (0.0011) +[2023-10-09 06:01:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 93487104. Throughput: 0: 1689.0, 1: 1731.9. Samples: 23375388. 
Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:26,052][59242] Avg episode reward: [(0, '29.910'), (1, '33.670')] +[2023-10-09 06:01:27,933][60143] Updated weights for policy 0, policy_version 45382 (0.0009) +[2023-10-09 06:01:28,297][60143] Updated weights for policy 0, policy_version 45392 (0.0008) +[2023-10-09 06:01:28,664][60143] Updated weights for policy 0, policy_version 45402 (0.0007) +[2023-10-09 06:01:29,682][60144] Updated weights for policy 1, policy_version 45922 (0.0009) +[2023-10-09 06:01:30,051][60144] Updated weights for policy 1, policy_version 45932 (0.0009) +[2023-10-09 06:01:30,418][60144] Updated weights for policy 1, policy_version 45942 (0.0009) +[2023-10-09 06:01:30,780][60144] Updated weights for policy 1, policy_version 45952 (0.0008) +[2023-10-09 06:01:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 93552640. Throughput: 0: 1711.8, 1: 1706.0. Samples: 23395244. Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:31,053][59242] Avg episode reward: [(0, '27.530'), (1, '34.680')] +[2023-10-09 06:01:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000045408_46497792.pth... +[2023-10-09 06:01:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000045952_47054848.pth... 
+[2023-10-09 06:01:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000044320_45383680.pth +[2023-10-09 06:01:31,104][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000043808_44859392.pth +[2023-10-09 06:01:32,566][60143] Updated weights for policy 0, policy_version 45412 (0.0009) +[2023-10-09 06:01:32,940][60143] Updated weights for policy 0, policy_version 45422 (0.0007) +[2023-10-09 06:01:33,317][60143] Updated weights for policy 0, policy_version 45432 (0.0007) +[2023-10-09 06:01:34,643][60144] Updated weights for policy 1, policy_version 45962 (0.0008) +[2023-10-09 06:01:35,006][60144] Updated weights for policy 1, policy_version 45972 (0.0011) +[2023-10-09 06:01:35,374][60144] Updated weights for policy 1, policy_version 45982 (0.0009) +[2023-10-09 06:01:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 93618176. Throughput: 0: 1694.8, 1: 1732.6. Samples: 23405894. Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:36,052][59242] Avg episode reward: [(0, '29.040'), (1, '34.870')] +[2023-10-09 06:01:37,340][60143] Updated weights for policy 0, policy_version 45442 (0.0008) +[2023-10-09 06:01:37,712][60143] Updated weights for policy 0, policy_version 45452 (0.0009) +[2023-10-09 06:01:38,083][60143] Updated weights for policy 0, policy_version 45462 (0.0007) +[2023-10-09 06:01:38,455][60143] Updated weights for policy 0, policy_version 45472 (0.0009) +[2023-10-09 06:01:39,436][60144] Updated weights for policy 1, policy_version 45992 (0.0007) +[2023-10-09 06:01:39,814][60144] Updated weights for policy 1, policy_version 46002 (0.0007) +[2023-10-09 06:01:40,186][60144] Updated weights for policy 1, policy_version 46012 (0.0008) +[2023-10-09 06:01:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 93683712. Throughput: 0: 1701.2, 1: 1723.7. Samples: 23426324. 
Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:41,053][59242] Avg episode reward: [(0, '29.400'), (1, '33.360')] +[2023-10-09 06:01:42,539][60143] Updated weights for policy 0, policy_version 45482 (0.0007) +[2023-10-09 06:01:42,918][60143] Updated weights for policy 0, policy_version 45492 (0.0007) +[2023-10-09 06:01:43,289][60143] Updated weights for policy 0, policy_version 45502 (0.0007) +[2023-10-09 06:01:44,046][60144] Updated weights for policy 1, policy_version 46022 (0.0009) +[2023-10-09 06:01:44,420][60144] Updated weights for policy 1, policy_version 46032 (0.0008) +[2023-10-09 06:01:44,776][60144] Updated weights for policy 1, policy_version 46042 (0.0007) +[2023-10-09 06:01:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 93749248. Throughput: 0: 1715.0, 1: 1697.3. Samples: 23446554. Policy #0 lag: (min: 26.0, avg: 31.3, max: 58.0) +[2023-10-09 06:01:46,053][59242] Avg episode reward: [(0, '28.350'), (1, '32.080')] +[2023-10-09 06:01:47,283][60143] Updated weights for policy 0, policy_version 45512 (0.0010) +[2023-10-09 06:01:47,656][60143] Updated weights for policy 0, policy_version 45522 (0.0009) +[2023-10-09 06:01:48,023][60143] Updated weights for policy 0, policy_version 45532 (0.0007) +[2023-10-09 06:01:48,864][60144] Updated weights for policy 1, policy_version 46052 (0.0009) +[2023-10-09 06:01:49,230][60144] Updated weights for policy 1, policy_version 46062 (0.0008) +[2023-10-09 06:01:49,605][60144] Updated weights for policy 1, policy_version 46072 (0.0009) +[2023-10-09 06:01:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 93814784. Throughput: 0: 1680.7, 1: 1729.7. Samples: 23457048. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:01:51,053][59242] Avg episode reward: [(0, '27.590'), (1, '33.210')] +[2023-10-09 06:01:52,003][60143] Updated weights for policy 0, policy_version 45542 (0.0008) +[2023-10-09 06:01:52,370][60143] Updated weights for policy 0, policy_version 45552 (0.0011) +[2023-10-09 06:01:52,750][60143] Updated weights for policy 0, policy_version 45562 (0.0009) +[2023-10-09 06:01:53,632][60144] Updated weights for policy 1, policy_version 46082 (0.0008) +[2023-10-09 06:01:53,997][60144] Updated weights for policy 1, policy_version 46092 (0.0007) +[2023-10-09 06:01:54,374][60144] Updated weights for policy 1, policy_version 46102 (0.0009) +[2023-10-09 06:01:54,744][60144] Updated weights for policy 1, policy_version 46112 (0.0009) +[2023-10-09 06:01:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 93880320. Throughput: 0: 1712.2, 1: 1708.4. Samples: 23477516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:01:56,053][59242] Avg episode reward: [(0, '28.200'), (1, '33.590')] +[2023-10-09 06:01:56,693][60143] Updated weights for policy 0, policy_version 45572 (0.0009) +[2023-10-09 06:01:57,057][60143] Updated weights for policy 0, policy_version 45582 (0.0007) +[2023-10-09 06:01:57,417][60143] Updated weights for policy 0, policy_version 45592 (0.0008) +[2023-10-09 06:01:58,814][60144] Updated weights for policy 1, policy_version 46122 (0.0008) +[2023-10-09 06:01:59,172][60144] Updated weights for policy 1, policy_version 46132 (0.0007) +[2023-10-09 06:01:59,545][60144] Updated weights for policy 1, policy_version 46142 (0.0007) +[2023-10-09 06:02:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 93945856. Throughput: 0: 1714.1, 1: 1702.1. Samples: 23498370. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:02:01,053][59242] Avg episode reward: [(0, '27.890'), (1, '33.180')] +[2023-10-09 06:02:01,441][60143] Updated weights for policy 0, policy_version 45602 (0.0010) +[2023-10-09 06:02:01,811][60143] Updated weights for policy 0, policy_version 45612 (0.0009) +[2023-10-09 06:02:02,186][60143] Updated weights for policy 0, policy_version 45622 (0.0009) +[2023-10-09 06:02:02,558][60143] Updated weights for policy 0, policy_version 45632 (0.0009) +[2023-10-09 06:02:03,346][60144] Updated weights for policy 1, policy_version 46152 (0.0007) +[2023-10-09 06:02:03,723][60144] Updated weights for policy 1, policy_version 46162 (0.0008) +[2023-10-09 06:02:04,086][60144] Updated weights for policy 1, policy_version 46172 (0.0008) +[2023-10-09 06:02:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 94011392. Throughput: 0: 1695.5, 1: 1717.8. Samples: 23508506. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:02:06,053][59242] Avg episode reward: [(0, '27.940'), (1, '33.850')] +[2023-10-09 06:02:06,428][60143] Updated weights for policy 0, policy_version 45642 (0.0007) +[2023-10-09 06:02:06,802][60143] Updated weights for policy 0, policy_version 45652 (0.0009) +[2023-10-09 06:02:07,167][60143] Updated weights for policy 0, policy_version 45662 (0.0007) +[2023-10-09 06:02:08,160][60144] Updated weights for policy 1, policy_version 46182 (0.0010) +[2023-10-09 06:02:08,528][60144] Updated weights for policy 1, policy_version 46192 (0.0007) +[2023-10-09 06:02:08,895][60144] Updated weights for policy 1, policy_version 46202 (0.0009) +[2023-10-09 06:02:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 94076928. Throughput: 0: 1718.9, 1: 1697.0. Samples: 23529106. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:02:11,053][59242] Avg episode reward: [(0, '28.920'), (1, '32.040')] +[2023-10-09 06:02:11,224][60143] Updated weights for policy 0, policy_version 45672 (0.0008) +[2023-10-09 06:02:11,595][60143] Updated weights for policy 0, policy_version 45682 (0.0008) +[2023-10-09 06:02:11,967][60143] Updated weights for policy 0, policy_version 45692 (0.0009) +[2023-10-09 06:02:12,761][60144] Updated weights for policy 1, policy_version 46212 (0.0008) +[2023-10-09 06:02:13,136][60144] Updated weights for policy 1, policy_version 46222 (0.0007) +[2023-10-09 06:02:13,504][60144] Updated weights for policy 1, policy_version 46232 (0.0010) +[2023-10-09 06:02:16,016][60143] Updated weights for policy 0, policy_version 45702 (0.0008) +[2023-10-09 06:02:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 94142464. Throughput: 0: 1722.5, 1: 1730.0. Samples: 23550610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:02:16,053][59242] Avg episode reward: [(0, '28.750'), (1, '29.930')] +[2023-10-09 06:02:16,376][60143] Updated weights for policy 0, policy_version 45712 (0.0009) +[2023-10-09 06:02:16,755][60143] Updated weights for policy 0, policy_version 45722 (0.0009) +[2023-10-09 06:02:17,302][60144] Updated weights for policy 1, policy_version 46242 (0.0007) +[2023-10-09 06:02:17,667][60144] Updated weights for policy 1, policy_version 46252 (0.0009) +[2023-10-09 06:02:18,033][60144] Updated weights for policy 1, policy_version 46262 (0.0009) +[2023-10-09 06:02:18,398][60144] Updated weights for policy 1, policy_version 46272 (0.0009) +[2023-10-09 06:02:20,695][60143] Updated weights for policy 0, policy_version 45732 (0.0007) +[2023-10-09 06:02:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 94208000. Throughput: 0: 1718.0, 1: 1703.1. Samples: 23559844. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:02:21,053][59242] Avg episode reward: [(0, '30.080'), (1, '29.800')] +[2023-10-09 06:02:21,065][60143] Updated weights for policy 0, policy_version 45742 (0.0008) +[2023-10-09 06:02:21,437][60143] Updated weights for policy 0, policy_version 45752 (0.0010) +[2023-10-09 06:02:22,299][60144] Updated weights for policy 1, policy_version 46282 (0.0009) +[2023-10-09 06:02:22,677][60144] Updated weights for policy 1, policy_version 46292 (0.0009) +[2023-10-09 06:02:23,039][60144] Updated weights for policy 1, policy_version 46302 (0.0008) +[2023-10-09 06:02:25,363][60143] Updated weights for policy 0, policy_version 45762 (0.0010) +[2023-10-09 06:02:25,733][60143] Updated weights for policy 0, policy_version 45772 (0.0009) +[2023-10-09 06:02:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 94273536. Throughput: 0: 1720.3, 1: 1719.7. Samples: 23581120. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:26,053][59242] Avg episode reward: [(0, '28.610'), (1, '30.050')] +[2023-10-09 06:02:26,104][60143] Updated weights for policy 0, policy_version 45782 (0.0010) +[2023-10-09 06:02:26,472][60143] Updated weights for policy 0, policy_version 45792 (0.0008) +[2023-10-09 06:02:27,068][60144] Updated weights for policy 1, policy_version 46312 (0.0007) +[2023-10-09 06:02:27,456][60144] Updated weights for policy 1, policy_version 46322 (0.0008) +[2023-10-09 06:02:27,816][60144] Updated weights for policy 1, policy_version 46332 (0.0009) +[2023-10-09 06:02:30,286][60143] Updated weights for policy 0, policy_version 45802 (0.0009) +[2023-10-09 06:02:30,655][60143] Updated weights for policy 0, policy_version 45812 (0.0008) +[2023-10-09 06:02:31,033][60143] Updated weights for policy 0, policy_version 45822 (0.0012) +[2023-10-09 06:02:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 94339072. 
Throughput: 0: 1709.2, 1: 1739.2. Samples: 23601728. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:31,052][59242] Avg episode reward: [(0, '29.870'), (1, '28.200')] +[2023-10-09 06:02:31,803][60144] Updated weights for policy 1, policy_version 46342 (0.0008) +[2023-10-09 06:02:32,161][60144] Updated weights for policy 1, policy_version 46352 (0.0009) +[2023-10-09 06:02:32,527][60144] Updated weights for policy 1, policy_version 46362 (0.0010) +[2023-10-09 06:02:34,899][60143] Updated weights for policy 0, policy_version 45832 (0.0011) +[2023-10-09 06:02:35,266][60143] Updated weights for policy 0, policy_version 45842 (0.0010) +[2023-10-09 06:02:35,632][60143] Updated weights for policy 0, policy_version 45852 (0.0011) +[2023-10-09 06:02:36,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 94437376. Throughput: 0: 1732.5, 1: 1704.8. Samples: 23611728. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:36,053][59242] Avg episode reward: [(0, '29.230'), (1, '28.220')] +[2023-10-09 06:02:36,586][60144] Updated weights for policy 1, policy_version 46372 (0.0010) +[2023-10-09 06:02:36,955][60144] Updated weights for policy 1, policy_version 46382 (0.0007) +[2023-10-09 06:02:37,316][60144] Updated weights for policy 1, policy_version 46392 (0.0008) +[2023-10-09 06:02:39,675][60143] Updated weights for policy 0, policy_version 45862 (0.0008) +[2023-10-09 06:02:40,045][60143] Updated weights for policy 0, policy_version 45872 (0.0009) +[2023-10-09 06:02:40,416][60143] Updated weights for policy 0, policy_version 45882 (0.0008) +[2023-10-09 06:02:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 94502912. Throughput: 0: 1728.0, 1: 1728.5. Samples: 23633058. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:41,053][59242] Avg episode reward: [(0, '27.500'), (1, '28.490')] +[2023-10-09 06:02:41,239][60144] Updated weights for policy 1, policy_version 46402 (0.0009) +[2023-10-09 06:02:41,606][60144] Updated weights for policy 1, policy_version 46412 (0.0009) +[2023-10-09 06:02:41,974][60144] Updated weights for policy 1, policy_version 46422 (0.0008) +[2023-10-09 06:02:42,344][60144] Updated weights for policy 1, policy_version 46432 (0.0010) +[2023-10-09 06:02:44,390][60143] Updated weights for policy 0, policy_version 45892 (0.0010) +[2023-10-09 06:02:44,766][60143] Updated weights for policy 0, policy_version 45902 (0.0008) +[2023-10-09 06:02:45,131][60143] Updated weights for policy 0, policy_version 45912 (0.0009) +[2023-10-09 06:02:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 94568448. Throughput: 0: 1698.3, 1: 1734.9. Samples: 23652864. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:46,053][59242] Avg episode reward: [(0, '27.930'), (1, '29.010')] +[2023-10-09 06:02:46,309][60144] Updated weights for policy 1, policy_version 46442 (0.0008) +[2023-10-09 06:02:46,671][60144] Updated weights for policy 1, policy_version 46452 (0.0009) +[2023-10-09 06:02:47,035][60144] Updated weights for policy 1, policy_version 46462 (0.0007) +[2023-10-09 06:02:49,111][60143] Updated weights for policy 0, policy_version 45922 (0.0009) +[2023-10-09 06:02:49,482][60143] Updated weights for policy 0, policy_version 45932 (0.0008) +[2023-10-09 06:02:49,850][60143] Updated weights for policy 0, policy_version 45942 (0.0012) +[2023-10-09 06:02:50,219][60143] Updated weights for policy 0, policy_version 45952 (0.0010) +[2023-10-09 06:02:51,009][60144] Updated weights for policy 1, policy_version 46472 (0.0008) +[2023-10-09 06:02:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 94633984. 
Throughput: 0: 1728.8, 1: 1717.2. Samples: 23663574. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) +[2023-10-09 06:02:51,053][59242] Avg episode reward: [(0, '26.400'), (1, '30.510')] +[2023-10-09 06:02:51,370][60144] Updated weights for policy 1, policy_version 46482 (0.0008) +[2023-10-09 06:02:51,737][60144] Updated weights for policy 1, policy_version 46492 (0.0007) +[2023-10-09 06:02:54,262][60143] Updated weights for policy 0, policy_version 45962 (0.0008) +[2023-10-09 06:02:54,620][60143] Updated weights for policy 0, policy_version 45972 (0.0010) +[2023-10-09 06:02:54,985][60143] Updated weights for policy 0, policy_version 45982 (0.0010) +[2023-10-09 06:02:55,624][60144] Updated weights for policy 1, policy_version 46502 (0.0009) +[2023-10-09 06:02:55,989][60144] Updated weights for policy 1, policy_version 46512 (0.0007) +[2023-10-09 06:02:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 94699520. Throughput: 0: 1713.0, 1: 1734.7. Samples: 23684252. Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:02:56,053][59242] Avg episode reward: [(0, '27.300'), (1, '29.050')] +[2023-10-09 06:02:56,347][60144] Updated weights for policy 1, policy_version 46522 (0.0007) +[2023-10-09 06:02:58,977][60143] Updated weights for policy 0, policy_version 45992 (0.0009) +[2023-10-09 06:02:59,348][60143] Updated weights for policy 0, policy_version 46002 (0.0011) +[2023-10-09 06:02:59,721][60143] Updated weights for policy 0, policy_version 46012 (0.0007) +[2023-10-09 06:03:00,182][60144] Updated weights for policy 1, policy_version 46532 (0.0009) +[2023-10-09 06:03:00,546][60144] Updated weights for policy 1, policy_version 46542 (0.0010) +[2023-10-09 06:03:00,921][60144] Updated weights for policy 1, policy_version 46552 (0.0009) +[2023-10-09 06:03:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 94765056. Throughput: 0: 1695.5, 1: 1723.6. Samples: 23704470. 
Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:03:01,053][59242] Avg episode reward: [(0, '29.320'), (1, '30.100')] +[2023-10-09 06:03:03,767][60143] Updated weights for policy 0, policy_version 46022 (0.0008) +[2023-10-09 06:03:04,135][60143] Updated weights for policy 0, policy_version 46032 (0.0011) +[2023-10-09 06:03:04,502][60143] Updated weights for policy 0, policy_version 46042 (0.0011) +[2023-10-09 06:03:04,870][60144] Updated weights for policy 1, policy_version 46562 (0.0009) +[2023-10-09 06:03:05,234][60144] Updated weights for policy 1, policy_version 46572 (0.0009) +[2023-10-09 06:03:05,600][60144] Updated weights for policy 1, policy_version 46582 (0.0008) +[2023-10-09 06:03:05,968][60144] Updated weights for policy 1, policy_version 46592 (0.0008) +[2023-10-09 06:03:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 94863360. Throughput: 0: 1725.3, 1: 1737.1. Samples: 23715652. Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:03:06,053][59242] Avg episode reward: [(0, '28.450'), (1, '30.290')] +[2023-10-09 06:03:08,522][60143] Updated weights for policy 0, policy_version 46052 (0.0009) +[2023-10-09 06:03:08,893][60143] Updated weights for policy 0, policy_version 46062 (0.0008) +[2023-10-09 06:03:09,257][60143] Updated weights for policy 0, policy_version 46072 (0.0009) +[2023-10-09 06:03:09,875][60144] Updated weights for policy 1, policy_version 46602 (0.0007) +[2023-10-09 06:03:10,244][60144] Updated weights for policy 1, policy_version 46612 (0.0008) +[2023-10-09 06:03:10,622][60144] Updated weights for policy 1, policy_version 46622 (0.0007) +[2023-10-09 06:03:11,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 94928896. Throughput: 0: 1698.2, 1: 1733.9. Samples: 23735564. 
Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:03:11,052][59242] Avg episode reward: [(0, '29.420'), (1, '29.440')] +[2023-10-09 06:03:13,369][60143] Updated weights for policy 0, policy_version 46082 (0.0007) +[2023-10-09 06:03:13,736][60143] Updated weights for policy 0, policy_version 46092 (0.0009) +[2023-10-09 06:03:14,105][60143] Updated weights for policy 0, policy_version 46102 (0.0007) +[2023-10-09 06:03:14,474][60143] Updated weights for policy 0, policy_version 46112 (0.0010) +[2023-10-09 06:03:14,645][60144] Updated weights for policy 1, policy_version 46632 (0.0008) +[2023-10-09 06:03:15,021][60144] Updated weights for policy 1, policy_version 46642 (0.0007) +[2023-10-09 06:03:15,392][60144] Updated weights for policy 1, policy_version 46652 (0.0008) +[2023-10-09 06:03:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 94994432. Throughput: 0: 1705.3, 1: 1709.3. Samples: 23755386. Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:03:16,053][59242] Avg episode reward: [(0, '29.850'), (1, '30.030')] +[2023-10-09 06:03:18,354][60143] Updated weights for policy 0, policy_version 46122 (0.0007) +[2023-10-09 06:03:18,722][60143] Updated weights for policy 0, policy_version 46132 (0.0007) +[2023-10-09 06:03:19,079][60143] Updated weights for policy 0, policy_version 46142 (0.0008) +[2023-10-09 06:03:19,107][60144] Updated weights for policy 1, policy_version 46662 (0.0009) +[2023-10-09 06:03:19,471][60144] Updated weights for policy 1, policy_version 46672 (0.0010) +[2023-10-09 06:03:19,845][60144] Updated weights for policy 1, policy_version 46682 (0.0007) +[2023-10-09 06:03:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 95059968. Throughput: 0: 1701.2, 1: 1745.8. Samples: 23766844. 
Policy #0 lag: (min: 1.0, avg: 7.0, max: 33.0) +[2023-10-09 06:03:21,053][59242] Avg episode reward: [(0, '30.100'), (1, '29.320')] +[2023-10-09 06:03:23,008][60143] Updated weights for policy 0, policy_version 46152 (0.0009) +[2023-10-09 06:03:23,383][60143] Updated weights for policy 0, policy_version 46162 (0.0008) +[2023-10-09 06:03:23,758][60143] Updated weights for policy 0, policy_version 46172 (0.0008) +[2023-10-09 06:03:23,814][60144] Updated weights for policy 1, policy_version 46692 (0.0007) +[2023-10-09 06:03:24,187][60144] Updated weights for policy 1, policy_version 46702 (0.0008) +[2023-10-09 06:03:24,552][60144] Updated weights for policy 1, policy_version 46712 (0.0009) +[2023-10-09 06:03:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 95125504. Throughput: 0: 1688.8, 1: 1722.2. Samples: 23786556. Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:26,052][59242] Avg episode reward: [(0, '30.340'), (1, '29.700')] +[2023-10-09 06:03:27,633][60143] Updated weights for policy 0, policy_version 46182 (0.0008) +[2023-10-09 06:03:28,002][60143] Updated weights for policy 0, policy_version 46192 (0.0009) +[2023-10-09 06:03:28,374][60143] Updated weights for policy 0, policy_version 46202 (0.0008) +[2023-10-09 06:03:28,718][60144] Updated weights for policy 1, policy_version 46722 (0.0010) +[2023-10-09 06:03:29,087][60144] Updated weights for policy 1, policy_version 46732 (0.0008) +[2023-10-09 06:03:29,461][60144] Updated weights for policy 1, policy_version 46742 (0.0008) +[2023-10-09 06:03:29,836][60144] Updated weights for policy 1, policy_version 46752 (0.0008) +[2023-10-09 06:03:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 95191040. Throughput: 0: 1721.4, 1: 1710.3. Samples: 23807290. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:31,053][59242] Avg episode reward: [(0, '29.840'), (1, '29.230')] +[2023-10-09 06:03:31,066][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000046752_47874048.pth... +[2023-10-09 06:03:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000046208_47316992.pth... +[2023-10-09 06:03:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000044608_45678592.pth +[2023-10-09 06:03:31,105][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000045152_46235648.pth +[2023-10-09 06:03:32,204][60143] Updated weights for policy 0, policy_version 46212 (0.0010) +[2023-10-09 06:03:32,577][60143] Updated weights for policy 0, policy_version 46222 (0.0011) +[2023-10-09 06:03:32,950][60143] Updated weights for policy 0, policy_version 46232 (0.0009) +[2023-10-09 06:03:33,691][60144] Updated weights for policy 1, policy_version 46762 (0.0009) +[2023-10-09 06:03:34,064][60144] Updated weights for policy 1, policy_version 46772 (0.0009) +[2023-10-09 06:03:34,433][60144] Updated weights for policy 1, policy_version 46782 (0.0007) +[2023-10-09 06:03:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 95256576. Throughput: 0: 1694.6, 1: 1732.5. Samples: 23817794. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:36,052][59242] Avg episode reward: [(0, '30.090'), (1, '30.800')] +[2023-10-09 06:03:36,914][60143] Updated weights for policy 0, policy_version 46242 (0.0009) +[2023-10-09 06:03:37,280][60143] Updated weights for policy 0, policy_version 46252 (0.0010) +[2023-10-09 06:03:37,654][60143] Updated weights for policy 0, policy_version 46262 (0.0009) +[2023-10-09 06:03:38,023][60143] Updated weights for policy 0, policy_version 46272 (0.0007) +[2023-10-09 06:03:38,325][60144] Updated weights for policy 1, policy_version 46792 (0.0007) +[2023-10-09 06:03:38,696][60144] Updated weights for policy 1, policy_version 46802 (0.0007) +[2023-10-09 06:03:39,054][60144] Updated weights for policy 1, policy_version 46812 (0.0008) +[2023-10-09 06:03:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 95322112. Throughput: 0: 1709.6, 1: 1711.1. Samples: 23838186. Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:41,053][59242] Avg episode reward: [(0, '29.130'), (1, '30.770')] +[2023-10-09 06:03:42,202][60143] Updated weights for policy 0, policy_version 46282 (0.0007) +[2023-10-09 06:03:42,567][60143] Updated weights for policy 0, policy_version 46292 (0.0007) +[2023-10-09 06:03:42,943][60143] Updated weights for policy 0, policy_version 46302 (0.0008) +[2023-10-09 06:03:42,943][60144] Updated weights for policy 1, policy_version 46822 (0.0008) +[2023-10-09 06:03:43,317][60144] Updated weights for policy 1, policy_version 46832 (0.0008) +[2023-10-09 06:03:43,682][60144] Updated weights for policy 1, policy_version 46842 (0.0008) +[2023-10-09 06:03:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 95387648. Throughput: 0: 1723.1, 1: 1721.2. Samples: 23859462. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:46,053][59242] Avg episode reward: [(0, '29.800'), (1, '28.550')] +[2023-10-09 06:03:46,913][60143] Updated weights for policy 0, policy_version 46312 (0.0007) +[2023-10-09 06:03:47,285][60143] Updated weights for policy 0, policy_version 46322 (0.0007) +[2023-10-09 06:03:47,651][60143] Updated weights for policy 0, policy_version 46332 (0.0008) +[2023-10-09 06:03:47,726][60144] Updated weights for policy 1, policy_version 46852 (0.0010) +[2023-10-09 06:03:48,087][60144] Updated weights for policy 1, policy_version 46862 (0.0010) +[2023-10-09 06:03:48,459][60144] Updated weights for policy 1, policy_version 46872 (0.0012) +[2023-10-09 06:03:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 95453184. Throughput: 0: 1693.6, 1: 1713.7. Samples: 23868980. Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:51,053][59242] Avg episode reward: [(0, '30.260'), (1, '29.330')] +[2023-10-09 06:03:51,696][60143] Updated weights for policy 0, policy_version 46342 (0.0009) +[2023-10-09 06:03:52,063][60143] Updated weights for policy 0, policy_version 46352 (0.0008) +[2023-10-09 06:03:52,429][60143] Updated weights for policy 0, policy_version 46362 (0.0009) +[2023-10-09 06:03:52,455][60144] Updated weights for policy 1, policy_version 46882 (0.0010) +[2023-10-09 06:03:52,818][60144] Updated weights for policy 1, policy_version 46892 (0.0008) +[2023-10-09 06:03:53,187][60144] Updated weights for policy 1, policy_version 46902 (0.0007) +[2023-10-09 06:03:53,553][60144] Updated weights for policy 1, policy_version 46912 (0.0008) +[2023-10-09 06:03:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 95518720. Throughput: 0: 1719.8, 1: 1713.3. Samples: 23890054. 
Policy #0 lag: (min: 0.0, avg: 29.4, max: 32.0) +[2023-10-09 06:03:56,052][59242] Avg episode reward: [(0, '28.920'), (1, '29.880')] +[2023-10-09 06:03:56,342][60143] Updated weights for policy 0, policy_version 46372 (0.0008) +[2023-10-09 06:03:56,717][60143] Updated weights for policy 0, policy_version 46382 (0.0008) +[2023-10-09 06:03:57,087][60143] Updated weights for policy 0, policy_version 46392 (0.0008) +[2023-10-09 06:03:57,401][60144] Updated weights for policy 1, policy_version 46922 (0.0009) +[2023-10-09 06:03:57,772][60144] Updated weights for policy 1, policy_version 46932 (0.0010) +[2023-10-09 06:03:58,135][60144] Updated weights for policy 1, policy_version 46942 (0.0010) +[2023-10-09 06:04:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 95584256. Throughput: 0: 1726.0, 1: 1741.4. Samples: 23911420. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:01,053][59242] Avg episode reward: [(0, '29.980'), (1, '29.130')] +[2023-10-09 06:04:01,244][60143] Updated weights for policy 0, policy_version 46402 (0.0007) +[2023-10-09 06:04:01,614][60143] Updated weights for policy 0, policy_version 46412 (0.0007) +[2023-10-09 06:04:01,984][60143] Updated weights for policy 0, policy_version 46422 (0.0009) +[2023-10-09 06:04:02,180][60144] Updated weights for policy 1, policy_version 46952 (0.0009) +[2023-10-09 06:04:02,351][60143] Updated weights for policy 0, policy_version 46432 (0.0007) +[2023-10-09 06:04:02,562][60144] Updated weights for policy 1, policy_version 46962 (0.0008) +[2023-10-09 06:04:02,938][60144] Updated weights for policy 1, policy_version 46972 (0.0010) +[2023-10-09 06:04:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 95649792. Throughput: 0: 1709.5, 1: 1701.2. Samples: 23920326. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:06,053][59242] Avg episode reward: [(0, '31.200'), (1, '29.490')] +[2023-10-09 06:04:06,155][60143] Updated weights for policy 0, policy_version 46442 (0.0011) +[2023-10-09 06:04:06,526][60143] Updated weights for policy 0, policy_version 46452 (0.0010) +[2023-10-09 06:04:06,886][60144] Updated weights for policy 1, policy_version 46982 (0.0009) +[2023-10-09 06:04:06,887][60143] Updated weights for policy 0, policy_version 46462 (0.0007) +[2023-10-09 06:04:07,246][60144] Updated weights for policy 1, policy_version 46992 (0.0011) +[2023-10-09 06:04:07,614][60144] Updated weights for policy 1, policy_version 47002 (0.0009) +[2023-10-09 06:04:10,897][60143] Updated weights for policy 0, policy_version 46472 (0.0009) +[2023-10-09 06:04:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 95715328. Throughput: 0: 1723.6, 1: 1727.1. Samples: 23941834. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:11,053][59242] Avg episode reward: [(0, '32.220'), (1, '28.320')] +[2023-10-09 06:04:11,275][60143] Updated weights for policy 0, policy_version 46482 (0.0009) +[2023-10-09 06:04:11,503][60144] Updated weights for policy 1, policy_version 47012 (0.0009) +[2023-10-09 06:04:11,640][60143] Updated weights for policy 0, policy_version 46492 (0.0008) +[2023-10-09 06:04:11,878][60144] Updated weights for policy 1, policy_version 47022 (0.0009) +[2023-10-09 06:04:12,248][60144] Updated weights for policy 1, policy_version 47032 (0.0009) +[2023-10-09 06:04:15,490][60143] Updated weights for policy 0, policy_version 46502 (0.0009) +[2023-10-09 06:04:15,862][60143] Updated weights for policy 0, policy_version 46512 (0.0010) +[2023-10-09 06:04:15,903][60144] Updated weights for policy 1, policy_version 47042 (0.0008) +[2023-10-09 06:04:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 95780864. 
Throughput: 0: 1719.3, 1: 1742.1. Samples: 23963054. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:16,052][59242] Avg episode reward: [(0, '33.520'), (1, '28.060')] +[2023-10-09 06:04:16,230][60143] Updated weights for policy 0, policy_version 46522 (0.0008) +[2023-10-09 06:04:16,268][60144] Updated weights for policy 1, policy_version 47052 (0.0007) +[2023-10-09 06:04:16,643][60144] Updated weights for policy 1, policy_version 47062 (0.0007) +[2023-10-09 06:04:17,011][60144] Updated weights for policy 1, policy_version 47072 (0.0010) +[2023-10-09 06:04:20,268][60143] Updated weights for policy 0, policy_version 46532 (0.0008) +[2023-10-09 06:04:20,632][60143] Updated weights for policy 0, policy_version 46542 (0.0010) +[2023-10-09 06:04:20,987][60144] Updated weights for policy 1, policy_version 47082 (0.0010) +[2023-10-09 06:04:21,005][60143] Updated weights for policy 0, policy_version 46552 (0.0009) +[2023-10-09 06:04:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 95846400. Throughput: 0: 1724.0, 1: 1715.8. Samples: 23972588. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:21,052][59242] Avg episode reward: [(0, '31.970'), (1, '30.520')] +[2023-10-09 06:04:21,363][60144] Updated weights for policy 1, policy_version 47092 (0.0007) +[2023-10-09 06:04:21,722][60144] Updated weights for policy 1, policy_version 47102 (0.0007) +[2023-10-09 06:04:25,098][60143] Updated weights for policy 0, policy_version 46562 (0.0008) +[2023-10-09 06:04:25,470][60143] Updated weights for policy 0, policy_version 46572 (0.0009) +[2023-10-09 06:04:25,788][60144] Updated weights for policy 1, policy_version 47112 (0.0007) +[2023-10-09 06:04:25,830][60143] Updated weights for policy 0, policy_version 46582 (0.0009) +[2023-10-09 06:04:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 95911936. Throughput: 0: 1713.9, 1: 1735.0. Samples: 23993386. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:26,053][59242] Avg episode reward: [(0, '32.620'), (1, '29.240')] +[2023-10-09 06:04:26,158][60144] Updated weights for policy 1, policy_version 47122 (0.0007) +[2023-10-09 06:04:26,201][60143] Updated weights for policy 0, policy_version 46592 (0.0009) +[2023-10-09 06:04:26,524][60144] Updated weights for policy 1, policy_version 47132 (0.0008) +[2023-10-09 06:04:30,271][60143] Updated weights for policy 0, policy_version 46602 (0.0008) +[2023-10-09 06:04:30,645][60143] Updated weights for policy 0, policy_version 46612 (0.0007) +[2023-10-09 06:04:30,668][60144] Updated weights for policy 1, policy_version 47142 (0.0007) +[2023-10-09 06:04:31,013][60143] Updated weights for policy 0, policy_version 46622 (0.0008) +[2023-10-09 06:04:31,041][60144] Updated weights for policy 1, policy_version 47152 (0.0008) +[2023-10-09 06:04:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 95977472. Throughput: 0: 1698.3, 1: 1731.8. Samples: 24013820. Policy #0 lag: (min: 31.0, avg: 32.3, max: 55.0) +[2023-10-09 06:04:31,053][59242] Avg episode reward: [(0, '31.200'), (1, '30.280')] +[2023-10-09 06:04:31,406][60144] Updated weights for policy 1, policy_version 47162 (0.0010) +[2023-10-09 06:04:34,937][60143] Updated weights for policy 0, policy_version 46632 (0.0008) +[2023-10-09 06:04:35,309][60143] Updated weights for policy 0, policy_version 46642 (0.0008) +[2023-10-09 06:04:35,354][60144] Updated weights for policy 1, policy_version 47172 (0.0008) +[2023-10-09 06:04:35,676][60143] Updated weights for policy 0, policy_version 46652 (0.0008) +[2023-10-09 06:04:35,717][60144] Updated weights for policy 1, policy_version 47182 (0.0010) +[2023-10-09 06:04:36,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 96075776. Throughput: 0: 1715.6, 1: 1727.3. Samples: 24023912. 
Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:04:36,053][59242] Avg episode reward: [(0, '31.950'), (1, '29.020')] +[2023-10-09 06:04:36,076][60144] Updated weights for policy 1, policy_version 47192 (0.0008) +[2023-10-09 06:04:39,665][60143] Updated weights for policy 0, policy_version 46662 (0.0009) +[2023-10-09 06:04:39,915][60144] Updated weights for policy 1, policy_version 47202 (0.0008) +[2023-10-09 06:04:40,043][60143] Updated weights for policy 0, policy_version 46672 (0.0008) +[2023-10-09 06:04:40,285][60144] Updated weights for policy 1, policy_version 47212 (0.0008) +[2023-10-09 06:04:40,402][60143] Updated weights for policy 0, policy_version 46682 (0.0009) +[2023-10-09 06:04:40,656][60144] Updated weights for policy 1, policy_version 47222 (0.0007) +[2023-10-09 06:04:41,019][60144] Updated weights for policy 1, policy_version 47232 (0.0007) +[2023-10-09 06:04:41,052][59242] Fps is (10 sec: 19661.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 96174080. Throughput: 0: 1710.3, 1: 1736.0. Samples: 24045140. Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:04:41,053][59242] Avg episode reward: [(0, '31.230'), (1, '29.230')] +[2023-10-09 06:04:44,488][60143] Updated weights for policy 0, policy_version 46692 (0.0007) +[2023-10-09 06:04:44,815][60144] Updated weights for policy 1, policy_version 47242 (0.0009) +[2023-10-09 06:04:44,851][60143] Updated weights for policy 0, policy_version 46702 (0.0007) +[2023-10-09 06:04:45,178][60144] Updated weights for policy 1, policy_version 47252 (0.0008) +[2023-10-09 06:04:45,220][60143] Updated weights for policy 0, policy_version 46712 (0.0007) +[2023-10-09 06:04:45,538][60144] Updated weights for policy 1, policy_version 47262 (0.0007) +[2023-10-09 06:04:46,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96239616. Throughput: 0: 1677.9, 1: 1713.6. Samples: 24064040. 
Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:04:46,053][59242] Avg episode reward: [(0, '32.970'), (1, '28.920')] +[2023-10-09 06:04:49,270][60143] Updated weights for policy 0, policy_version 46722 (0.0008) +[2023-10-09 06:04:49,639][60143] Updated weights for policy 0, policy_version 46732 (0.0009) +[2023-10-09 06:04:49,697][60144] Updated weights for policy 1, policy_version 47272 (0.0008) +[2023-10-09 06:04:50,007][60143] Updated weights for policy 0, policy_version 46742 (0.0008) +[2023-10-09 06:04:50,080][60144] Updated weights for policy 1, policy_version 47282 (0.0009) +[2023-10-09 06:04:50,382][60143] Updated weights for policy 0, policy_version 46752 (0.0009) +[2023-10-09 06:04:50,441][60144] Updated weights for policy 1, policy_version 47292 (0.0009) +[2023-10-09 06:04:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96305152. Throughput: 0: 1703.1, 1: 1750.4. Samples: 24075732. Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:04:51,053][59242] Avg episode reward: [(0, '32.540'), (1, '28.960')] +[2023-10-09 06:04:54,294][60144] Updated weights for policy 1, policy_version 47302 (0.0009) +[2023-10-09 06:04:54,324][60143] Updated weights for policy 0, policy_version 46762 (0.0007) +[2023-10-09 06:04:54,655][60144] Updated weights for policy 1, policy_version 47312 (0.0009) +[2023-10-09 06:04:54,697][60143] Updated weights for policy 0, policy_version 46772 (0.0007) +[2023-10-09 06:04:55,022][60144] Updated weights for policy 1, policy_version 47322 (0.0007) +[2023-10-09 06:04:55,069][60143] Updated weights for policy 0, policy_version 46782 (0.0007) +[2023-10-09 06:04:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96370688. Throughput: 0: 1690.0, 1: 1729.2. Samples: 24095694. 
Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:04:56,052][59242] Avg episode reward: [(0, '32.550'), (1, '29.380')] +[2023-10-09 06:04:58,933][60144] Updated weights for policy 1, policy_version 47332 (0.0008) +[2023-10-09 06:04:59,245][60143] Updated weights for policy 0, policy_version 46792 (0.0007) +[2023-10-09 06:04:59,292][60144] Updated weights for policy 1, policy_version 47342 (0.0007) +[2023-10-09 06:04:59,616][60143] Updated weights for policy 0, policy_version 46802 (0.0008) +[2023-10-09 06:04:59,663][60144] Updated weights for policy 1, policy_version 47352 (0.0008) +[2023-10-09 06:04:59,978][60143] Updated weights for policy 0, policy_version 46812 (0.0008) +[2023-10-09 06:05:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96436224. Throughput: 0: 1671.6, 1: 1708.0. Samples: 24115136. Policy #0 lag: (min: 26.0, avg: 26.2, max: 33.0) +[2023-10-09 06:05:01,053][59242] Avg episode reward: [(0, '32.400'), (1, '28.300')] +[2023-10-09 06:05:03,609][60144] Updated weights for policy 1, policy_version 47362 (0.0008) +[2023-10-09 06:05:03,973][60144] Updated weights for policy 1, policy_version 47372 (0.0011) +[2023-10-09 06:05:04,159][60143] Updated weights for policy 0, policy_version 46822 (0.0009) +[2023-10-09 06:05:04,343][60144] Updated weights for policy 1, policy_version 47382 (0.0007) +[2023-10-09 06:05:04,524][60143] Updated weights for policy 0, policy_version 46832 (0.0010) +[2023-10-09 06:05:04,703][60144] Updated weights for policy 1, policy_version 47392 (0.0008) +[2023-10-09 06:05:04,897][60143] Updated weights for policy 0, policy_version 46842 (0.0009) +[2023-10-09 06:05:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 96501760. Throughput: 0: 1690.0, 1: 1736.1. Samples: 24126766. 
Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:06,053][59242] Avg episode reward: [(0, '33.300'), (1, '29.050')] +[2023-10-09 06:05:08,792][60144] Updated weights for policy 1, policy_version 47402 (0.0008) +[2023-10-09 06:05:08,974][60143] Updated weights for policy 0, policy_version 46852 (0.0009) +[2023-10-09 06:05:09,149][60144] Updated weights for policy 1, policy_version 47412 (0.0009) +[2023-10-09 06:05:09,354][60143] Updated weights for policy 0, policy_version 46862 (0.0008) +[2023-10-09 06:05:09,517][60144] Updated weights for policy 1, policy_version 47422 (0.0008) +[2023-10-09 06:05:09,723][60143] Updated weights for policy 0, policy_version 46872 (0.0007) +[2023-10-09 06:05:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 96567296. Throughput: 0: 1680.8, 1: 1711.2. Samples: 24146026. Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:11,053][59242] Avg episode reward: [(0, '32.410'), (1, '29.160')] +[2023-10-09 06:05:13,369][60144] Updated weights for policy 1, policy_version 47432 (0.0008) +[2023-10-09 06:05:13,701][60143] Updated weights for policy 0, policy_version 46882 (0.0007) +[2023-10-09 06:05:13,738][60144] Updated weights for policy 1, policy_version 47442 (0.0007) +[2023-10-09 06:05:14,060][60143] Updated weights for policy 0, policy_version 46892 (0.0009) +[2023-10-09 06:05:14,105][60144] Updated weights for policy 1, policy_version 47452 (0.0008) +[2023-10-09 06:05:14,435][60143] Updated weights for policy 0, policy_version 46902 (0.0008) +[2023-10-09 06:05:14,805][60143] Updated weights for policy 0, policy_version 46912 (0.0009) +[2023-10-09 06:05:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 96632832. Throughput: 0: 1685.0, 1: 1714.4. Samples: 24166794. 
Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:16,053][59242] Avg episode reward: [(0, '31.570'), (1, '28.030')] +[2023-10-09 06:05:17,983][60144] Updated weights for policy 1, policy_version 47462 (0.0008) +[2023-10-09 06:05:18,346][60144] Updated weights for policy 1, policy_version 47472 (0.0008) +[2023-10-09 06:05:18,718][60144] Updated weights for policy 1, policy_version 47482 (0.0008) +[2023-10-09 06:05:18,822][60143] Updated weights for policy 0, policy_version 46922 (0.0007) +[2023-10-09 06:05:19,185][60143] Updated weights for policy 0, policy_version 46932 (0.0008) +[2023-10-09 06:05:19,561][60143] Updated weights for policy 0, policy_version 46942 (0.0009) +[2023-10-09 06:05:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96698368. Throughput: 0: 1698.1, 1: 1724.2. Samples: 24177914. Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:21,052][59242] Avg episode reward: [(0, '33.010'), (1, '28.190')] +[2023-10-09 06:05:22,685][60144] Updated weights for policy 1, policy_version 47492 (0.0008) +[2023-10-09 06:05:23,060][60144] Updated weights for policy 1, policy_version 47502 (0.0008) +[2023-10-09 06:05:23,365][60143] Updated weights for policy 0, policy_version 46952 (0.0008) +[2023-10-09 06:05:23,433][60144] Updated weights for policy 1, policy_version 47512 (0.0009) +[2023-10-09 06:05:23,736][60143] Updated weights for policy 0, policy_version 46962 (0.0008) +[2023-10-09 06:05:24,095][60143] Updated weights for policy 0, policy_version 46972 (0.0009) +[2023-10-09 06:05:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96763904. Throughput: 0: 1680.2, 1: 1702.5. Samples: 24197362. 
Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:26,053][59242] Avg episode reward: [(0, '32.640'), (1, '29.750')] +[2023-10-09 06:05:27,426][60144] Updated weights for policy 1, policy_version 47522 (0.0009) +[2023-10-09 06:05:27,799][60144] Updated weights for policy 1, policy_version 47532 (0.0009) +[2023-10-09 06:05:27,937][60143] Updated weights for policy 0, policy_version 46982 (0.0009) +[2023-10-09 06:05:28,166][60144] Updated weights for policy 1, policy_version 47542 (0.0008) +[2023-10-09 06:05:28,313][60143] Updated weights for policy 0, policy_version 46992 (0.0009) +[2023-10-09 06:05:28,531][60144] Updated weights for policy 1, policy_version 47552 (0.0007) +[2023-10-09 06:05:28,680][60143] Updated weights for policy 0, policy_version 47002 (0.0008) +[2023-10-09 06:05:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 96829440. Throughput: 0: 1711.7, 1: 1724.6. Samples: 24218674. Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:31,053][59242] Avg episode reward: [(0, '34.010'), (1, '28.770')] +[2023-10-09 06:05:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000047552_48693248.pth... +[2023-10-09 06:05:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000047008_48136192.pth... 
+[2023-10-09 06:05:31,092][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000045952_47054848.pth +[2023-10-09 06:05:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000045408_46497792.pth +[2023-10-09 06:05:32,548][60144] Updated weights for policy 1, policy_version 47562 (0.0011) +[2023-10-09 06:05:32,724][60143] Updated weights for policy 0, policy_version 47012 (0.0008) +[2023-10-09 06:05:32,910][60144] Updated weights for policy 1, policy_version 47572 (0.0008) +[2023-10-09 06:05:33,091][60143] Updated weights for policy 0, policy_version 47022 (0.0008) +[2023-10-09 06:05:33,282][60144] Updated weights for policy 1, policy_version 47582 (0.0007) +[2023-10-09 06:05:33,456][60143] Updated weights for policy 0, policy_version 47032 (0.0008) +[2023-10-09 06:05:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 96894976. Throughput: 0: 1693.3, 1: 1699.0. Samples: 24228386. Policy #0 lag: (min: 24.0, avg: 51.3, max: 56.0) +[2023-10-09 06:05:36,053][59242] Avg episode reward: [(0, '31.680'), (1, '29.200')] +[2023-10-09 06:05:37,167][60144] Updated weights for policy 1, policy_version 47592 (0.0009) +[2023-10-09 06:05:37,486][60143] Updated weights for policy 0, policy_version 47042 (0.0007) +[2023-10-09 06:05:37,536][60144] Updated weights for policy 1, policy_version 47602 (0.0008) +[2023-10-09 06:05:37,850][60143] Updated weights for policy 0, policy_version 47052 (0.0009) +[2023-10-09 06:05:37,900][60144] Updated weights for policy 1, policy_version 47612 (0.0007) +[2023-10-09 06:05:38,232][60143] Updated weights for policy 0, policy_version 47062 (0.0009) +[2023-10-09 06:05:38,596][60143] Updated weights for policy 0, policy_version 47072 (0.0007) +[2023-10-09 06:05:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 96960512. Throughput: 0: 1693.9, 1: 1720.3. Samples: 24249334. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:05:41,053][59242] Avg episode reward: [(0, '30.840'), (1, '28.900')] +[2023-10-09 06:05:42,012][60144] Updated weights for policy 1, policy_version 47622 (0.0009) +[2023-10-09 06:05:42,411][60144] Updated weights for policy 1, policy_version 47632 (0.0007) +[2023-10-09 06:05:42,660][60143] Updated weights for policy 0, policy_version 47082 (0.0007) +[2023-10-09 06:05:42,778][60144] Updated weights for policy 1, policy_version 47642 (0.0007) +[2023-10-09 06:05:43,026][60143] Updated weights for policy 0, policy_version 47092 (0.0010) +[2023-10-09 06:05:43,402][60143] Updated weights for policy 0, policy_version 47102 (0.0008) +[2023-10-09 06:05:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 97026048. Throughput: 0: 1709.3, 1: 1739.7. Samples: 24270340. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:05:46,053][59242] Avg episode reward: [(0, '30.230'), (1, '28.630')] +[2023-10-09 06:05:46,373][60144] Updated weights for policy 1, policy_version 47652 (0.0007) +[2023-10-09 06:05:46,736][60144] Updated weights for policy 1, policy_version 47662 (0.0009) +[2023-10-09 06:05:47,113][60144] Updated weights for policy 1, policy_version 47672 (0.0008) +[2023-10-09 06:05:47,420][60143] Updated weights for policy 0, policy_version 47112 (0.0009) +[2023-10-09 06:05:47,792][60143] Updated weights for policy 0, policy_version 47122 (0.0008) +[2023-10-09 06:05:48,170][60143] Updated weights for policy 0, policy_version 47132 (0.0008) +[2023-10-09 06:05:50,792][60144] Updated weights for policy 1, policy_version 47682 (0.0007) +[2023-10-09 06:05:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 97091584. Throughput: 0: 1681.2, 1: 1715.2. Samples: 24279604. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:05:51,053][59242] Avg episode reward: [(0, '30.790'), (1, '28.610')] +[2023-10-09 06:05:51,161][60144] Updated weights for policy 1, policy_version 47692 (0.0008) +[2023-10-09 06:05:51,523][60144] Updated weights for policy 1, policy_version 47702 (0.0007) +[2023-10-09 06:05:51,883][60144] Updated weights for policy 1, policy_version 47712 (0.0009) +[2023-10-09 06:05:52,113][60143] Updated weights for policy 0, policy_version 47142 (0.0010) +[2023-10-09 06:05:52,481][60143] Updated weights for policy 0, policy_version 47152 (0.0010) +[2023-10-09 06:05:52,858][60143] Updated weights for policy 0, policy_version 47162 (0.0011) +[2023-10-09 06:05:55,825][60144] Updated weights for policy 1, policy_version 47722 (0.0008) +[2023-10-09 06:05:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 97157120. Throughput: 0: 1697.4, 1: 1748.5. Samples: 24301088. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:05:56,052][59242] Avg episode reward: [(0, '31.360'), (1, '30.520')] +[2023-10-09 06:05:56,186][60144] Updated weights for policy 1, policy_version 47732 (0.0010) +[2023-10-09 06:05:56,556][60144] Updated weights for policy 1, policy_version 47742 (0.0010) +[2023-10-09 06:05:56,758][60143] Updated weights for policy 0, policy_version 47172 (0.0008) +[2023-10-09 06:05:57,131][60143] Updated weights for policy 0, policy_version 47182 (0.0007) +[2023-10-09 06:05:57,493][60143] Updated weights for policy 0, policy_version 47192 (0.0009) +[2023-10-09 06:06:00,300][60144] Updated weights for policy 1, policy_version 47752 (0.0007) +[2023-10-09 06:06:00,663][60144] Updated weights for policy 1, policy_version 47762 (0.0011) +[2023-10-09 06:06:01,040][60144] Updated weights for policy 1, policy_version 47772 (0.0008) +[2023-10-09 06:06:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 97222656. 
Throughput: 0: 1710.8, 1: 1739.2. Samples: 24322044. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:06:01,052][59242] Avg episode reward: [(0, '30.400'), (1, '29.930')] +[2023-10-09 06:06:01,625][60143] Updated weights for policy 0, policy_version 47202 (0.0008) +[2023-10-09 06:06:01,993][60143] Updated weights for policy 0, policy_version 47212 (0.0011) +[2023-10-09 06:06:02,362][60143] Updated weights for policy 0, policy_version 47222 (0.0010) +[2023-10-09 06:06:02,735][60143] Updated weights for policy 0, policy_version 47232 (0.0008) +[2023-10-09 06:06:04,859][60144] Updated weights for policy 1, policy_version 47782 (0.0010) +[2023-10-09 06:06:05,221][60144] Updated weights for policy 1, policy_version 47792 (0.0008) +[2023-10-09 06:06:05,590][60144] Updated weights for policy 1, policy_version 47802 (0.0009) +[2023-10-09 06:06:06,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 97320960. Throughput: 0: 1677.9, 1: 1745.9. Samples: 24331990. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:06:06,053][59242] Avg episode reward: [(0, '29.250'), (1, '30.570')] +[2023-10-09 06:06:06,858][60143] Updated weights for policy 0, policy_version 47242 (0.0008) +[2023-10-09 06:06:07,234][60143] Updated weights for policy 0, policy_version 47252 (0.0007) +[2023-10-09 06:06:07,599][60143] Updated weights for policy 0, policy_version 47262 (0.0010) +[2023-10-09 06:06:09,751][60144] Updated weights for policy 1, policy_version 47812 (0.0008) +[2023-10-09 06:06:10,122][60144] Updated weights for policy 1, policy_version 47822 (0.0009) +[2023-10-09 06:06:10,493][60144] Updated weights for policy 1, policy_version 47832 (0.0009) +[2023-10-09 06:06:11,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 97386496. Throughput: 0: 1706.0, 1: 1754.9. Samples: 24353106. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-09 06:06:11,053][59242] Avg episode reward: [(0, '29.040'), (1, '30.620')] +[2023-10-09 06:06:11,387][60143] Updated weights for policy 0, policy_version 47272 (0.0009) +[2023-10-09 06:06:11,747][60143] Updated weights for policy 0, policy_version 47282 (0.0009) +[2023-10-09 06:06:12,119][60143] Updated weights for policy 0, policy_version 47292 (0.0011) +[2023-10-09 06:06:14,546][60144] Updated weights for policy 1, policy_version 47842 (0.0008) +[2023-10-09 06:06:14,908][60144] Updated weights for policy 1, policy_version 47852 (0.0007) +[2023-10-09 06:06:15,276][60144] Updated weights for policy 1, policy_version 47862 (0.0009) +[2023-10-09 06:06:15,640][60144] Updated weights for policy 1, policy_version 47872 (0.0012) +[2023-10-09 06:06:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 97452032. Throughput: 0: 1706.0, 1: 1726.7. Samples: 24373144. Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:16,052][59242] Avg episode reward: [(0, '29.690'), (1, '30.510')] +[2023-10-09 06:06:16,199][60143] Updated weights for policy 0, policy_version 47302 (0.0008) +[2023-10-09 06:06:16,570][60143] Updated weights for policy 0, policy_version 47312 (0.0009) +[2023-10-09 06:06:16,945][60143] Updated weights for policy 0, policy_version 47322 (0.0010) +[2023-10-09 06:06:19,519][60144] Updated weights for policy 1, policy_version 47882 (0.0008) +[2023-10-09 06:06:19,887][60144] Updated weights for policy 1, policy_version 47892 (0.0007) +[2023-10-09 06:06:20,254][60144] Updated weights for policy 1, policy_version 47902 (0.0008) +[2023-10-09 06:06:20,979][60143] Updated weights for policy 0, policy_version 47332 (0.0009) +[2023-10-09 06:06:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 97517568. Throughput: 0: 1698.8, 1: 1754.2. Samples: 24383768. 
Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:21,052][59242] Avg episode reward: [(0, '29.520'), (1, '30.530')] +[2023-10-09 06:06:21,339][60143] Updated weights for policy 0, policy_version 47342 (0.0011) +[2023-10-09 06:06:21,711][60143] Updated weights for policy 0, policy_version 47352 (0.0009) +[2023-10-09 06:06:24,219][60144] Updated weights for policy 1, policy_version 47912 (0.0009) +[2023-10-09 06:06:24,593][60144] Updated weights for policy 1, policy_version 47922 (0.0007) +[2023-10-09 06:06:24,953][60144] Updated weights for policy 1, policy_version 47932 (0.0011) +[2023-10-09 06:06:25,756][60143] Updated weights for policy 0, policy_version 47362 (0.0009) +[2023-10-09 06:06:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 97583104. Throughput: 0: 1705.4, 1: 1736.1. Samples: 24404204. Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:26,052][59242] Avg episode reward: [(0, '30.420'), (1, '30.490')] +[2023-10-09 06:06:26,120][60143] Updated weights for policy 0, policy_version 47372 (0.0009) +[2023-10-09 06:06:26,496][60143] Updated weights for policy 0, policy_version 47382 (0.0007) +[2023-10-09 06:06:26,862][60143] Updated weights for policy 0, policy_version 47392 (0.0008) +[2023-10-09 06:06:28,984][60144] Updated weights for policy 1, policy_version 47942 (0.0008) +[2023-10-09 06:06:29,358][60144] Updated weights for policy 1, policy_version 47952 (0.0008) +[2023-10-09 06:06:29,729][60144] Updated weights for policy 1, policy_version 47962 (0.0007) +[2023-10-09 06:06:30,914][60143] Updated weights for policy 0, policy_version 47402 (0.0009) +[2023-10-09 06:06:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 97648640. Throughput: 0: 1711.7, 1: 1719.0. Samples: 24424724. 
Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:31,053][59242] Avg episode reward: [(0, '29.210'), (1, '30.290')] +[2023-10-09 06:06:31,282][60143] Updated weights for policy 0, policy_version 47412 (0.0009) +[2023-10-09 06:06:31,656][60143] Updated weights for policy 0, policy_version 47422 (0.0009) +[2023-10-09 06:06:33,660][60144] Updated weights for policy 1, policy_version 47972 (0.0008) +[2023-10-09 06:06:34,020][60144] Updated weights for policy 1, policy_version 47982 (0.0009) +[2023-10-09 06:06:34,395][60144] Updated weights for policy 1, policy_version 47992 (0.0009) +[2023-10-09 06:06:35,625][60143] Updated weights for policy 0, policy_version 47432 (0.0008) +[2023-10-09 06:06:35,995][60143] Updated weights for policy 0, policy_version 47442 (0.0008) +[2023-10-09 06:06:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 97714176. Throughput: 0: 1712.1, 1: 1748.5. Samples: 24435332. Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:36,053][59242] Avg episode reward: [(0, '30.780'), (1, '30.150')] +[2023-10-09 06:06:36,362][60143] Updated weights for policy 0, policy_version 47452 (0.0008) +[2023-10-09 06:06:38,497][60144] Updated weights for policy 1, policy_version 48002 (0.0010) +[2023-10-09 06:06:38,862][60144] Updated weights for policy 1, policy_version 48012 (0.0008) +[2023-10-09 06:06:39,223][60144] Updated weights for policy 1, policy_version 48022 (0.0009) +[2023-10-09 06:06:39,593][60144] Updated weights for policy 1, policy_version 48032 (0.0009) +[2023-10-09 06:06:40,480][60143] Updated weights for policy 0, policy_version 47462 (0.0009) +[2023-10-09 06:06:40,852][60143] Updated weights for policy 0, policy_version 47472 (0.0009) +[2023-10-09 06:06:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 97779712. Throughput: 0: 1712.5, 1: 1714.7. Samples: 24455312. 
Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:41,053][59242] Avg episode reward: [(0, '31.460'), (1, '31.510')] +[2023-10-09 06:06:41,228][60143] Updated weights for policy 0, policy_version 47482 (0.0008) +[2023-10-09 06:06:43,553][60144] Updated weights for policy 1, policy_version 48042 (0.0008) +[2023-10-09 06:06:43,919][60144] Updated weights for policy 1, policy_version 48052 (0.0010) +[2023-10-09 06:06:44,287][60144] Updated weights for policy 1, policy_version 48062 (0.0008) +[2023-10-09 06:06:45,172][60143] Updated weights for policy 0, policy_version 47492 (0.0009) +[2023-10-09 06:06:45,548][60143] Updated weights for policy 0, policy_version 47502 (0.0007) +[2023-10-09 06:06:45,920][60143] Updated weights for policy 0, policy_version 47512 (0.0008) +[2023-10-09 06:06:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 97845248. Throughput: 0: 1698.1, 1: 1721.3. Samples: 24475920. Policy #0 lag: (min: 27.0, avg: 34.0, max: 59.0) +[2023-10-09 06:06:46,053][59242] Avg episode reward: [(0, '31.090'), (1, '31.650')] +[2023-10-09 06:06:48,258][60144] Updated weights for policy 1, policy_version 48072 (0.0011) +[2023-10-09 06:06:48,623][60144] Updated weights for policy 1, policy_version 48082 (0.0010) +[2023-10-09 06:06:48,993][60144] Updated weights for policy 1, policy_version 48092 (0.0008) +[2023-10-09 06:06:49,869][60143] Updated weights for policy 0, policy_version 47522 (0.0008) +[2023-10-09 06:06:50,236][60143] Updated weights for policy 0, policy_version 47532 (0.0008) +[2023-10-09 06:06:50,610][60143] Updated weights for policy 0, policy_version 47542 (0.0011) +[2023-10-09 06:06:50,983][60143] Updated weights for policy 0, policy_version 47552 (0.0008) +[2023-10-09 06:06:51,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 97943552. Throughput: 0: 1710.7, 1: 1717.3. Samples: 24486250. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:06:51,052][59242] Avg episode reward: [(0, '32.120'), (1, '31.060')] +[2023-10-09 06:06:52,979][60144] Updated weights for policy 1, policy_version 48102 (0.0008) +[2023-10-09 06:06:53,342][60144] Updated weights for policy 1, policy_version 48112 (0.0007) +[2023-10-09 06:06:53,701][60144] Updated weights for policy 1, policy_version 48122 (0.0007) +[2023-10-09 06:06:54,954][60143] Updated weights for policy 0, policy_version 47562 (0.0008) +[2023-10-09 06:06:55,328][60143] Updated weights for policy 0, policy_version 47572 (0.0008) +[2023-10-09 06:06:55,693][60143] Updated weights for policy 0, policy_version 47582 (0.0010) +[2023-10-09 06:06:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 98009088. Throughput: 0: 1710.8, 1: 1706.3. Samples: 24506878. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:06:56,053][59242] Avg episode reward: [(0, '30.770'), (1, '31.190')] +[2023-10-09 06:06:57,723][60144] Updated weights for policy 1, policy_version 48132 (0.0007) +[2023-10-09 06:06:58,094][60144] Updated weights for policy 1, policy_version 48142 (0.0007) +[2023-10-09 06:06:58,465][60144] Updated weights for policy 1, policy_version 48152 (0.0007) +[2023-10-09 06:06:59,748][60143] Updated weights for policy 0, policy_version 47592 (0.0007) +[2023-10-09 06:07:00,117][60143] Updated weights for policy 0, policy_version 47602 (0.0010) +[2023-10-09 06:07:00,481][60143] Updated weights for policy 0, policy_version 47612 (0.0010) +[2023-10-09 06:07:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 98074624. Throughput: 0: 1678.5, 1: 1737.0. Samples: 24526842. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:07:01,052][59242] Avg episode reward: [(0, '31.030'), (1, '31.200')] +[2023-10-09 06:07:02,291][60144] Updated weights for policy 1, policy_version 48162 (0.0008) +[2023-10-09 06:07:02,656][60144] Updated weights for policy 1, policy_version 48172 (0.0008) +[2023-10-09 06:07:03,029][60144] Updated weights for policy 1, policy_version 48182 (0.0007) +[2023-10-09 06:07:03,394][60144] Updated weights for policy 1, policy_version 48192 (0.0008) +[2023-10-09 06:07:04,527][60143] Updated weights for policy 0, policy_version 47622 (0.0007) +[2023-10-09 06:07:04,894][60143] Updated weights for policy 0, policy_version 47632 (0.0008) +[2023-10-09 06:07:05,257][60143] Updated weights for policy 0, policy_version 47642 (0.0008) +[2023-10-09 06:07:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98140160. Throughput: 0: 1704.7, 1: 1703.8. Samples: 24537150. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:07:06,053][59242] Avg episode reward: [(0, '31.260'), (1, '31.950')] +[2023-10-09 06:07:07,256][60144] Updated weights for policy 1, policy_version 48202 (0.0009) +[2023-10-09 06:07:07,628][60144] Updated weights for policy 1, policy_version 48212 (0.0009) +[2023-10-09 06:07:07,996][60144] Updated weights for policy 1, policy_version 48222 (0.0009) +[2023-10-09 06:07:09,151][60143] Updated weights for policy 0, policy_version 47652 (0.0009) +[2023-10-09 06:07:09,515][60143] Updated weights for policy 0, policy_version 47662 (0.0007) +[2023-10-09 06:07:09,891][60143] Updated weights for policy 0, policy_version 47672 (0.0007) +[2023-10-09 06:07:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 98205696. Throughput: 0: 1698.0, 1: 1723.5. Samples: 24558168. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:07:11,052][59242] Avg episode reward: [(0, '31.440'), (1, '31.820')] +[2023-10-09 06:07:11,911][60144] Updated weights for policy 1, policy_version 48232 (0.0009) +[2023-10-09 06:07:12,278][60144] Updated weights for policy 1, policy_version 48242 (0.0010) +[2023-10-09 06:07:12,657][60144] Updated weights for policy 1, policy_version 48252 (0.0008) +[2023-10-09 06:07:13,997][60143] Updated weights for policy 0, policy_version 47682 (0.0008) +[2023-10-09 06:07:14,372][60143] Updated weights for policy 0, policy_version 47692 (0.0009) +[2023-10-09 06:07:14,745][60143] Updated weights for policy 0, policy_version 47702 (0.0009) +[2023-10-09 06:07:15,113][60143] Updated weights for policy 0, policy_version 47712 (0.0008) +[2023-10-09 06:07:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98271232. Throughput: 0: 1675.3, 1: 1743.3. Samples: 24578560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:07:16,053][59242] Avg episode reward: [(0, '31.260'), (1, '32.080')] +[2023-10-09 06:07:16,581][60144] Updated weights for policy 1, policy_version 48262 (0.0010) +[2023-10-09 06:07:16,964][60144] Updated weights for policy 1, policy_version 48272 (0.0009) +[2023-10-09 06:07:17,331][60144] Updated weights for policy 1, policy_version 48282 (0.0009) +[2023-10-09 06:07:19,145][60143] Updated weights for policy 0, policy_version 47722 (0.0008) +[2023-10-09 06:07:19,510][60143] Updated weights for policy 0, policy_version 47732 (0.0008) +[2023-10-09 06:07:19,888][60143] Updated weights for policy 0, policy_version 47742 (0.0009) +[2023-10-09 06:07:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98336768. Throughput: 0: 1707.7, 1: 1708.0. Samples: 24589038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:21,052][59242] Avg episode reward: [(0, '31.650'), (1, '31.680')] +[2023-10-09 06:07:21,176][60144] Updated weights for policy 1, policy_version 48292 (0.0007) +[2023-10-09 06:07:21,547][60144] Updated weights for policy 1, policy_version 48302 (0.0008) +[2023-10-09 06:07:21,913][60144] Updated weights for policy 1, policy_version 48312 (0.0009) +[2023-10-09 06:07:23,960][60143] Updated weights for policy 0, policy_version 47752 (0.0009) +[2023-10-09 06:07:24,336][60143] Updated weights for policy 0, policy_version 47762 (0.0007) +[2023-10-09 06:07:24,709][60143] Updated weights for policy 0, policy_version 47772 (0.0008) +[2023-10-09 06:07:25,873][60144] Updated weights for policy 1, policy_version 48322 (0.0008) +[2023-10-09 06:07:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98402304. Throughput: 0: 1688.4, 1: 1739.4. Samples: 24609564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:26,053][59242] Avg episode reward: [(0, '32.220'), (1, '31.630')] +[2023-10-09 06:07:26,234][60144] Updated weights for policy 1, policy_version 48332 (0.0007) +[2023-10-09 06:07:26,587][60144] Updated weights for policy 1, policy_version 48342 (0.0009) +[2023-10-09 06:07:26,954][60144] Updated weights for policy 1, policy_version 48352 (0.0008) +[2023-10-09 06:07:28,533][60143] Updated weights for policy 0, policy_version 47782 (0.0008) +[2023-10-09 06:07:28,913][60143] Updated weights for policy 0, policy_version 47792 (0.0008) +[2023-10-09 06:07:29,285][60143] Updated weights for policy 0, policy_version 47802 (0.0008) +[2023-10-09 06:07:30,754][60144] Updated weights for policy 1, policy_version 48362 (0.0009) +[2023-10-09 06:07:31,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 98467840. Throughput: 0: 1692.3, 1: 1737.4. Samples: 24630256. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:31,053][59242] Avg episode reward: [(0, '30.710'), (1, '31.070')] +[2023-10-09 06:07:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth... +[2023-10-09 06:07:31,104][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000046208_47316992.pth +[2023-10-09 06:07:31,109][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000047808_48955392.pth +[2023-10-09 06:07:31,117][60144] Updated weights for policy 1, policy_version 48372 (0.0008) +[2023-10-09 06:07:31,487][60144] Updated weights for policy 1, policy_version 48382 (0.0009) +[2023-10-09 06:07:31,552][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000048384_49545216.pth... +[2023-10-09 06:07:31,585][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000046752_47874048.pth +[2023-10-09 06:07:31,591][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000048384_49545216.pth +[2023-10-09 06:07:33,205][60143] Updated weights for policy 0, policy_version 47812 (0.0009) +[2023-10-09 06:07:33,573][60143] Updated weights for policy 0, policy_version 47822 (0.0011) +[2023-10-09 06:07:33,952][60143] Updated weights for policy 0, policy_version 47832 (0.0011) +[2023-10-09 06:07:35,368][60144] Updated weights for policy 1, policy_version 48392 (0.0009) +[2023-10-09 06:07:35,730][60144] Updated weights for policy 1, policy_version 48402 (0.0008) +[2023-10-09 06:07:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 98533376. Throughput: 0: 1699.9, 1: 1728.6. Samples: 24640536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:36,053][59242] Avg episode reward: [(0, '32.530'), (1, '32.820')] +[2023-10-09 06:07:36,090][60144] Updated weights for policy 1, policy_version 48412 (0.0007) +[2023-10-09 06:07:37,938][60143] Updated weights for policy 0, policy_version 47842 (0.0011) +[2023-10-09 06:07:38,303][60143] Updated weights for policy 0, policy_version 47852 (0.0009) +[2023-10-09 06:07:38,686][60143] Updated weights for policy 0, policy_version 47862 (0.0010) +[2023-10-09 06:07:39,044][60143] Updated weights for policy 0, policy_version 47872 (0.0010) +[2023-10-09 06:07:40,060][60144] Updated weights for policy 1, policy_version 48422 (0.0010) +[2023-10-09 06:07:40,440][60144] Updated weights for policy 1, policy_version 48432 (0.0011) +[2023-10-09 06:07:40,809][60144] Updated weights for policy 1, policy_version 48442 (0.0010) +[2023-10-09 06:07:41,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 98631680. Throughput: 0: 1677.4, 1: 1752.0. Samples: 24661200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:41,053][59242] Avg episode reward: [(0, '32.700'), (1, '32.080')] +[2023-10-09 06:07:43,133][60143] Updated weights for policy 0, policy_version 47882 (0.0008) +[2023-10-09 06:07:43,506][60143] Updated weights for policy 0, policy_version 47892 (0.0008) +[2023-10-09 06:07:43,880][60143] Updated weights for policy 0, policy_version 47902 (0.0008) +[2023-10-09 06:07:44,616][60144] Updated weights for policy 1, policy_version 48452 (0.0009) +[2023-10-09 06:07:44,986][60144] Updated weights for policy 1, policy_version 48462 (0.0008) +[2023-10-09 06:07:45,360][60144] Updated weights for policy 1, policy_version 48472 (0.0008) +[2023-10-09 06:07:46,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 98697216. Throughput: 0: 1702.0, 1: 1727.7. Samples: 24681178. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:46,053][59242] Avg episode reward: [(0, '34.240'), (1, '31.740')] +[2023-10-09 06:07:47,880][60143] Updated weights for policy 0, policy_version 47912 (0.0009) +[2023-10-09 06:07:48,256][60143] Updated weights for policy 0, policy_version 47922 (0.0008) +[2023-10-09 06:07:48,632][60143] Updated weights for policy 0, policy_version 47932 (0.0008) +[2023-10-09 06:07:49,250][60144] Updated weights for policy 1, policy_version 48482 (0.0009) +[2023-10-09 06:07:49,617][60144] Updated weights for policy 1, policy_version 48492 (0.0010) +[2023-10-09 06:07:49,991][60144] Updated weights for policy 1, policy_version 48502 (0.0010) +[2023-10-09 06:07:50,360][60144] Updated weights for policy 1, policy_version 48512 (0.0010) +[2023-10-09 06:07:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98762752. Throughput: 0: 1684.1, 1: 1756.0. Samples: 24691952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:07:51,053][59242] Avg episode reward: [(0, '33.050'), (1, '32.630')] +[2023-10-09 06:07:52,686][60143] Updated weights for policy 0, policy_version 47942 (0.0007) +[2023-10-09 06:07:53,069][60143] Updated weights for policy 0, policy_version 47952 (0.0011) +[2023-10-09 06:07:53,435][60143] Updated weights for policy 0, policy_version 47962 (0.0011) +[2023-10-09 06:07:54,312][60144] Updated weights for policy 1, policy_version 48522 (0.0009) +[2023-10-09 06:07:54,676][60144] Updated weights for policy 1, policy_version 48532 (0.0007) +[2023-10-09 06:07:55,048][60144] Updated weights for policy 1, policy_version 48542 (0.0007) +[2023-10-09 06:07:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 98828288. Throughput: 0: 1687.0, 1: 1731.8. Samples: 24712016. 
Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:07:56,053][59242] Avg episode reward: [(0, '31.490'), (1, '33.020')] +[2023-10-09 06:07:57,480][60143] Updated weights for policy 0, policy_version 47972 (0.0011) +[2023-10-09 06:07:57,851][60143] Updated weights for policy 0, policy_version 47982 (0.0011) +[2023-10-09 06:07:58,221][60143] Updated weights for policy 0, policy_version 47992 (0.0011) +[2023-10-09 06:07:59,003][60144] Updated weights for policy 1, policy_version 48552 (0.0009) +[2023-10-09 06:07:59,378][60144] Updated weights for policy 1, policy_version 48562 (0.0009) +[2023-10-09 06:07:59,741][60144] Updated weights for policy 1, policy_version 48572 (0.0007) +[2023-10-09 06:08:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 98893824. Throughput: 0: 1705.2, 1: 1714.8. Samples: 24732462. Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:01,053][59242] Avg episode reward: [(0, '30.670'), (1, '33.190')] +[2023-10-09 06:08:02,367][60143] Updated weights for policy 0, policy_version 48002 (0.0010) +[2023-10-09 06:08:02,732][60143] Updated weights for policy 0, policy_version 48012 (0.0007) +[2023-10-09 06:08:03,108][60143] Updated weights for policy 0, policy_version 48022 (0.0008) +[2023-10-09 06:08:03,476][60143] Updated weights for policy 0, policy_version 48032 (0.0009) +[2023-10-09 06:08:03,686][60144] Updated weights for policy 1, policy_version 48582 (0.0008) +[2023-10-09 06:08:04,053][60144] Updated weights for policy 1, policy_version 48592 (0.0010) +[2023-10-09 06:08:04,408][60144] Updated weights for policy 1, policy_version 48602 (0.0007) +[2023-10-09 06:08:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 98959360. Throughput: 0: 1672.9, 1: 1746.8. Samples: 24742924. 
Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:06,053][59242] Avg episode reward: [(0, '30.900'), (1, '33.440')] +[2023-10-09 06:08:07,266][60143] Updated weights for policy 0, policy_version 48042 (0.0010) +[2023-10-09 06:08:07,620][60143] Updated weights for policy 0, policy_version 48052 (0.0010) +[2023-10-09 06:08:07,989][60143] Updated weights for policy 0, policy_version 48062 (0.0010) +[2023-10-09 06:08:08,347][60144] Updated weights for policy 1, policy_version 48612 (0.0008) +[2023-10-09 06:08:08,717][60144] Updated weights for policy 1, policy_version 48622 (0.0009) +[2023-10-09 06:08:09,075][60144] Updated weights for policy 1, policy_version 48632 (0.0009) +[2023-10-09 06:08:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 99024896. Throughput: 0: 1695.4, 1: 1716.1. Samples: 24763082. Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:11,053][59242] Avg episode reward: [(0, '30.120'), (1, '33.580')] +[2023-10-09 06:08:11,970][60143] Updated weights for policy 0, policy_version 48072 (0.0008) +[2023-10-09 06:08:12,350][60143] Updated weights for policy 0, policy_version 48082 (0.0009) +[2023-10-09 06:08:12,721][60143] Updated weights for policy 0, policy_version 48092 (0.0009) +[2023-10-09 06:08:12,991][60144] Updated weights for policy 1, policy_version 48642 (0.0008) +[2023-10-09 06:08:13,365][60144] Updated weights for policy 1, policy_version 48652 (0.0007) +[2023-10-09 06:08:13,732][60144] Updated weights for policy 1, policy_version 48662 (0.0009) +[2023-10-09 06:08:14,101][60144] Updated weights for policy 1, policy_version 48672 (0.0007) +[2023-10-09 06:08:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 99090432. Throughput: 0: 1703.3, 1: 1715.2. Samples: 24784088. 
Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:16,053][59242] Avg episode reward: [(0, '28.940'), (1, '33.310')] +[2023-10-09 06:08:16,684][60143] Updated weights for policy 0, policy_version 48102 (0.0009) +[2023-10-09 06:08:17,060][60143] Updated weights for policy 0, policy_version 48112 (0.0007) +[2023-10-09 06:08:17,426][60143] Updated weights for policy 0, policy_version 48122 (0.0007) +[2023-10-09 06:08:18,052][60144] Updated weights for policy 1, policy_version 48682 (0.0011) +[2023-10-09 06:08:18,419][60144] Updated weights for policy 1, policy_version 48692 (0.0010) +[2023-10-09 06:08:18,796][60144] Updated weights for policy 1, policy_version 48702 (0.0008) +[2023-10-09 06:08:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 99155968. Throughput: 0: 1689.8, 1: 1721.8. Samples: 24794060. Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:21,053][59242] Avg episode reward: [(0, '30.380'), (1, '32.430')] +[2023-10-09 06:08:21,455][60143] Updated weights for policy 0, policy_version 48132 (0.0009) +[2023-10-09 06:08:21,821][60143] Updated weights for policy 0, policy_version 48142 (0.0008) +[2023-10-09 06:08:22,189][60143] Updated weights for policy 0, policy_version 48152 (0.0008) +[2023-10-09 06:08:22,827][60144] Updated weights for policy 1, policy_version 48712 (0.0008) +[2023-10-09 06:08:23,192][60144] Updated weights for policy 1, policy_version 48722 (0.0009) +[2023-10-09 06:08:23,570][60144] Updated weights for policy 1, policy_version 48732 (0.0007) +[2023-10-09 06:08:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 99221504. Throughput: 0: 1708.2, 1: 1705.9. Samples: 24814834. 
Policy #0 lag: (min: 23.0, avg: 33.7, max: 55.0) +[2023-10-09 06:08:26,053][59242] Avg episode reward: [(0, '29.890'), (1, '32.890')] +[2023-10-09 06:08:26,489][60143] Updated weights for policy 0, policy_version 48162 (0.0008) +[2023-10-09 06:08:26,857][60143] Updated weights for policy 0, policy_version 48172 (0.0009) +[2023-10-09 06:08:27,227][60143] Updated weights for policy 0, policy_version 48182 (0.0008) +[2023-10-09 06:08:27,593][60143] Updated weights for policy 0, policy_version 48192 (0.0007) +[2023-10-09 06:08:27,623][60144] Updated weights for policy 1, policy_version 48742 (0.0008) +[2023-10-09 06:08:27,999][60144] Updated weights for policy 1, policy_version 48752 (0.0009) +[2023-10-09 06:08:28,364][60144] Updated weights for policy 1, policy_version 48762 (0.0007) +[2023-10-09 06:08:31,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 99287040. Throughput: 0: 1711.6, 1: 1729.0. Samples: 24836006. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:31,052][59242] Avg episode reward: [(0, '29.800'), (1, '32.940')] +[2023-10-09 06:08:31,544][60143] Updated weights for policy 0, policy_version 48202 (0.0008) +[2023-10-09 06:08:31,921][60143] Updated weights for policy 0, policy_version 48212 (0.0009) +[2023-10-09 06:08:32,176][60144] Updated weights for policy 1, policy_version 48772 (0.0008) +[2023-10-09 06:08:32,282][60143] Updated weights for policy 0, policy_version 48222 (0.0008) +[2023-10-09 06:08:32,542][60144] Updated weights for policy 1, policy_version 48782 (0.0008) +[2023-10-09 06:08:32,896][60144] Updated weights for policy 1, policy_version 48792 (0.0007) +[2023-10-09 06:08:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 99352576. Throughput: 0: 1705.3, 1: 1704.6. Samples: 24845396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:36,053][59242] Avg episode reward: [(0, '30.610'), (1, '31.460')] +[2023-10-09 06:08:36,297][60143] Updated weights for policy 0, policy_version 48232 (0.0008) +[2023-10-09 06:08:36,663][60143] Updated weights for policy 0, policy_version 48242 (0.0010) +[2023-10-09 06:08:36,873][60144] Updated weights for policy 1, policy_version 48802 (0.0007) +[2023-10-09 06:08:37,030][60143] Updated weights for policy 0, policy_version 48252 (0.0009) +[2023-10-09 06:08:37,233][60144] Updated weights for policy 1, policy_version 48812 (0.0010) +[2023-10-09 06:08:37,604][60144] Updated weights for policy 1, policy_version 48822 (0.0007) +[2023-10-09 06:08:37,967][60144] Updated weights for policy 1, policy_version 48832 (0.0007) +[2023-10-09 06:08:40,993][60143] Updated weights for policy 0, policy_version 48262 (0.0009) +[2023-10-09 06:08:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 99418112. Throughput: 0: 1714.4, 1: 1723.0. Samples: 24866698. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:41,053][59242] Avg episode reward: [(0, '30.030'), (1, '32.240')] +[2023-10-09 06:08:41,357][60143] Updated weights for policy 0, policy_version 48272 (0.0009) +[2023-10-09 06:08:41,722][60143] Updated weights for policy 0, policy_version 48282 (0.0008) +[2023-10-09 06:08:41,914][60144] Updated weights for policy 1, policy_version 48842 (0.0009) +[2023-10-09 06:08:42,284][60144] Updated weights for policy 1, policy_version 48852 (0.0009) +[2023-10-09 06:08:42,649][60144] Updated weights for policy 1, policy_version 48862 (0.0011) +[2023-10-09 06:08:45,572][60143] Updated weights for policy 0, policy_version 48292 (0.0007) +[2023-10-09 06:08:45,936][60143] Updated weights for policy 0, policy_version 48302 (0.0009) +[2023-10-09 06:08:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 99483648. 
Throughput: 0: 1714.8, 1: 1737.4. Samples: 24887812. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:46,053][59242] Avg episode reward: [(0, '29.620'), (1, '30.910')] +[2023-10-09 06:08:46,308][60143] Updated weights for policy 0, policy_version 48312 (0.0010) +[2023-10-09 06:08:46,588][60144] Updated weights for policy 1, policy_version 48872 (0.0008) +[2023-10-09 06:08:46,950][60144] Updated weights for policy 1, policy_version 48882 (0.0009) +[2023-10-09 06:08:47,321][60144] Updated weights for policy 1, policy_version 48892 (0.0007) +[2023-10-09 06:08:50,090][60143] Updated weights for policy 0, policy_version 48322 (0.0009) +[2023-10-09 06:08:50,456][60143] Updated weights for policy 0, policy_version 48332 (0.0010) +[2023-10-09 06:08:50,828][60143] Updated weights for policy 0, policy_version 48342 (0.0009) +[2023-10-09 06:08:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 99549184. Throughput: 0: 1723.3, 1: 1707.2. Samples: 24897296. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:51,052][59242] Avg episode reward: [(0, '30.390'), (1, '31.860')] +[2023-10-09 06:08:51,195][60143] Updated weights for policy 0, policy_version 48352 (0.0008) +[2023-10-09 06:08:51,498][60144] Updated weights for policy 1, policy_version 48902 (0.0010) +[2023-10-09 06:08:51,895][60144] Updated weights for policy 1, policy_version 48912 (0.0011) +[2023-10-09 06:08:52,262][60144] Updated weights for policy 1, policy_version 48922 (0.0010) +[2023-10-09 06:08:55,202][60143] Updated weights for policy 0, policy_version 48362 (0.0007) +[2023-10-09 06:08:55,561][60143] Updated weights for policy 0, policy_version 48372 (0.0008) +[2023-10-09 06:08:55,930][60143] Updated weights for policy 0, policy_version 48382 (0.0009) +[2023-10-09 06:08:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 99647488. Throughput: 0: 1722.6, 1: 1731.6. Samples: 24918522. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:08:56,053][59242] Avg episode reward: [(0, '29.770'), (1, '30.730')] +[2023-10-09 06:08:56,200][60144] Updated weights for policy 1, policy_version 48932 (0.0010) +[2023-10-09 06:08:56,558][60144] Updated weights for policy 1, policy_version 48942 (0.0007) +[2023-10-09 06:08:56,923][60144] Updated weights for policy 1, policy_version 48952 (0.0009) +[2023-10-09 06:08:59,941][60143] Updated weights for policy 0, policy_version 48392 (0.0008) +[2023-10-09 06:09:00,308][60143] Updated weights for policy 0, policy_version 48402 (0.0008) +[2023-10-09 06:09:00,683][60143] Updated weights for policy 0, policy_version 48412 (0.0008) +[2023-10-09 06:09:00,853][60144] Updated weights for policy 1, policy_version 48962 (0.0009) +[2023-10-09 06:09:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 99713024. Throughput: 0: 1707.7, 1: 1739.3. Samples: 24939204. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:01,053][59242] Avg episode reward: [(0, '31.870'), (1, '30.520')] +[2023-10-09 06:09:01,213][60144] Updated weights for policy 1, policy_version 48972 (0.0007) +[2023-10-09 06:09:01,579][60144] Updated weights for policy 1, policy_version 48982 (0.0008) +[2023-10-09 06:09:01,945][60144] Updated weights for policy 1, policy_version 48992 (0.0009) +[2023-10-09 06:09:04,508][60143] Updated weights for policy 0, policy_version 48422 (0.0008) +[2023-10-09 06:09:04,883][60143] Updated weights for policy 0, policy_version 48432 (0.0007) +[2023-10-09 06:09:05,254][60143] Updated weights for policy 0, policy_version 48442 (0.0008) +[2023-10-09 06:09:05,627][60144] Updated weights for policy 1, policy_version 49002 (0.0009) +[2023-10-09 06:09:06,007][60144] Updated weights for policy 1, policy_version 49012 (0.0008) +[2023-10-09 06:09:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 99778560. 
Throughput: 0: 1726.9, 1: 1728.3. Samples: 24949540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:06,053][59242] Avg episode reward: [(0, '31.980'), (1, '30.980')] +[2023-10-09 06:09:06,363][60144] Updated weights for policy 1, policy_version 49022 (0.0009) +[2023-10-09 06:09:09,194][60143] Updated weights for policy 0, policy_version 48452 (0.0008) +[2023-10-09 06:09:09,563][60143] Updated weights for policy 0, policy_version 48462 (0.0008) +[2023-10-09 06:09:09,931][60143] Updated weights for policy 0, policy_version 48472 (0.0007) +[2023-10-09 06:09:10,332][60144] Updated weights for policy 1, policy_version 49032 (0.0009) +[2023-10-09 06:09:10,699][60144] Updated weights for policy 1, policy_version 49042 (0.0007) +[2023-10-09 06:09:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 99844096. Throughput: 0: 1721.2, 1: 1740.4. Samples: 24970608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:11,053][59242] Avg episode reward: [(0, '31.680'), (1, '30.630')] +[2023-10-09 06:09:11,065][60144] Updated weights for policy 1, policy_version 49052 (0.0008) +[2023-10-09 06:09:13,894][60143] Updated weights for policy 0, policy_version 48482 (0.0009) +[2023-10-09 06:09:14,271][60143] Updated weights for policy 0, policy_version 48492 (0.0007) +[2023-10-09 06:09:14,638][60143] Updated weights for policy 0, policy_version 48502 (0.0009) +[2023-10-09 06:09:14,996][60144] Updated weights for policy 1, policy_version 49062 (0.0008) +[2023-10-09 06:09:15,013][60143] Updated weights for policy 0, policy_version 48512 (0.0007) +[2023-10-09 06:09:15,353][60144] Updated weights for policy 1, policy_version 49072 (0.0009) +[2023-10-09 06:09:15,727][60144] Updated weights for policy 1, policy_version 49082 (0.0009) +[2023-10-09 06:09:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 99942400. Throughput: 0: 1709.2, 1: 1724.0. Samples: 24990498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:16,053][59242] Avg episode reward: [(0, '31.320'), (1, '30.830')] +[2023-10-09 06:09:19,043][60143] Updated weights for policy 0, policy_version 48522 (0.0009) +[2023-10-09 06:09:19,421][60143] Updated weights for policy 0, policy_version 48532 (0.0010) +[2023-10-09 06:09:19,544][60144] Updated weights for policy 1, policy_version 49092 (0.0010) +[2023-10-09 06:09:19,783][60143] Updated weights for policy 0, policy_version 48542 (0.0009) +[2023-10-09 06:09:19,903][60144] Updated weights for policy 1, policy_version 49102 (0.0010) +[2023-10-09 06:09:20,275][60144] Updated weights for policy 1, policy_version 49112 (0.0007) +[2023-10-09 06:09:21,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 100007936. Throughput: 0: 1741.7, 1: 1745.5. Samples: 25002318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:21,052][59242] Avg episode reward: [(0, '30.810'), (1, '31.890')] +[2023-10-09 06:09:23,620][60143] Updated weights for policy 0, policy_version 48552 (0.0010) +[2023-10-09 06:09:23,991][60143] Updated weights for policy 0, policy_version 48562 (0.0011) +[2023-10-09 06:09:24,174][60144] Updated weights for policy 1, policy_version 49122 (0.0007) +[2023-10-09 06:09:24,362][60143] Updated weights for policy 0, policy_version 48572 (0.0007) +[2023-10-09 06:09:24,530][60144] Updated weights for policy 1, policy_version 49132 (0.0007) +[2023-10-09 06:09:24,905][60144] Updated weights for policy 1, policy_version 49142 (0.0007) +[2023-10-09 06:09:25,277][60144] Updated weights for policy 1, policy_version 49152 (0.0007) +[2023-10-09 06:09:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 100073472. Throughput: 0: 1705.9, 1: 1738.5. Samples: 25021692. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:26,053][59242] Avg episode reward: [(0, '30.220'), (1, '31.400')] +[2023-10-09 06:09:28,510][60143] Updated weights for policy 0, policy_version 48582 (0.0007) +[2023-10-09 06:09:28,881][60143] Updated weights for policy 0, policy_version 48592 (0.0011) +[2023-10-09 06:09:29,244][60143] Updated weights for policy 0, policy_version 48602 (0.0008) +[2023-10-09 06:09:29,284][60144] Updated weights for policy 1, policy_version 49162 (0.0010) +[2023-10-09 06:09:29,648][60144] Updated weights for policy 1, policy_version 49172 (0.0009) +[2023-10-09 06:09:30,027][60144] Updated weights for policy 1, policy_version 49182 (0.0011) +[2023-10-09 06:09:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 100139008. Throughput: 0: 1708.4, 1: 1718.5. Samples: 25042024. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:09:31,053][59242] Avg episode reward: [(0, '29.710'), (1, '30.670')] +[2023-10-09 06:09:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000049184_50364416.pth... +[2023-10-09 06:09:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000048608_49774592.pth... 
+[2023-10-09 06:09:31,105][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000047008_48136192.pth +[2023-10-09 06:09:31,107][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000047552_48693248.pth +[2023-10-09 06:09:33,249][60143] Updated weights for policy 0, policy_version 48612 (0.0007) +[2023-10-09 06:09:33,623][60143] Updated weights for policy 0, policy_version 48622 (0.0009) +[2023-10-09 06:09:33,940][60144] Updated weights for policy 1, policy_version 49192 (0.0009) +[2023-10-09 06:09:34,000][60143] Updated weights for policy 0, policy_version 48632 (0.0008) +[2023-10-09 06:09:34,300][60144] Updated weights for policy 1, policy_version 49202 (0.0008) +[2023-10-09 06:09:34,667][60144] Updated weights for policy 1, policy_version 49212 (0.0008) +[2023-10-09 06:09:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 100204544. Throughput: 0: 1719.2, 1: 1754.9. Samples: 25053634. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:09:36,053][59242] Avg episode reward: [(0, '30.970'), (1, '30.710')] +[2023-10-09 06:09:37,958][60143] Updated weights for policy 0, policy_version 48642 (0.0009) +[2023-10-09 06:09:38,338][60143] Updated weights for policy 0, policy_version 48652 (0.0007) +[2023-10-09 06:09:38,708][60144] Updated weights for policy 1, policy_version 49222 (0.0008) +[2023-10-09 06:09:38,709][60143] Updated weights for policy 0, policy_version 48662 (0.0008) +[2023-10-09 06:09:39,086][60144] Updated weights for policy 1, policy_version 49232 (0.0009) +[2023-10-09 06:09:39,087][60143] Updated weights for policy 0, policy_version 48672 (0.0009) +[2023-10-09 06:09:39,453][60144] Updated weights for policy 1, policy_version 49242 (0.0010) +[2023-10-09 06:09:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 100270080. Throughput: 0: 1695.3, 1: 1726.7. Samples: 25072512. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:09:41,053][59242] Avg episode reward: [(0, '31.500'), (1, '30.590')] +[2023-10-09 06:09:43,216][60144] Updated weights for policy 1, policy_version 49252 (0.0009) +[2023-10-09 06:09:43,321][60143] Updated weights for policy 0, policy_version 48682 (0.0008) +[2023-10-09 06:09:43,579][60144] Updated weights for policy 1, policy_version 49262 (0.0009) +[2023-10-09 06:09:43,679][60143] Updated weights for policy 0, policy_version 48692 (0.0008) +[2023-10-09 06:09:43,945][60144] Updated weights for policy 1, policy_version 49272 (0.0009) +[2023-10-09 06:09:44,055][60143] Updated weights for policy 0, policy_version 48702 (0.0009) +[2023-10-09 06:09:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 100335616. Throughput: 0: 1704.6, 1: 1725.1. Samples: 25093540. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:09:46,052][59242] Avg episode reward: [(0, '30.210'), (1, '30.440')] +[2023-10-09 06:09:48,048][60144] Updated weights for policy 1, policy_version 49282 (0.0009) +[2023-10-09 06:09:48,083][60143] Updated weights for policy 0, policy_version 48712 (0.0007) +[2023-10-09 06:09:48,408][60144] Updated weights for policy 1, policy_version 49292 (0.0009) +[2023-10-09 06:09:48,448][60143] Updated weights for policy 0, policy_version 48722 (0.0007) +[2023-10-09 06:09:48,780][60144] Updated weights for policy 1, policy_version 49302 (0.0008) +[2023-10-09 06:09:48,810][60143] Updated weights for policy 0, policy_version 48732 (0.0008) +[2023-10-09 06:09:49,150][60144] Updated weights for policy 1, policy_version 49312 (0.0009) +[2023-10-09 06:09:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 100401152. Throughput: 0: 1696.0, 1: 1734.8. Samples: 25103924. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:09:51,052][59242] Avg episode reward: [(0, '29.910'), (1, '32.020')] +[2023-10-09 06:09:52,732][60143] Updated weights for policy 0, policy_version 48742 (0.0008) +[2023-10-09 06:09:53,110][60143] Updated weights for policy 0, policy_version 48752 (0.0008) +[2023-10-09 06:09:53,206][60144] Updated weights for policy 1, policy_version 49322 (0.0009) +[2023-10-09 06:09:53,483][60143] Updated weights for policy 0, policy_version 48762 (0.0009) +[2023-10-09 06:09:53,582][60144] Updated weights for policy 1, policy_version 49332 (0.0008) +[2023-10-09 06:09:53,940][60144] Updated weights for policy 1, policy_version 49342 (0.0010) +[2023-10-09 06:09:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 100466688. Throughput: 0: 1691.8, 1: 1714.1. Samples: 25123874. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:09:56,053][59242] Avg episode reward: [(0, '30.470'), (1, '30.580')] +[2023-10-09 06:09:57,493][60143] Updated weights for policy 0, policy_version 48772 (0.0008) +[2023-10-09 06:09:57,858][60143] Updated weights for policy 0, policy_version 48782 (0.0007) +[2023-10-09 06:09:57,891][60144] Updated weights for policy 1, policy_version 49352 (0.0010) +[2023-10-09 06:09:58,227][60143] Updated weights for policy 0, policy_version 48792 (0.0007) +[2023-10-09 06:09:58,249][60144] Updated weights for policy 1, policy_version 49362 (0.0010) +[2023-10-09 06:09:58,622][60144] Updated weights for policy 1, policy_version 49372 (0.0008) +[2023-10-09 06:10:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 100532224. Throughput: 0: 1702.9, 1: 1728.3. Samples: 25144900. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:10:01,053][59242] Avg episode reward: [(0, '29.970'), (1, '31.590')] +[2023-10-09 06:10:02,236][60143] Updated weights for policy 0, policy_version 48802 (0.0007) +[2023-10-09 06:10:02,617][60143] Updated weights for policy 0, policy_version 48812 (0.0008) +[2023-10-09 06:10:02,659][60144] Updated weights for policy 1, policy_version 49382 (0.0010) +[2023-10-09 06:10:02,985][60143] Updated weights for policy 0, policy_version 48822 (0.0007) +[2023-10-09 06:10:03,025][60144] Updated weights for policy 1, policy_version 49392 (0.0007) +[2023-10-09 06:10:03,368][60143] Updated weights for policy 0, policy_version 48832 (0.0007) +[2023-10-09 06:10:03,392][60144] Updated weights for policy 1, policy_version 49402 (0.0007) +[2023-10-09 06:10:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 100597760. Throughput: 0: 1669.7, 1: 1706.8. Samples: 25154260. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-09 06:10:06,053][59242] Avg episode reward: [(0, '30.700'), (1, '31.550')] +[2023-10-09 06:10:07,409][60143] Updated weights for policy 0, policy_version 48842 (0.0007) +[2023-10-09 06:10:07,414][60144] Updated weights for policy 1, policy_version 49412 (0.0010) +[2023-10-09 06:10:07,777][60143] Updated weights for policy 0, policy_version 48852 (0.0009) +[2023-10-09 06:10:07,785][60144] Updated weights for policy 1, policy_version 49422 (0.0008) +[2023-10-09 06:10:08,144][60143] Updated weights for policy 0, policy_version 48862 (0.0009) +[2023-10-09 06:10:08,159][60144] Updated weights for policy 1, policy_version 49432 (0.0009) +[2023-10-09 06:10:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 100663296. Throughput: 0: 1705.1, 1: 1710.9. Samples: 25175412. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:11,053][59242] Avg episode reward: [(0, '31.010'), (1, '31.600')] +[2023-10-09 06:10:11,951][60143] Updated weights for policy 0, policy_version 48872 (0.0010) +[2023-10-09 06:10:11,987][60144] Updated weights for policy 1, policy_version 49442 (0.0010) +[2023-10-09 06:10:12,321][60143] Updated weights for policy 0, policy_version 48882 (0.0007) +[2023-10-09 06:10:12,342][60144] Updated weights for policy 1, policy_version 49452 (0.0007) +[2023-10-09 06:10:12,690][60143] Updated weights for policy 0, policy_version 48892 (0.0009) +[2023-10-09 06:10:12,708][60144] Updated weights for policy 1, policy_version 49462 (0.0007) +[2023-10-09 06:10:13,074][60144] Updated weights for policy 1, policy_version 49472 (0.0007) +[2023-10-09 06:10:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 100728832. Throughput: 0: 1701.7, 1: 1727.9. Samples: 25196358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:16,053][59242] Avg episode reward: [(0, '30.670'), (1, '29.660')] +[2023-10-09 06:10:16,880][60143] Updated weights for policy 0, policy_version 48902 (0.0009) +[2023-10-09 06:10:17,172][60144] Updated weights for policy 1, policy_version 49482 (0.0008) +[2023-10-09 06:10:17,240][60143] Updated weights for policy 0, policy_version 48912 (0.0007) +[2023-10-09 06:10:17,539][60144] Updated weights for policy 1, policy_version 49492 (0.0008) +[2023-10-09 06:10:17,610][60143] Updated weights for policy 0, policy_version 48922 (0.0008) +[2023-10-09 06:10:17,901][60144] Updated weights for policy 1, policy_version 49502 (0.0007) +[2023-10-09 06:10:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 100794368. Throughput: 0: 1684.7, 1: 1692.2. Samples: 25205594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:21,053][59242] Avg episode reward: [(0, '31.070'), (1, '31.940')] +[2023-10-09 06:10:21,646][60143] Updated weights for policy 0, policy_version 48932 (0.0008) +[2023-10-09 06:10:21,886][60144] Updated weights for policy 1, policy_version 49512 (0.0008) +[2023-10-09 06:10:22,020][60143] Updated weights for policy 0, policy_version 48942 (0.0009) +[2023-10-09 06:10:22,245][60144] Updated weights for policy 1, policy_version 49522 (0.0010) +[2023-10-09 06:10:22,388][60143] Updated weights for policy 0, policy_version 48952 (0.0008) +[2023-10-09 06:10:22,614][60144] Updated weights for policy 1, policy_version 49532 (0.0008) +[2023-10-09 06:10:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 100859904. Throughput: 0: 1708.7, 1: 1723.0. Samples: 25226936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:26,052][59242] Avg episode reward: [(0, '30.590'), (1, '31.850')] +[2023-10-09 06:10:26,318][60143] Updated weights for policy 0, policy_version 48962 (0.0008) +[2023-10-09 06:10:26,643][60144] Updated weights for policy 1, policy_version 49542 (0.0007) +[2023-10-09 06:10:26,681][60143] Updated weights for policy 0, policy_version 48972 (0.0007) +[2023-10-09 06:10:27,030][60144] Updated weights for policy 1, policy_version 49552 (0.0008) +[2023-10-09 06:10:27,055][60143] Updated weights for policy 0, policy_version 48982 (0.0007) +[2023-10-09 06:10:27,394][60144] Updated weights for policy 1, policy_version 49562 (0.0008) +[2023-10-09 06:10:27,423][60143] Updated weights for policy 0, policy_version 48992 (0.0008) +[2023-10-09 06:10:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 100925440. Throughput: 0: 1716.8, 1: 1714.7. Samples: 25247956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:31,052][59242] Avg episode reward: [(0, '30.880'), (1, '30.960')] +[2023-10-09 06:10:31,255][60144] Updated weights for policy 1, policy_version 49572 (0.0007) +[2023-10-09 06:10:31,398][60143] Updated weights for policy 0, policy_version 49002 (0.0007) +[2023-10-09 06:10:31,631][60144] Updated weights for policy 1, policy_version 49582 (0.0007) +[2023-10-09 06:10:31,768][60143] Updated weights for policy 0, policy_version 49012 (0.0008) +[2023-10-09 06:10:31,995][60144] Updated weights for policy 1, policy_version 49592 (0.0008) +[2023-10-09 06:10:32,136][60143] Updated weights for policy 0, policy_version 49022 (0.0009) +[2023-10-09 06:10:36,009][60144] Updated weights for policy 1, policy_version 49602 (0.0007) +[2023-10-09 06:10:36,012][60143] Updated weights for policy 0, policy_version 49032 (0.0009) +[2023-10-09 06:10:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 100990976. Throughput: 0: 1701.7, 1: 1704.0. Samples: 25257180. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:36,053][59242] Avg episode reward: [(0, '30.680'), (1, '31.450')] +[2023-10-09 06:10:36,375][60144] Updated weights for policy 1, policy_version 49612 (0.0007) +[2023-10-09 06:10:36,387][60143] Updated weights for policy 0, policy_version 49042 (0.0008) +[2023-10-09 06:10:36,747][60144] Updated weights for policy 1, policy_version 49622 (0.0008) +[2023-10-09 06:10:36,756][60143] Updated weights for policy 0, policy_version 49052 (0.0008) +[2023-10-09 06:10:37,107][60144] Updated weights for policy 1, policy_version 49632 (0.0007) +[2023-10-09 06:10:40,676][60143] Updated weights for policy 0, policy_version 49062 (0.0009) +[2023-10-09 06:10:41,040][60143] Updated weights for policy 0, policy_version 49072 (0.0009) +[2023-10-09 06:10:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 101056512. 
Throughput: 0: 1716.9, 1: 1718.1. Samples: 25278450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:10:41,052][59242] Avg episode reward: [(0, '30.600'), (1, '30.770')] +[2023-10-09 06:10:41,073][60144] Updated weights for policy 1, policy_version 49642 (0.0009) +[2023-10-09 06:10:41,406][60143] Updated weights for policy 0, policy_version 49082 (0.0008) +[2023-10-09 06:10:41,435][60144] Updated weights for policy 1, policy_version 49652 (0.0007) +[2023-10-09 06:10:41,798][60144] Updated weights for policy 1, policy_version 49662 (0.0010) +[2023-10-09 06:10:45,343][60143] Updated weights for policy 0, policy_version 49092 (0.0009) +[2023-10-09 06:10:45,718][60143] Updated weights for policy 0, policy_version 49102 (0.0009) +[2023-10-09 06:10:45,905][60144] Updated weights for policy 1, policy_version 49672 (0.0009) +[2023-10-09 06:10:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 101122048. Throughput: 0: 1708.8, 1: 1715.9. Samples: 25299012. 
Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:10:46,053][59242] Avg episode reward: [(0, '31.440'), (1, '30.920')] +[2023-10-09 06:10:46,093][60143] Updated weights for policy 0, policy_version 49112 (0.0008) +[2023-10-09 06:10:46,270][60144] Updated weights for policy 1, policy_version 49682 (0.0009) +[2023-10-09 06:10:46,638][60144] Updated weights for policy 1, policy_version 49692 (0.0008) +[2023-10-09 06:10:50,087][60143] Updated weights for policy 0, policy_version 49122 (0.0009) +[2023-10-09 06:10:50,452][60143] Updated weights for policy 0, policy_version 49132 (0.0009) +[2023-10-09 06:10:50,518][60144] Updated weights for policy 1, policy_version 49702 (0.0008) +[2023-10-09 06:10:50,814][60143] Updated weights for policy 0, policy_version 49142 (0.0009) +[2023-10-09 06:10:50,877][60144] Updated weights for policy 1, policy_version 49712 (0.0007) +[2023-10-09 06:10:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 101187584. Throughput: 0: 1717.4, 1: 1711.6. Samples: 25308566. 
Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:10:51,053][59242] Avg episode reward: [(0, '31.560'), (1, '29.850')] +[2023-10-09 06:10:51,182][60143] Updated weights for policy 0, policy_version 49152 (0.0009) +[2023-10-09 06:10:51,243][60144] Updated weights for policy 1, policy_version 49722 (0.0008) +[2023-10-09 06:10:55,225][60143] Updated weights for policy 0, policy_version 49162 (0.0009) +[2023-10-09 06:10:55,229][60144] Updated weights for policy 1, policy_version 49732 (0.0009) +[2023-10-09 06:10:55,595][60143] Updated weights for policy 0, policy_version 49172 (0.0007) +[2023-10-09 06:10:55,597][60144] Updated weights for policy 1, policy_version 49742 (0.0008) +[2023-10-09 06:10:55,958][60143] Updated weights for policy 0, policy_version 49182 (0.0007) +[2023-10-09 06:10:55,958][60144] Updated weights for policy 1, policy_version 49752 (0.0008) +[2023-10-09 06:10:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 101285888. Throughput: 0: 1718.7, 1: 1713.9. Samples: 25329880. Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:10:56,053][59242] Avg episode reward: [(0, '32.580'), (1, '30.680')] +[2023-10-09 06:10:59,895][60144] Updated weights for policy 1, policy_version 49762 (0.0007) +[2023-10-09 06:11:00,087][60143] Updated weights for policy 0, policy_version 49192 (0.0008) +[2023-10-09 06:11:00,266][60144] Updated weights for policy 1, policy_version 49772 (0.0008) +[2023-10-09 06:11:00,472][60143] Updated weights for policy 0, policy_version 49202 (0.0009) +[2023-10-09 06:11:00,636][60144] Updated weights for policy 1, policy_version 49782 (0.0008) +[2023-10-09 06:11:00,847][60143] Updated weights for policy 0, policy_version 49212 (0.0009) +[2023-10-09 06:11:00,995][60144] Updated weights for policy 1, policy_version 49792 (0.0008) +[2023-10-09 06:11:01,052][59242] Fps is (10 sec: 19661.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 101384192. 
Throughput: 0: 1701.5, 1: 1698.2. Samples: 25349346. Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:11:01,052][59242] Avg episode reward: [(0, '31.650'), (1, '32.010')] +[2023-10-09 06:11:04,824][60143] Updated weights for policy 0, policy_version 49222 (0.0009) +[2023-10-09 06:11:05,008][60144] Updated weights for policy 1, policy_version 49802 (0.0008) +[2023-10-09 06:11:05,197][60143] Updated weights for policy 0, policy_version 49232 (0.0008) +[2023-10-09 06:11:05,368][60144] Updated weights for policy 1, policy_version 49812 (0.0008) +[2023-10-09 06:11:05,560][60143] Updated weights for policy 0, policy_version 49242 (0.0008) +[2023-10-09 06:11:05,739][60144] Updated weights for policy 1, policy_version 49822 (0.0008) +[2023-10-09 06:11:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 101449728. Throughput: 0: 1717.5, 1: 1720.1. Samples: 25360288. Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:11:06,053][59242] Avg episode reward: [(0, '32.450'), (1, '32.620')] +[2023-10-09 06:11:09,542][60143] Updated weights for policy 0, policy_version 49252 (0.0009) +[2023-10-09 06:11:09,681][60144] Updated weights for policy 1, policy_version 49832 (0.0007) +[2023-10-09 06:11:09,912][60143] Updated weights for policy 0, policy_version 49262 (0.0008) +[2023-10-09 06:11:10,048][60144] Updated weights for policy 1, policy_version 49842 (0.0007) +[2023-10-09 06:11:10,272][60143] Updated weights for policy 0, policy_version 49272 (0.0008) +[2023-10-09 06:11:10,415][60144] Updated weights for policy 1, policy_version 49852 (0.0009) +[2023-10-09 06:11:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 101515264. Throughput: 0: 1714.5, 1: 1715.8. Samples: 25381300. 
Policy #0 lag: (min: 9.0, avg: 20.0, max: 41.0) +[2023-10-09 06:11:11,053][59242] Avg episode reward: [(0, '31.570'), (1, '32.140')] +[2023-10-09 06:11:14,236][60143] Updated weights for policy 0, policy_version 49282 (0.0009) +[2023-10-09 06:11:14,416][60144] Updated weights for policy 1, policy_version 49862 (0.0008) +[2023-10-09 06:11:14,600][60143] Updated weights for policy 0, policy_version 49292 (0.0008) +[2023-10-09 06:11:14,790][60144] Updated weights for policy 1, policy_version 49872 (0.0007) +[2023-10-09 06:11:14,963][60143] Updated weights for policy 0, policy_version 49302 (0.0008) +[2023-10-09 06:11:15,158][60144] Updated weights for policy 1, policy_version 49882 (0.0007) +[2023-10-09 06:11:15,334][60143] Updated weights for policy 0, policy_version 49312 (0.0009) +[2023-10-09 06:11:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 101580800. Throughput: 0: 1685.6, 1: 1695.7. Samples: 25400114. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:16,053][59242] Avg episode reward: [(0, '32.570'), (1, '33.710')] +[2023-10-09 06:11:19,165][60144] Updated weights for policy 1, policy_version 49892 (0.0007) +[2023-10-09 06:11:19,388][60143] Updated weights for policy 0, policy_version 49322 (0.0007) +[2023-10-09 06:11:19,524][60144] Updated weights for policy 1, policy_version 49902 (0.0009) +[2023-10-09 06:11:19,758][60143] Updated weights for policy 0, policy_version 49332 (0.0007) +[2023-10-09 06:11:19,899][60144] Updated weights for policy 1, policy_version 49912 (0.0008) +[2023-10-09 06:11:20,126][60143] Updated weights for policy 0, policy_version 49342 (0.0010) +[2023-10-09 06:11:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 101646336. Throughput: 0: 1716.0, 1: 1725.2. Samples: 25412036. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:21,053][59242] Avg episode reward: [(0, '34.270'), (1, '34.580')] +[2023-10-09 06:11:23,884][60144] Updated weights for policy 1, policy_version 49922 (0.0009) +[2023-10-09 06:11:24,039][60143] Updated weights for policy 0, policy_version 49352 (0.0008) +[2023-10-09 06:11:24,249][60144] Updated weights for policy 1, policy_version 49932 (0.0009) +[2023-10-09 06:11:24,407][60143] Updated weights for policy 0, policy_version 49362 (0.0007) +[2023-10-09 06:11:24,616][60144] Updated weights for policy 1, policy_version 49942 (0.0007) +[2023-10-09 06:11:24,775][60143] Updated weights for policy 0, policy_version 49372 (0.0008) +[2023-10-09 06:11:24,976][60144] Updated weights for policy 1, policy_version 49952 (0.0008) +[2023-10-09 06:11:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 101711872. Throughput: 0: 1693.5, 1: 1707.6. Samples: 25431500. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:26,053][59242] Avg episode reward: [(0, '32.990'), (1, '32.310')] +[2023-10-09 06:11:28,848][60143] Updated weights for policy 0, policy_version 49382 (0.0008) +[2023-10-09 06:11:29,002][60144] Updated weights for policy 1, policy_version 49962 (0.0008) +[2023-10-09 06:11:29,221][60143] Updated weights for policy 0, policy_version 49392 (0.0008) +[2023-10-09 06:11:29,370][60144] Updated weights for policy 1, policy_version 49972 (0.0009) +[2023-10-09 06:11:29,584][60143] Updated weights for policy 0, policy_version 49402 (0.0009) +[2023-10-09 06:11:29,751][60144] Updated weights for policy 1, policy_version 49982 (0.0008) +[2023-10-09 06:11:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 101777408. Throughput: 0: 1691.1, 1: 1700.4. Samples: 25451630. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:31,053][59242] Avg episode reward: [(0, '33.040'), (1, '33.290')] +[2023-10-09 06:11:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000049408_50593792.pth... +[2023-10-09 06:11:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000049984_51183616.pth... +[2023-10-09 06:11:31,103][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth +[2023-10-09 06:11:31,104][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000048384_49545216.pth +[2023-10-09 06:11:33,659][60143] Updated weights for policy 0, policy_version 49412 (0.0007) +[2023-10-09 06:11:33,686][60144] Updated weights for policy 1, policy_version 49992 (0.0008) +[2023-10-09 06:11:34,027][60143] Updated weights for policy 0, policy_version 49422 (0.0007) +[2023-10-09 06:11:34,052][60144] Updated weights for policy 1, policy_version 50002 (0.0008) +[2023-10-09 06:11:34,401][60143] Updated weights for policy 0, policy_version 49432 (0.0009) +[2023-10-09 06:11:34,425][60144] Updated weights for policy 1, policy_version 50012 (0.0008) +[2023-10-09 06:11:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 101842944. Throughput: 0: 1711.0, 1: 1730.1. Samples: 25463414. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:36,053][59242] Avg episode reward: [(0, '31.120'), (1, '33.040')] +[2023-10-09 06:11:38,324][60144] Updated weights for policy 1, policy_version 50022 (0.0007) +[2023-10-09 06:11:38,495][60143] Updated weights for policy 0, policy_version 49442 (0.0008) +[2023-10-09 06:11:38,688][60144] Updated weights for policy 1, policy_version 50032 (0.0008) +[2023-10-09 06:11:38,873][60143] Updated weights for policy 0, policy_version 49452 (0.0011) +[2023-10-09 06:11:39,044][60144] Updated weights for policy 1, policy_version 50042 (0.0008) +[2023-10-09 06:11:39,246][60143] Updated weights for policy 0, policy_version 49462 (0.0010) +[2023-10-09 06:11:39,614][60143] Updated weights for policy 0, policy_version 49472 (0.0010) +[2023-10-09 06:11:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 101908480. Throughput: 0: 1678.8, 1: 1704.1. Samples: 25482110. Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:41,053][59242] Avg episode reward: [(0, '30.370'), (1, '33.380')] +[2023-10-09 06:11:42,839][60144] Updated weights for policy 1, policy_version 50052 (0.0008) +[2023-10-09 06:11:43,213][60144] Updated weights for policy 1, policy_version 50062 (0.0008) +[2023-10-09 06:11:43,578][60144] Updated weights for policy 1, policy_version 50072 (0.0009) +[2023-10-09 06:11:43,657][60143] Updated weights for policy 0, policy_version 49482 (0.0007) +[2023-10-09 06:11:44,013][60143] Updated weights for policy 0, policy_version 49492 (0.0007) +[2023-10-09 06:11:44,393][60143] Updated weights for policy 0, policy_version 49502 (0.0008) +[2023-10-09 06:11:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 101974016. Throughput: 0: 1689.8, 1: 1727.9. Samples: 25503142. 
Policy #0 lag: (min: 6.0, avg: 6.4, max: 20.0) +[2023-10-09 06:11:46,053][59242] Avg episode reward: [(0, '29.960'), (1, '34.200')] +[2023-10-09 06:11:47,355][60144] Updated weights for policy 1, policy_version 50082 (0.0007) +[2023-10-09 06:11:47,730][60144] Updated weights for policy 1, policy_version 50092 (0.0008) +[2023-10-09 06:11:48,087][60144] Updated weights for policy 1, policy_version 50102 (0.0009) +[2023-10-09 06:11:48,460][60144] Updated weights for policy 1, policy_version 50112 (0.0009) +[2023-10-09 06:11:48,491][60143] Updated weights for policy 0, policy_version 49512 (0.0009) +[2023-10-09 06:11:48,858][60143] Updated weights for policy 0, policy_version 49522 (0.0009) +[2023-10-09 06:11:49,242][60143] Updated weights for policy 0, policy_version 49532 (0.0008) +[2023-10-09 06:11:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 102039552. Throughput: 0: 1690.4, 1: 1708.2. Samples: 25513228. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:11:51,053][59242] Avg episode reward: [(0, '30.900'), (1, '34.670')] +[2023-10-09 06:11:52,595][60144] Updated weights for policy 1, policy_version 50122 (0.0008) +[2023-10-09 06:11:52,960][60144] Updated weights for policy 1, policy_version 50132 (0.0008) +[2023-10-09 06:11:53,228][60143] Updated weights for policy 0, policy_version 49542 (0.0007) +[2023-10-09 06:11:53,331][60144] Updated weights for policy 1, policy_version 50142 (0.0007) +[2023-10-09 06:11:53,593][60143] Updated weights for policy 0, policy_version 49552 (0.0007) +[2023-10-09 06:11:53,955][60143] Updated weights for policy 0, policy_version 49562 (0.0009) +[2023-10-09 06:11:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 102105088. Throughput: 0: 1669.1, 1: 1713.6. Samples: 25533522. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:11:56,053][59242] Avg episode reward: [(0, '30.820'), (1, '35.780')] +[2023-10-09 06:11:56,054][60003] Saving new best policy, reward=35.780! +[2023-10-09 06:11:57,179][60144] Updated weights for policy 1, policy_version 50152 (0.0008) +[2023-10-09 06:11:57,548][60144] Updated weights for policy 1, policy_version 50162 (0.0007) +[2023-10-09 06:11:57,861][60143] Updated weights for policy 0, policy_version 49572 (0.0007) +[2023-10-09 06:11:57,914][60144] Updated weights for policy 1, policy_version 50172 (0.0010) +[2023-10-09 06:11:58,229][60143] Updated weights for policy 0, policy_version 49582 (0.0008) +[2023-10-09 06:11:58,594][60143] Updated weights for policy 0, policy_version 49592 (0.0008) +[2023-10-09 06:12:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102170624. Throughput: 0: 1701.3, 1: 1736.7. Samples: 25554822. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:12:01,053][59242] Avg episode reward: [(0, '31.040'), (1, '33.820')] +[2023-10-09 06:12:02,022][60144] Updated weights for policy 1, policy_version 50182 (0.0009) +[2023-10-09 06:12:02,419][60144] Updated weights for policy 1, policy_version 50192 (0.0008) +[2023-10-09 06:12:02,656][60143] Updated weights for policy 0, policy_version 49602 (0.0008) +[2023-10-09 06:12:02,787][60144] Updated weights for policy 1, policy_version 50202 (0.0007) +[2023-10-09 06:12:03,026][60143] Updated weights for policy 0, policy_version 49612 (0.0010) +[2023-10-09 06:12:03,396][60143] Updated weights for policy 0, policy_version 49622 (0.0009) +[2023-10-09 06:12:03,761][60143] Updated weights for policy 0, policy_version 49632 (0.0008) +[2023-10-09 06:12:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102236160. Throughput: 0: 1677.5, 1: 1703.9. Samples: 25564200. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:12:06,053][59242] Avg episode reward: [(0, '30.990'), (1, '34.780')] +[2023-10-09 06:12:06,693][60144] Updated weights for policy 1, policy_version 50212 (0.0008) +[2023-10-09 06:12:07,051][60144] Updated weights for policy 1, policy_version 50222 (0.0010) +[2023-10-09 06:12:07,422][60144] Updated weights for policy 1, policy_version 50232 (0.0007) +[2023-10-09 06:12:07,690][60143] Updated weights for policy 0, policy_version 49642 (0.0008) +[2023-10-09 06:12:08,059][60143] Updated weights for policy 0, policy_version 49652 (0.0010) +[2023-10-09 06:12:08,426][60143] Updated weights for policy 0, policy_version 49662 (0.0010) +[2023-10-09 06:12:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102301696. Throughput: 0: 1682.4, 1: 1732.2. Samples: 25585156. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:12:11,053][59242] Avg episode reward: [(0, '30.480'), (1, '35.210')] +[2023-10-09 06:12:11,301][60144] Updated weights for policy 1, policy_version 50242 (0.0007) +[2023-10-09 06:12:11,673][60144] Updated weights for policy 1, policy_version 50252 (0.0011) +[2023-10-09 06:12:12,032][60144] Updated weights for policy 1, policy_version 50262 (0.0010) +[2023-10-09 06:12:12,408][60144] Updated weights for policy 1, policy_version 50272 (0.0008) +[2023-10-09 06:12:12,426][60143] Updated weights for policy 0, policy_version 49672 (0.0008) +[2023-10-09 06:12:12,791][60143] Updated weights for policy 0, policy_version 49682 (0.0010) +[2023-10-09 06:12:13,171][60143] Updated weights for policy 0, policy_version 49692 (0.0009) +[2023-10-09 06:12:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102367232. Throughput: 0: 1695.7, 1: 1740.6. Samples: 25606262. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:12:16,052][59242] Avg episode reward: [(0, '30.010'), (1, '35.300')] +[2023-10-09 06:12:16,361][60144] Updated weights for policy 1, policy_version 50282 (0.0008) +[2023-10-09 06:12:16,723][60144] Updated weights for policy 1, policy_version 50292 (0.0007) +[2023-10-09 06:12:17,098][60144] Updated weights for policy 1, policy_version 50302 (0.0007) +[2023-10-09 06:12:17,154][60143] Updated weights for policy 0, policy_version 49702 (0.0007) +[2023-10-09 06:12:17,519][60143] Updated weights for policy 0, policy_version 49712 (0.0010) +[2023-10-09 06:12:17,891][60143] Updated weights for policy 0, policy_version 49722 (0.0011) +[2023-10-09 06:12:20,932][60144] Updated weights for policy 1, policy_version 50312 (0.0007) +[2023-10-09 06:12:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102432768. Throughput: 0: 1668.7, 1: 1715.4. Samples: 25615696. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:12:21,053][59242] Avg episode reward: [(0, '30.430'), (1, '35.210')] +[2023-10-09 06:12:21,302][60144] Updated weights for policy 1, policy_version 50322 (0.0008) +[2023-10-09 06:12:21,673][60144] Updated weights for policy 1, policy_version 50332 (0.0009) +[2023-10-09 06:12:21,954][60143] Updated weights for policy 0, policy_version 49732 (0.0008) +[2023-10-09 06:12:22,319][60143] Updated weights for policy 0, policy_version 49742 (0.0008) +[2023-10-09 06:12:22,697][60143] Updated weights for policy 0, policy_version 49752 (0.0008) +[2023-10-09 06:12:25,570][60144] Updated weights for policy 1, policy_version 50342 (0.0009) +[2023-10-09 06:12:25,942][60144] Updated weights for policy 1, policy_version 50352 (0.0009) +[2023-10-09 06:12:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 102498304. Throughput: 0: 1701.9, 1: 1739.5. Samples: 25636972. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:26,053][59242] Avg episode reward: [(0, '29.500'), (1, '35.150')] +[2023-10-09 06:12:26,309][60144] Updated weights for policy 1, policy_version 50362 (0.0009) +[2023-10-09 06:12:26,757][60143] Updated weights for policy 0, policy_version 49762 (0.0009) +[2023-10-09 06:12:27,126][60143] Updated weights for policy 0, policy_version 49772 (0.0007) +[2023-10-09 06:12:27,491][60143] Updated weights for policy 0, policy_version 49782 (0.0007) +[2023-10-09 06:12:27,858][60143] Updated weights for policy 0, policy_version 49792 (0.0009) +[2023-10-09 06:12:30,155][60144] Updated weights for policy 1, policy_version 50372 (0.0008) +[2023-10-09 06:12:30,530][60144] Updated weights for policy 1, policy_version 50382 (0.0007) +[2023-10-09 06:12:30,886][60144] Updated weights for policy 1, policy_version 50392 (0.0007) +[2023-10-09 06:12:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 102563840. Throughput: 0: 1711.6, 1: 1723.7. Samples: 25657730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:31,052][59242] Avg episode reward: [(0, '29.200'), (1, '33.980')] +[2023-10-09 06:12:31,819][60143] Updated weights for policy 0, policy_version 49802 (0.0009) +[2023-10-09 06:12:32,182][60143] Updated weights for policy 0, policy_version 49812 (0.0009) +[2023-10-09 06:12:32,549][60143] Updated weights for policy 0, policy_version 49822 (0.0008) +[2023-10-09 06:12:34,974][60144] Updated weights for policy 1, policy_version 50402 (0.0007) +[2023-10-09 06:12:35,343][60144] Updated weights for policy 1, policy_version 50412 (0.0010) +[2023-10-09 06:12:35,719][60144] Updated weights for policy 1, policy_version 50422 (0.0008) +[2023-10-09 06:12:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 102629376. Throughput: 0: 1695.7, 1: 1739.1. Samples: 25667794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:36,052][59242] Avg episode reward: [(0, '28.830'), (1, '33.830')] +[2023-10-09 06:12:36,086][60144] Updated weights for policy 1, policy_version 50432 (0.0008) +[2023-10-09 06:12:36,644][60143] Updated weights for policy 0, policy_version 49832 (0.0008) +[2023-10-09 06:12:37,021][60143] Updated weights for policy 0, policy_version 49842 (0.0009) +[2023-10-09 06:12:37,406][60143] Updated weights for policy 0, policy_version 49852 (0.0008) +[2023-10-09 06:12:40,080][60144] Updated weights for policy 1, policy_version 50442 (0.0009) +[2023-10-09 06:12:40,446][60144] Updated weights for policy 1, policy_version 50452 (0.0008) +[2023-10-09 06:12:40,814][60144] Updated weights for policy 1, policy_version 50462 (0.0008) +[2023-10-09 06:12:41,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 102727680. Throughput: 0: 1715.5, 1: 1735.7. Samples: 25688826. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:41,053][59242] Avg episode reward: [(0, '28.160'), (1, '31.190')] +[2023-10-09 06:12:41,303][60143] Updated weights for policy 0, policy_version 49862 (0.0008) +[2023-10-09 06:12:41,671][60143] Updated weights for policy 0, policy_version 49872 (0.0010) +[2023-10-09 06:12:42,052][60143] Updated weights for policy 0, policy_version 49882 (0.0010) +[2023-10-09 06:12:44,797][60144] Updated weights for policy 1, policy_version 50472 (0.0009) +[2023-10-09 06:12:45,168][60144] Updated weights for policy 1, policy_version 50482 (0.0009) +[2023-10-09 06:12:45,532][60144] Updated weights for policy 1, policy_version 50492 (0.0009) +[2023-10-09 06:12:45,932][60143] Updated weights for policy 0, policy_version 49892 (0.0007) +[2023-10-09 06:12:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 102793216. Throughput: 0: 1715.1, 1: 1710.3. Samples: 25708964. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:46,052][59242] Avg episode reward: [(0, '28.630'), (1, '32.300')] +[2023-10-09 06:12:46,305][60143] Updated weights for policy 0, policy_version 49902 (0.0008) +[2023-10-09 06:12:46,688][60143] Updated weights for policy 0, policy_version 49912 (0.0008) +[2023-10-09 06:12:49,581][60144] Updated weights for policy 1, policy_version 50502 (0.0009) +[2023-10-09 06:12:49,946][60144] Updated weights for policy 1, policy_version 50512 (0.0008) +[2023-10-09 06:12:50,305][60144] Updated weights for policy 1, policy_version 50522 (0.0008) +[2023-10-09 06:12:50,666][60143] Updated weights for policy 0, policy_version 49922 (0.0009) +[2023-10-09 06:12:51,045][60143] Updated weights for policy 0, policy_version 49932 (0.0007) +[2023-10-09 06:12:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 102858752. Throughput: 0: 1709.2, 1: 1739.0. Samples: 25719368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:51,053][59242] Avg episode reward: [(0, '28.150'), (1, '31.750')] +[2023-10-09 06:12:51,419][60143] Updated weights for policy 0, policy_version 49942 (0.0009) +[2023-10-09 06:12:51,793][60143] Updated weights for policy 0, policy_version 49952 (0.0010) +[2023-10-09 06:12:54,237][60144] Updated weights for policy 1, policy_version 50532 (0.0008) +[2023-10-09 06:12:54,611][60144] Updated weights for policy 1, policy_version 50542 (0.0007) +[2023-10-09 06:12:54,978][60144] Updated weights for policy 1, policy_version 50552 (0.0007) +[2023-10-09 06:12:55,695][60143] Updated weights for policy 0, policy_version 49962 (0.0008) +[2023-10-09 06:12:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 102924288. Throughput: 0: 1723.2, 1: 1718.8. Samples: 25740046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:12:56,053][59242] Avg episode reward: [(0, '29.260'), (1, '32.780')] +[2023-10-09 06:12:56,067][60143] Updated weights for policy 0, policy_version 49972 (0.0008) +[2023-10-09 06:12:56,438][60143] Updated weights for policy 0, policy_version 49982 (0.0007) +[2023-10-09 06:12:58,909][60144] Updated weights for policy 1, policy_version 50562 (0.0007) +[2023-10-09 06:12:59,282][60144] Updated weights for policy 1, policy_version 50572 (0.0009) +[2023-10-09 06:12:59,650][60144] Updated weights for policy 1, policy_version 50582 (0.0009) +[2023-10-09 06:13:00,009][60144] Updated weights for policy 1, policy_version 50592 (0.0007) +[2023-10-09 06:13:00,422][60143] Updated weights for policy 0, policy_version 49992 (0.0009) +[2023-10-09 06:13:00,788][60143] Updated weights for policy 0, policy_version 50002 (0.0007) +[2023-10-09 06:13:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 102989824. Throughput: 0: 1720.6, 1: 1704.0. Samples: 25760368. Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:01,053][59242] Avg episode reward: [(0, '29.640'), (1, '32.000')] +[2023-10-09 06:13:01,157][60143] Updated weights for policy 0, policy_version 50012 (0.0007) +[2023-10-09 06:13:03,893][60144] Updated weights for policy 1, policy_version 50602 (0.0010) +[2023-10-09 06:13:04,270][60144] Updated weights for policy 1, policy_version 50612 (0.0009) +[2023-10-09 06:13:04,621][60144] Updated weights for policy 1, policy_version 50622 (0.0010) +[2023-10-09 06:13:05,163][60143] Updated weights for policy 0, policy_version 50022 (0.0010) +[2023-10-09 06:13:05,527][60143] Updated weights for policy 0, policy_version 50032 (0.0010) +[2023-10-09 06:13:05,907][60143] Updated weights for policy 0, policy_version 50042 (0.0009) +[2023-10-09 06:13:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 103055360. 
Throughput: 0: 1727.1, 1: 1728.8. Samples: 25771212. Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:06,053][59242] Avg episode reward: [(0, '29.030'), (1, '33.280')] +[2023-10-09 06:13:08,705][60144] Updated weights for policy 1, policy_version 50632 (0.0009) +[2023-10-09 06:13:09,068][60144] Updated weights for policy 1, policy_version 50642 (0.0007) +[2023-10-09 06:13:09,435][60144] Updated weights for policy 1, policy_version 50652 (0.0008) +[2023-10-09 06:13:09,954][60143] Updated weights for policy 0, policy_version 50052 (0.0009) +[2023-10-09 06:13:10,324][60143] Updated weights for policy 0, policy_version 50062 (0.0011) +[2023-10-09 06:13:10,704][60143] Updated weights for policy 0, policy_version 50072 (0.0010) +[2023-10-09 06:13:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 103153664. Throughput: 0: 1724.9, 1: 1703.9. Samples: 25791268. Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:11,053][59242] Avg episode reward: [(0, '29.440'), (1, '32.650')] +[2023-10-09 06:13:13,390][60144] Updated weights for policy 1, policy_version 50662 (0.0007) +[2023-10-09 06:13:13,772][60144] Updated weights for policy 1, policy_version 50672 (0.0009) +[2023-10-09 06:13:14,131][60144] Updated weights for policy 1, policy_version 50682 (0.0008) +[2023-10-09 06:13:14,655][60143] Updated weights for policy 0, policy_version 50082 (0.0010) +[2023-10-09 06:13:15,015][60143] Updated weights for policy 0, policy_version 50092 (0.0007) +[2023-10-09 06:13:15,384][60143] Updated weights for policy 0, policy_version 50102 (0.0008) +[2023-10-09 06:13:15,750][60143] Updated weights for policy 0, policy_version 50112 (0.0007) +[2023-10-09 06:13:16,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 103219200. Throughput: 0: 1703.1, 1: 1714.3. Samples: 25811512. 
Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:16,052][59242] Avg episode reward: [(0, '30.340'), (1, '33.150')] +[2023-10-09 06:13:17,949][60144] Updated weights for policy 1, policy_version 50692 (0.0010) +[2023-10-09 06:13:18,317][60144] Updated weights for policy 1, policy_version 50702 (0.0007) +[2023-10-09 06:13:18,686][60144] Updated weights for policy 1, policy_version 50712 (0.0008) +[2023-10-09 06:13:19,669][60143] Updated weights for policy 0, policy_version 50122 (0.0007) +[2023-10-09 06:13:20,037][60143] Updated weights for policy 0, policy_version 50132 (0.0007) +[2023-10-09 06:13:20,416][60143] Updated weights for policy 0, policy_version 50142 (0.0009) +[2023-10-09 06:13:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 103284736. Throughput: 0: 1725.9, 1: 1708.0. Samples: 25822322. Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:21,053][59242] Avg episode reward: [(0, '29.200'), (1, '32.420')] +[2023-10-09 06:13:22,584][60144] Updated weights for policy 1, policy_version 50722 (0.0008) +[2023-10-09 06:13:22,957][60144] Updated weights for policy 1, policy_version 50732 (0.0007) +[2023-10-09 06:13:23,313][60144] Updated weights for policy 1, policy_version 50742 (0.0008) +[2023-10-09 06:13:23,681][60144] Updated weights for policy 1, policy_version 50752 (0.0009) +[2023-10-09 06:13:24,568][60143] Updated weights for policy 0, policy_version 50152 (0.0008) +[2023-10-09 06:13:24,942][60143] Updated weights for policy 0, policy_version 50162 (0.0008) +[2023-10-09 06:13:25,320][60143] Updated weights for policy 0, policy_version 50172 (0.0009) +[2023-10-09 06:13:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 103350272. Throughput: 0: 1719.1, 1: 1703.7. Samples: 25842850. 
Policy #0 lag: (min: 10.0, avg: 15.5, max: 42.0) +[2023-10-09 06:13:26,053][59242] Avg episode reward: [(0, '29.310'), (1, '32.260')] +[2023-10-09 06:13:27,626][60144] Updated weights for policy 1, policy_version 50762 (0.0008) +[2023-10-09 06:13:27,992][60144] Updated weights for policy 1, policy_version 50772 (0.0010) +[2023-10-09 06:13:28,355][60144] Updated weights for policy 1, policy_version 50782 (0.0011) +[2023-10-09 06:13:29,247][60143] Updated weights for policy 0, policy_version 50182 (0.0009) +[2023-10-09 06:13:29,624][60143] Updated weights for policy 0, policy_version 50192 (0.0009) +[2023-10-09 06:13:29,994][60143] Updated weights for policy 0, policy_version 50202 (0.0007) +[2023-10-09 06:13:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 103415808. Throughput: 0: 1689.9, 1: 1736.0. Samples: 25863128. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:31,053][59242] Avg episode reward: [(0, '28.670'), (1, '32.710')] +[2023-10-09 06:13:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000050208_51412992.pth... +[2023-10-09 06:13:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000050784_52002816.pth... 
+[2023-10-09 06:13:31,091][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000048608_49774592.pth +[2023-10-09 06:13:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000049184_50364416.pth +[2023-10-09 06:13:32,401][60144] Updated weights for policy 1, policy_version 50792 (0.0008) +[2023-10-09 06:13:32,768][60144] Updated weights for policy 1, policy_version 50802 (0.0010) +[2023-10-09 06:13:33,133][60144] Updated weights for policy 1, policy_version 50812 (0.0007) +[2023-10-09 06:13:34,070][60143] Updated weights for policy 0, policy_version 50212 (0.0008) +[2023-10-09 06:13:34,427][60143] Updated weights for policy 0, policy_version 50222 (0.0008) +[2023-10-09 06:13:34,796][60143] Updated weights for policy 0, policy_version 50232 (0.0008) +[2023-10-09 06:13:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 103481344. Throughput: 0: 1716.3, 1: 1708.4. Samples: 25873480. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:36,053][59242] Avg episode reward: [(0, '29.880'), (1, '31.780')] +[2023-10-09 06:13:37,032][60144] Updated weights for policy 1, policy_version 50822 (0.0007) +[2023-10-09 06:13:37,386][60144] Updated weights for policy 1, policy_version 50832 (0.0010) +[2023-10-09 06:13:37,755][60144] Updated weights for policy 1, policy_version 50842 (0.0009) +[2023-10-09 06:13:38,877][60143] Updated weights for policy 0, policy_version 50242 (0.0009) +[2023-10-09 06:13:39,248][60143] Updated weights for policy 0, policy_version 50252 (0.0009) +[2023-10-09 06:13:39,624][60143] Updated weights for policy 0, policy_version 50262 (0.0009) +[2023-10-09 06:13:39,993][60143] Updated weights for policy 0, policy_version 50272 (0.0008) +[2023-10-09 06:13:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 103546880. Throughput: 0: 1698.1, 1: 1724.5. Samples: 25894066. 
Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:41,053][59242] Avg episode reward: [(0, '30.050'), (1, '30.750')] +[2023-10-09 06:13:41,553][60144] Updated weights for policy 1, policy_version 50852 (0.0009) +[2023-10-09 06:13:41,951][60144] Updated weights for policy 1, policy_version 50862 (0.0009) +[2023-10-09 06:13:42,312][60144] Updated weights for policy 1, policy_version 50872 (0.0010) +[2023-10-09 06:13:43,897][60143] Updated weights for policy 0, policy_version 50282 (0.0010) +[2023-10-09 06:13:44,260][60143] Updated weights for policy 0, policy_version 50292 (0.0010) +[2023-10-09 06:13:44,639][60143] Updated weights for policy 0, policy_version 50302 (0.0011) +[2023-10-09 06:13:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 103612416. Throughput: 0: 1688.2, 1: 1742.5. Samples: 25914748. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:46,052][59242] Avg episode reward: [(0, '30.150'), (1, '31.070')] +[2023-10-09 06:13:46,207][60144] Updated weights for policy 1, policy_version 50882 (0.0009) +[2023-10-09 06:13:46,568][60144] Updated weights for policy 1, policy_version 50892 (0.0007) +[2023-10-09 06:13:46,933][60144] Updated weights for policy 1, policy_version 50902 (0.0009) +[2023-10-09 06:13:47,308][60144] Updated weights for policy 1, policy_version 50912 (0.0008) +[2023-10-09 06:13:48,615][60143] Updated weights for policy 0, policy_version 50312 (0.0008) +[2023-10-09 06:13:48,988][60143] Updated weights for policy 0, policy_version 50322 (0.0007) +[2023-10-09 06:13:49,365][60143] Updated weights for policy 0, policy_version 50332 (0.0008) +[2023-10-09 06:13:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 103677952. Throughput: 0: 1705.8, 1: 1714.7. Samples: 25925136. 
Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:51,053][59242] Avg episode reward: [(0, '30.590'), (1, '32.190')] +[2023-10-09 06:13:51,278][60144] Updated weights for policy 1, policy_version 50922 (0.0009) +[2023-10-09 06:13:51,644][60144] Updated weights for policy 1, policy_version 50932 (0.0007) +[2023-10-09 06:13:52,010][60144] Updated weights for policy 1, policy_version 50942 (0.0009) +[2023-10-09 06:13:53,192][60143] Updated weights for policy 0, policy_version 50342 (0.0007) +[2023-10-09 06:13:53,555][60143] Updated weights for policy 0, policy_version 50352 (0.0009) +[2023-10-09 06:13:53,932][60143] Updated weights for policy 0, policy_version 50362 (0.0008) +[2023-10-09 06:13:55,920][60144] Updated weights for policy 1, policy_version 50952 (0.0009) +[2023-10-09 06:13:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 103743488. Throughput: 0: 1682.0, 1: 1746.4. Samples: 25945546. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:13:56,052][59242] Avg episode reward: [(0, '29.670'), (1, '32.210')] +[2023-10-09 06:13:56,283][60144] Updated weights for policy 1, policy_version 50962 (0.0009) +[2023-10-09 06:13:56,658][60144] Updated weights for policy 1, policy_version 50972 (0.0007) +[2023-10-09 06:13:57,958][60143] Updated weights for policy 0, policy_version 50372 (0.0009) +[2023-10-09 06:13:58,323][60143] Updated weights for policy 0, policy_version 50382 (0.0010) +[2023-10-09 06:13:58,698][60143] Updated weights for policy 0, policy_version 50392 (0.0011) +[2023-10-09 06:14:00,611][60144] Updated weights for policy 1, policy_version 50982 (0.0010) +[2023-10-09 06:14:00,974][60144] Updated weights for policy 1, policy_version 50992 (0.0008) +[2023-10-09 06:14:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 103809024. Throughput: 0: 1704.8, 1: 1742.5. Samples: 25966640. 
Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) +[2023-10-09 06:14:01,053][59242] Avg episode reward: [(0, '30.220'), (1, '33.930')] +[2023-10-09 06:14:01,334][60144] Updated weights for policy 1, policy_version 51002 (0.0008) +[2023-10-09 06:14:02,647][60143] Updated weights for policy 0, policy_version 50402 (0.0010) +[2023-10-09 06:14:03,027][60143] Updated weights for policy 0, policy_version 50412 (0.0009) +[2023-10-09 06:14:03,389][60143] Updated weights for policy 0, policy_version 50422 (0.0009) +[2023-10-09 06:14:03,761][60143] Updated weights for policy 0, policy_version 50432 (0.0008) +[2023-10-09 06:14:05,197][60144] Updated weights for policy 1, policy_version 51012 (0.0007) +[2023-10-09 06:14:05,567][60144] Updated weights for policy 1, policy_version 51022 (0.0007) +[2023-10-09 06:14:05,936][60144] Updated weights for policy 1, policy_version 51032 (0.0008) +[2023-10-09 06:14:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 103874560. Throughput: 0: 1689.8, 1: 1740.2. Samples: 25976672. Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:06,053][59242] Avg episode reward: [(0, '30.600'), (1, '33.510')] +[2023-10-09 06:14:07,790][60143] Updated weights for policy 0, policy_version 50442 (0.0010) +[2023-10-09 06:14:08,169][60143] Updated weights for policy 0, policy_version 50452 (0.0011) +[2023-10-09 06:14:08,548][60143] Updated weights for policy 0, policy_version 50462 (0.0009) +[2023-10-09 06:14:09,891][60144] Updated weights for policy 1, policy_version 51042 (0.0008) +[2023-10-09 06:14:10,255][60144] Updated weights for policy 1, policy_version 51052 (0.0008) +[2023-10-09 06:14:10,621][60144] Updated weights for policy 1, policy_version 51062 (0.0009) +[2023-10-09 06:14:10,992][60144] Updated weights for policy 1, policy_version 51072 (0.0008) +[2023-10-09 06:14:11,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 103972864. 
Throughput: 0: 1688.7, 1: 1749.7. Samples: 25997576. Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:11,053][59242] Avg episode reward: [(0, '29.740'), (1, '33.810')] +[2023-10-09 06:14:12,539][60143] Updated weights for policy 0, policy_version 50472 (0.0010) +[2023-10-09 06:14:12,913][60143] Updated weights for policy 0, policy_version 50482 (0.0010) +[2023-10-09 06:14:13,283][60143] Updated weights for policy 0, policy_version 50492 (0.0009) +[2023-10-09 06:14:14,959][60144] Updated weights for policy 1, policy_version 51082 (0.0010) +[2023-10-09 06:14:15,325][60144] Updated weights for policy 1, policy_version 51092 (0.0009) +[2023-10-09 06:14:15,697][60144] Updated weights for policy 1, policy_version 51102 (0.0009) +[2023-10-09 06:14:16,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104038400. Throughput: 0: 1713.1, 1: 1721.5. Samples: 26017686. Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:16,053][59242] Avg episode reward: [(0, '30.300'), (1, '33.840')] +[2023-10-09 06:14:17,167][60143] Updated weights for policy 0, policy_version 50502 (0.0007) +[2023-10-09 06:14:17,536][60143] Updated weights for policy 0, policy_version 50512 (0.0007) +[2023-10-09 06:14:17,913][60143] Updated weights for policy 0, policy_version 50522 (0.0007) +[2023-10-09 06:14:19,681][60144] Updated weights for policy 1, policy_version 51112 (0.0008) +[2023-10-09 06:14:20,040][60144] Updated weights for policy 1, policy_version 51122 (0.0008) +[2023-10-09 06:14:20,415][60144] Updated weights for policy 1, policy_version 51132 (0.0009) +[2023-10-09 06:14:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104103936. Throughput: 0: 1691.6, 1: 1749.0. Samples: 26028306. 
Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:21,053][59242] Avg episode reward: [(0, '29.950'), (1, '33.640')] +[2023-10-09 06:14:21,892][60143] Updated weights for policy 0, policy_version 50532 (0.0009) +[2023-10-09 06:14:22,263][60143] Updated weights for policy 0, policy_version 50542 (0.0009) +[2023-10-09 06:14:22,649][60143] Updated weights for policy 0, policy_version 50552 (0.0007) +[2023-10-09 06:14:24,276][60144] Updated weights for policy 1, policy_version 51142 (0.0009) +[2023-10-09 06:14:24,650][60144] Updated weights for policy 1, policy_version 51152 (0.0009) +[2023-10-09 06:14:25,017][60144] Updated weights for policy 1, policy_version 51162 (0.0009) +[2023-10-09 06:14:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104169472. Throughput: 0: 1706.8, 1: 1733.1. Samples: 26048858. Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:26,053][59242] Avg episode reward: [(0, '29.580'), (1, '32.250')] +[2023-10-09 06:14:26,675][60143] Updated weights for policy 0, policy_version 50562 (0.0009) +[2023-10-09 06:14:27,031][60143] Updated weights for policy 0, policy_version 50572 (0.0007) +[2023-10-09 06:14:27,398][60143] Updated weights for policy 0, policy_version 50582 (0.0007) +[2023-10-09 06:14:27,766][60143] Updated weights for policy 0, policy_version 50592 (0.0008) +[2023-10-09 06:14:28,962][60144] Updated weights for policy 1, policy_version 51172 (0.0009) +[2023-10-09 06:14:29,375][60144] Updated weights for policy 1, policy_version 51182 (0.0008) +[2023-10-09 06:14:29,747][60144] Updated weights for policy 1, policy_version 51192 (0.0007) +[2023-10-09 06:14:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104235008. Throughput: 0: 1723.6, 1: 1713.5. Samples: 26069422. 
Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:31,053][59242] Avg episode reward: [(0, '29.170'), (1, '32.580')] +[2023-10-09 06:14:31,733][60143] Updated weights for policy 0, policy_version 50602 (0.0008) +[2023-10-09 06:14:32,100][60143] Updated weights for policy 0, policy_version 50612 (0.0011) +[2023-10-09 06:14:32,481][60143] Updated weights for policy 0, policy_version 50622 (0.0008) +[2023-10-09 06:14:33,705][60144] Updated weights for policy 1, policy_version 51202 (0.0008) +[2023-10-09 06:14:34,064][60144] Updated weights for policy 1, policy_version 51212 (0.0009) +[2023-10-09 06:14:34,432][60144] Updated weights for policy 1, policy_version 51222 (0.0009) +[2023-10-09 06:14:34,797][60144] Updated weights for policy 1, policy_version 51232 (0.0010) +[2023-10-09 06:14:36,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104300544. Throughput: 0: 1697.4, 1: 1743.9. Samples: 26079996. Policy #0 lag: (min: 31.0, avg: 41.9, max: 63.0) +[2023-10-09 06:14:36,054][59242] Avg episode reward: [(0, '27.980'), (1, '31.650')] +[2023-10-09 06:14:36,403][60143] Updated weights for policy 0, policy_version 50632 (0.0007) +[2023-10-09 06:14:36,778][60143] Updated weights for policy 0, policy_version 50642 (0.0007) +[2023-10-09 06:14:37,144][60143] Updated weights for policy 0, policy_version 50652 (0.0008) +[2023-10-09 06:14:38,603][60144] Updated weights for policy 1, policy_version 51242 (0.0009) +[2023-10-09 06:14:38,964][60144] Updated weights for policy 1, policy_version 51252 (0.0009) +[2023-10-09 06:14:39,339][60144] Updated weights for policy 1, policy_version 51262 (0.0009) +[2023-10-09 06:14:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104366080. Throughput: 0: 1720.5, 1: 1714.3. Samples: 26100110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:14:41,053][59242] Avg episode reward: [(0, '27.500'), (1, '32.200')] +[2023-10-09 06:14:41,054][60143] Updated weights for policy 0, policy_version 50662 (0.0008) +[2023-10-09 06:14:41,429][60143] Updated weights for policy 0, policy_version 50672 (0.0010) +[2023-10-09 06:14:41,789][60143] Updated weights for policy 0, policy_version 50682 (0.0008) +[2023-10-09 06:14:43,449][60144] Updated weights for policy 1, policy_version 51272 (0.0009) +[2023-10-09 06:14:43,805][60144] Updated weights for policy 1, policy_version 51282 (0.0009) +[2023-10-09 06:14:44,183][60144] Updated weights for policy 1, policy_version 51292 (0.0010) +[2023-10-09 06:14:45,910][60143] Updated weights for policy 0, policy_version 50692 (0.0007) +[2023-10-09 06:14:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104431616. Throughput: 0: 1716.1, 1: 1722.1. Samples: 26121358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:14:46,053][59242] Avg episode reward: [(0, '28.860'), (1, '31.360')] +[2023-10-09 06:14:46,274][60143] Updated weights for policy 0, policy_version 50702 (0.0008) +[2023-10-09 06:14:46,649][60143] Updated weights for policy 0, policy_version 50712 (0.0007) +[2023-10-09 06:14:48,103][60144] Updated weights for policy 1, policy_version 51302 (0.0010) +[2023-10-09 06:14:48,465][60144] Updated weights for policy 1, policy_version 51312 (0.0009) +[2023-10-09 06:14:48,827][60144] Updated weights for policy 1, policy_version 51322 (0.0007) +[2023-10-09 06:14:50,693][60143] Updated weights for policy 0, policy_version 50722 (0.0008) +[2023-10-09 06:14:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 104497152. Throughput: 0: 1706.7, 1: 1725.7. Samples: 26131132. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:14:51,052][59242] Avg episode reward: [(0, '27.810'), (1, '32.950')] +[2023-10-09 06:14:51,067][60143] Updated weights for policy 0, policy_version 50732 (0.0007) +[2023-10-09 06:14:51,446][60143] Updated weights for policy 0, policy_version 50742 (0.0008) +[2023-10-09 06:14:51,800][60143] Updated weights for policy 0, policy_version 50752 (0.0008) +[2023-10-09 06:14:52,557][60144] Updated weights for policy 1, policy_version 51332 (0.0008) +[2023-10-09 06:14:52,927][60144] Updated weights for policy 1, policy_version 51342 (0.0011) +[2023-10-09 06:14:53,288][60144] Updated weights for policy 1, policy_version 51352 (0.0008) +[2023-10-09 06:14:55,876][60143] Updated weights for policy 0, policy_version 50762 (0.0008) +[2023-10-09 06:14:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104562688. Throughput: 0: 1718.7, 1: 1714.1. Samples: 26152054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:14:56,053][59242] Avg episode reward: [(0, '27.830'), (1, '33.200')] +[2023-10-09 06:14:56,236][60143] Updated weights for policy 0, policy_version 50772 (0.0009) +[2023-10-09 06:14:56,601][60143] Updated weights for policy 0, policy_version 50782 (0.0008) +[2023-10-09 06:14:57,387][60144] Updated weights for policy 1, policy_version 51362 (0.0007) +[2023-10-09 06:14:57,746][60144] Updated weights for policy 1, policy_version 51372 (0.0008) +[2023-10-09 06:14:58,114][60144] Updated weights for policy 1, policy_version 51382 (0.0008) +[2023-10-09 06:14:58,480][60144] Updated weights for policy 1, policy_version 51392 (0.0007) +[2023-10-09 06:15:00,766][60143] Updated weights for policy 0, policy_version 50792 (0.0008) +[2023-10-09 06:15:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104628224. Throughput: 0: 1714.1, 1: 1736.0. Samples: 26172938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:01,053][59242] Avg episode reward: [(0, '27.960'), (1, '32.210')] +[2023-10-09 06:15:01,135][60143] Updated weights for policy 0, policy_version 50802 (0.0008) +[2023-10-09 06:15:01,502][60143] Updated weights for policy 0, policy_version 50812 (0.0007) +[2023-10-09 06:15:02,466][60144] Updated weights for policy 1, policy_version 51402 (0.0010) +[2023-10-09 06:15:02,838][60144] Updated weights for policy 1, policy_version 51412 (0.0007) +[2023-10-09 06:15:03,210][60144] Updated weights for policy 1, policy_version 51422 (0.0011) +[2023-10-09 06:15:05,582][60143] Updated weights for policy 0, policy_version 50822 (0.0010) +[2023-10-09 06:15:05,960][60143] Updated weights for policy 0, policy_version 50832 (0.0010) +[2023-10-09 06:15:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 104693760. Throughput: 0: 1709.7, 1: 1710.5. Samples: 26182218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:06,053][59242] Avg episode reward: [(0, '29.010'), (1, '31.830')] +[2023-10-09 06:15:06,334][60143] Updated weights for policy 0, policy_version 50842 (0.0009) +[2023-10-09 06:15:07,114][60144] Updated weights for policy 1, policy_version 51432 (0.0008) +[2023-10-09 06:15:07,492][60144] Updated weights for policy 1, policy_version 51442 (0.0007) +[2023-10-09 06:15:07,856][60144] Updated weights for policy 1, policy_version 51452 (0.0010) +[2023-10-09 06:15:10,269][60143] Updated weights for policy 0, policy_version 50852 (0.0008) +[2023-10-09 06:15:10,635][60143] Updated weights for policy 0, policy_version 50862 (0.0008) +[2023-10-09 06:15:11,014][60143] Updated weights for policy 0, policy_version 50872 (0.0008) +[2023-10-09 06:15:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 104759296. Throughput: 0: 1710.4, 1: 1720.4. Samples: 26203244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:11,053][59242] Avg episode reward: [(0, '29.370'), (1, '31.090')] +[2023-10-09 06:15:11,745][60144] Updated weights for policy 1, policy_version 51462 (0.0008) +[2023-10-09 06:15:12,105][60144] Updated weights for policy 1, policy_version 51472 (0.0008) +[2023-10-09 06:15:12,479][60144] Updated weights for policy 1, policy_version 51482 (0.0009) +[2023-10-09 06:15:14,938][60143] Updated weights for policy 0, policy_version 50882 (0.0010) +[2023-10-09 06:15:15,311][60143] Updated weights for policy 0, policy_version 50892 (0.0010) +[2023-10-09 06:15:15,668][60143] Updated weights for policy 0, policy_version 50902 (0.0010) +[2023-10-09 06:15:16,042][60143] Updated weights for policy 0, policy_version 50912 (0.0011) +[2023-10-09 06:15:16,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 104857600. Throughput: 0: 1690.7, 1: 1740.4. Samples: 26223820. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:16,052][59242] Avg episode reward: [(0, '29.880'), (1, '28.770')] +[2023-10-09 06:15:16,570][60144] Updated weights for policy 1, policy_version 51492 (0.0008) +[2023-10-09 06:15:16,957][60144] Updated weights for policy 1, policy_version 51502 (0.0010) +[2023-10-09 06:15:17,337][60144] Updated weights for policy 1, policy_version 51512 (0.0010) +[2023-10-09 06:15:19,983][60143] Updated weights for policy 0, policy_version 50922 (0.0007) +[2023-10-09 06:15:20,347][60143] Updated weights for policy 0, policy_version 50932 (0.0007) +[2023-10-09 06:15:20,718][60143] Updated weights for policy 0, policy_version 50942 (0.0008) +[2023-10-09 06:15:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 104923136. Throughput: 0: 1708.0, 1: 1706.5. Samples: 26233650. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:21,053][59242] Avg episode reward: [(0, '30.190'), (1, '28.190')] +[2023-10-09 06:15:21,199][60144] Updated weights for policy 1, policy_version 51522 (0.0008) +[2023-10-09 06:15:21,572][60144] Updated weights for policy 1, policy_version 51532 (0.0010) +[2023-10-09 06:15:21,927][60144] Updated weights for policy 1, policy_version 51542 (0.0008) +[2023-10-09 06:15:22,294][60144] Updated weights for policy 1, policy_version 51552 (0.0007) +[2023-10-09 06:15:24,781][60143] Updated weights for policy 0, policy_version 50952 (0.0008) +[2023-10-09 06:15:25,147][60143] Updated weights for policy 0, policy_version 50962 (0.0008) +[2023-10-09 06:15:25,518][60143] Updated weights for policy 0, policy_version 50972 (0.0008) +[2023-10-09 06:15:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 104988672. Throughput: 0: 1709.2, 1: 1734.2. Samples: 26255066. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:26,053][59242] Avg episode reward: [(0, '30.360'), (1, '27.860')] +[2023-10-09 06:15:26,118][60144] Updated weights for policy 1, policy_version 51562 (0.0008) +[2023-10-09 06:15:26,493][60144] Updated weights for policy 1, policy_version 51572 (0.0007) +[2023-10-09 06:15:26,851][60144] Updated weights for policy 1, policy_version 51582 (0.0009) +[2023-10-09 06:15:29,554][60143] Updated weights for policy 0, policy_version 50982 (0.0008) +[2023-10-09 06:15:29,928][60143] Updated weights for policy 0, policy_version 50992 (0.0009) +[2023-10-09 06:15:30,304][60143] Updated weights for policy 0, policy_version 51002 (0.0009) +[2023-10-09 06:15:30,611][60144] Updated weights for policy 1, policy_version 51592 (0.0008) +[2023-10-09 06:15:30,979][60144] Updated weights for policy 1, policy_version 51602 (0.0010) +[2023-10-09 06:15:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 105054208. 
Throughput: 0: 1683.6, 1: 1731.5. Samples: 26275036. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:31,053][59242] Avg episode reward: [(0, '30.360'), (1, '27.700')] +[2023-10-09 06:15:31,059][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000051008_52232192.pth... +[2023-10-09 06:15:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000049408_50593792.pth +[2023-10-09 06:15:31,352][60144] Updated weights for policy 1, policy_version 51612 (0.0010) +[2023-10-09 06:15:31,496][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000051616_52854784.pth... +[2023-10-09 06:15:31,537][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000049984_51183616.pth +[2023-10-09 06:15:33,978][60143] Updated weights for policy 0, policy_version 51012 (0.0008) +[2023-10-09 06:15:34,352][60143] Updated weights for policy 0, policy_version 51022 (0.0009) +[2023-10-09 06:15:34,724][60143] Updated weights for policy 0, policy_version 51032 (0.0009) +[2023-10-09 06:15:35,330][60144] Updated weights for policy 1, policy_version 51622 (0.0008) +[2023-10-09 06:15:35,696][60144] Updated weights for policy 1, policy_version 51632 (0.0009) +[2023-10-09 06:15:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 105119744. Throughput: 0: 1715.6, 1: 1729.4. Samples: 26286158. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:36,052][59242] Avg episode reward: [(0, '30.520'), (1, '27.470')] +[2023-10-09 06:15:36,066][60144] Updated weights for policy 1, policy_version 51642 (0.0008) +[2023-10-09 06:15:38,542][60143] Updated weights for policy 0, policy_version 51042 (0.0008) +[2023-10-09 06:15:38,905][60143] Updated weights for policy 0, policy_version 51052 (0.0009) +[2023-10-09 06:15:39,273][60143] Updated weights for policy 0, policy_version 51062 (0.0011) +[2023-10-09 06:15:39,650][60143] Updated weights for policy 0, policy_version 51072 (0.0007) +[2023-10-09 06:15:40,022][60144] Updated weights for policy 1, policy_version 51652 (0.0009) +[2023-10-09 06:15:40,388][60144] Updated weights for policy 1, policy_version 51662 (0.0011) +[2023-10-09 06:15:40,745][60144] Updated weights for policy 1, policy_version 51672 (0.0009) +[2023-10-09 06:15:41,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 105218048. Throughput: 0: 1691.5, 1: 1740.8. Samples: 26306506. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 06:15:41,053][59242] Avg episode reward: [(0, '31.520'), (1, '27.450')] +[2023-10-09 06:15:43,877][60143] Updated weights for policy 0, policy_version 51082 (0.0011) +[2023-10-09 06:15:44,260][60143] Updated weights for policy 0, policy_version 51092 (0.0009) +[2023-10-09 06:15:44,624][60143] Updated weights for policy 0, policy_version 51102 (0.0007) +[2023-10-09 06:15:44,743][60144] Updated weights for policy 1, policy_version 51682 (0.0008) +[2023-10-09 06:15:45,106][60144] Updated weights for policy 1, policy_version 51692 (0.0009) +[2023-10-09 06:15:45,473][60144] Updated weights for policy 1, policy_version 51702 (0.0009) +[2023-10-09 06:15:45,838][60144] Updated weights for policy 1, policy_version 51712 (0.0008) +[2023-10-09 06:15:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 105283584. 
Throughput: 0: 1688.6, 1: 1722.2. Samples: 26326424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:46,052][59242] Avg episode reward: [(0, '31.970'), (1, '29.970')] +[2023-10-09 06:15:48,725][60143] Updated weights for policy 0, policy_version 51112 (0.0008) +[2023-10-09 06:15:49,096][60143] Updated weights for policy 0, policy_version 51122 (0.0009) +[2023-10-09 06:15:49,470][60143] Updated weights for policy 0, policy_version 51132 (0.0008) +[2023-10-09 06:15:49,834][60144] Updated weights for policy 1, policy_version 51722 (0.0007) +[2023-10-09 06:15:50,213][60144] Updated weights for policy 1, policy_version 51732 (0.0008) +[2023-10-09 06:15:50,579][60144] Updated weights for policy 1, policy_version 51742 (0.0008) +[2023-10-09 06:15:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 105349120. Throughput: 0: 1714.3, 1: 1745.2. Samples: 26337892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:51,052][59242] Avg episode reward: [(0, '31.630'), (1, '30.590')] +[2023-10-09 06:15:53,518][60143] Updated weights for policy 0, policy_version 51142 (0.0008) +[2023-10-09 06:15:53,892][60143] Updated weights for policy 0, policy_version 51152 (0.0009) +[2023-10-09 06:15:54,259][60143] Updated weights for policy 0, policy_version 51162 (0.0008) +[2023-10-09 06:15:54,445][60144] Updated weights for policy 1, policy_version 51752 (0.0009) +[2023-10-09 06:15:54,815][60144] Updated weights for policy 1, policy_version 51762 (0.0010) +[2023-10-09 06:15:55,182][60144] Updated weights for policy 1, policy_version 51772 (0.0010) +[2023-10-09 06:15:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 105414656. Throughput: 0: 1686.4, 1: 1737.3. Samples: 26357314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:15:56,052][59242] Avg episode reward: [(0, '31.140'), (1, '30.160')] +[2023-10-09 06:15:58,232][60143] Updated weights for policy 0, policy_version 51172 (0.0008) +[2023-10-09 06:15:58,610][60143] Updated weights for policy 0, policy_version 51182 (0.0009) +[2023-10-09 06:15:58,980][60143] Updated weights for policy 0, policy_version 51192 (0.0009) +[2023-10-09 06:15:59,113][60144] Updated weights for policy 1, policy_version 51782 (0.0010) +[2023-10-09 06:15:59,485][60144] Updated weights for policy 1, policy_version 51792 (0.0008) +[2023-10-09 06:15:59,856][60144] Updated weights for policy 1, policy_version 51802 (0.0010) +[2023-10-09 06:16:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 105480192. Throughput: 0: 1697.7, 1: 1718.2. Samples: 26377538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:01,052][59242] Avg episode reward: [(0, '31.020'), (1, '28.710')] +[2023-10-09 06:16:03,109][60143] Updated weights for policy 0, policy_version 51202 (0.0009) +[2023-10-09 06:16:03,479][60143] Updated weights for policy 0, policy_version 51212 (0.0009) +[2023-10-09 06:16:03,848][60143] Updated weights for policy 0, policy_version 51222 (0.0009) +[2023-10-09 06:16:03,880][60144] Updated weights for policy 1, policy_version 51812 (0.0009) +[2023-10-09 06:16:04,213][60143] Updated weights for policy 0, policy_version 51232 (0.0007) +[2023-10-09 06:16:04,289][60144] Updated weights for policy 1, policy_version 51822 (0.0007) +[2023-10-09 06:16:04,660][60144] Updated weights for policy 1, policy_version 51832 (0.0009) +[2023-10-09 06:16:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 105545728. Throughput: 0: 1696.5, 1: 1754.6. Samples: 26388950. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:06,052][59242] Avg episode reward: [(0, '32.240'), (1, '27.300')] +[2023-10-09 06:16:08,234][60143] Updated weights for policy 0, policy_version 51242 (0.0009) +[2023-10-09 06:16:08,492][60144] Updated weights for policy 1, policy_version 51842 (0.0009) +[2023-10-09 06:16:08,600][60143] Updated weights for policy 0, policy_version 51252 (0.0009) +[2023-10-09 06:16:08,861][60144] Updated weights for policy 1, policy_version 51852 (0.0008) +[2023-10-09 06:16:08,972][60143] Updated weights for policy 0, policy_version 51262 (0.0010) +[2023-10-09 06:16:09,223][60144] Updated weights for policy 1, policy_version 51862 (0.0008) +[2023-10-09 06:16:09,579][60144] Updated weights for policy 1, policy_version 51872 (0.0008) +[2023-10-09 06:16:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 105611264. Throughput: 0: 1677.6, 1: 1720.7. Samples: 26407990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:11,053][59242] Avg episode reward: [(0, '32.570'), (1, '26.810')] +[2023-10-09 06:16:12,904][60143] Updated weights for policy 0, policy_version 51272 (0.0010) +[2023-10-09 06:16:13,268][60143] Updated weights for policy 0, policy_version 51282 (0.0008) +[2023-10-09 06:16:13,547][60144] Updated weights for policy 1, policy_version 51882 (0.0007) +[2023-10-09 06:16:13,641][60143] Updated weights for policy 0, policy_version 51292 (0.0007) +[2023-10-09 06:16:13,905][60144] Updated weights for policy 1, policy_version 51892 (0.0007) +[2023-10-09 06:16:14,280][60144] Updated weights for policy 1, policy_version 51902 (0.0008) +[2023-10-09 06:16:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 105676800. Throughput: 0: 1711.9, 1: 1718.2. Samples: 26429392. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:16,053][59242] Avg episode reward: [(0, '31.770'), (1, '28.230')] +[2023-10-09 06:16:17,438][60143] Updated weights for policy 0, policy_version 51302 (0.0007) +[2023-10-09 06:16:17,807][60143] Updated weights for policy 0, policy_version 51312 (0.0008) +[2023-10-09 06:16:18,172][60144] Updated weights for policy 1, policy_version 51912 (0.0009) +[2023-10-09 06:16:18,183][60143] Updated weights for policy 0, policy_version 51322 (0.0008) +[2023-10-09 06:16:18,530][60144] Updated weights for policy 1, policy_version 51922 (0.0007) +[2023-10-09 06:16:18,890][60144] Updated weights for policy 1, policy_version 51932 (0.0007) +[2023-10-09 06:16:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 105742336. Throughput: 0: 1680.0, 1: 1721.9. Samples: 26439244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:21,053][59242] Avg episode reward: [(0, '33.040'), (1, '28.370')] +[2023-10-09 06:16:22,213][60143] Updated weights for policy 0, policy_version 51332 (0.0008) +[2023-10-09 06:16:22,584][60143] Updated weights for policy 0, policy_version 51342 (0.0010) +[2023-10-09 06:16:22,755][60144] Updated weights for policy 1, policy_version 51942 (0.0008) +[2023-10-09 06:16:22,945][60143] Updated weights for policy 0, policy_version 51352 (0.0009) +[2023-10-09 06:16:23,115][60144] Updated weights for policy 1, policy_version 51952 (0.0008) +[2023-10-09 06:16:23,485][60144] Updated weights for policy 1, policy_version 51962 (0.0008) +[2023-10-09 06:16:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 105807872. Throughput: 0: 1704.6, 1: 1709.3. Samples: 26460134. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:26,053][59242] Avg episode reward: [(0, '31.560'), (1, '29.400')] +[2023-10-09 06:16:26,770][60143] Updated weights for policy 0, policy_version 51362 (0.0007) +[2023-10-09 06:16:27,145][60143] Updated weights for policy 0, policy_version 51372 (0.0008) +[2023-10-09 06:16:27,447][60144] Updated weights for policy 1, policy_version 51972 (0.0010) +[2023-10-09 06:16:27,512][60143] Updated weights for policy 0, policy_version 51382 (0.0008) +[2023-10-09 06:16:27,814][60144] Updated weights for policy 1, policy_version 51982 (0.0008) +[2023-10-09 06:16:27,875][60143] Updated weights for policy 0, policy_version 51392 (0.0008) +[2023-10-09 06:16:28,177][60144] Updated weights for policy 1, policy_version 51992 (0.0008) +[2023-10-09 06:16:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 105873408. Throughput: 0: 1717.7, 1: 1730.1. Samples: 26481574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:31,053][59242] Avg episode reward: [(0, '31.330'), (1, '29.420')] +[2023-10-09 06:16:31,980][60143] Updated weights for policy 0, policy_version 51402 (0.0007) +[2023-10-09 06:16:32,092][60144] Updated weights for policy 1, policy_version 52002 (0.0008) +[2023-10-09 06:16:32,342][60143] Updated weights for policy 0, policy_version 51412 (0.0008) +[2023-10-09 06:16:32,460][60144] Updated weights for policy 1, policy_version 52012 (0.0009) +[2023-10-09 06:16:32,707][60143] Updated weights for policy 0, policy_version 51422 (0.0007) +[2023-10-09 06:16:32,821][60144] Updated weights for policy 1, policy_version 52022 (0.0007) +[2023-10-09 06:16:33,192][60144] Updated weights for policy 1, policy_version 52032 (0.0007) +[2023-10-09 06:16:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 105938944. Throughput: 0: 1696.9, 1: 1706.9. Samples: 26491064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:36,053][59242] Avg episode reward: [(0, '30.390'), (1, '30.250')] +[2023-10-09 06:16:36,861][60143] Updated weights for policy 0, policy_version 51432 (0.0007) +[2023-10-09 06:16:37,112][60144] Updated weights for policy 1, policy_version 52042 (0.0008) +[2023-10-09 06:16:37,225][60143] Updated weights for policy 0, policy_version 51442 (0.0008) +[2023-10-09 06:16:37,476][60144] Updated weights for policy 1, policy_version 52052 (0.0007) +[2023-10-09 06:16:37,589][60143] Updated weights for policy 0, policy_version 51452 (0.0008) +[2023-10-09 06:16:37,842][60144] Updated weights for policy 1, policy_version 52062 (0.0010) +[2023-10-09 06:16:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106004480. Throughput: 0: 1722.0, 1: 1720.2. Samples: 26512216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:41,052][59242] Avg episode reward: [(0, '31.280'), (1, '30.950')] +[2023-10-09 06:16:41,561][60143] Updated weights for policy 0, policy_version 51462 (0.0007) +[2023-10-09 06:16:41,779][60144] Updated weights for policy 1, policy_version 52072 (0.0008) +[2023-10-09 06:16:41,921][60143] Updated weights for policy 0, policy_version 51472 (0.0009) +[2023-10-09 06:16:42,142][60144] Updated weights for policy 1, policy_version 52082 (0.0008) +[2023-10-09 06:16:42,293][60143] Updated weights for policy 0, policy_version 51482 (0.0008) +[2023-10-09 06:16:42,510][60144] Updated weights for policy 1, policy_version 52092 (0.0009) +[2023-10-09 06:16:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 106070016. Throughput: 0: 1725.1, 1: 1737.9. Samples: 26533376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:46,053][59242] Avg episode reward: [(0, '31.060'), (1, '32.710')] +[2023-10-09 06:16:46,317][60143] Updated weights for policy 0, policy_version 51492 (0.0007) +[2023-10-09 06:16:46,636][60144] Updated weights for policy 1, policy_version 52102 (0.0009) +[2023-10-09 06:16:46,688][60143] Updated weights for policy 0, policy_version 51502 (0.0007) +[2023-10-09 06:16:47,003][60144] Updated weights for policy 1, policy_version 52112 (0.0007) +[2023-10-09 06:16:47,053][60143] Updated weights for policy 0, policy_version 51512 (0.0007) +[2023-10-09 06:16:47,376][60144] Updated weights for policy 1, policy_version 52122 (0.0008) +[2023-10-09 06:16:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106135552. Throughput: 0: 1710.1, 1: 1707.4. Samples: 26542738. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:51,053][59242] Avg episode reward: [(0, '30.480'), (1, '32.970')] +[2023-10-09 06:16:51,147][60143] Updated weights for policy 0, policy_version 51522 (0.0008) +[2023-10-09 06:16:51,339][60144] Updated weights for policy 1, policy_version 52132 (0.0008) +[2023-10-09 06:16:51,525][60143] Updated weights for policy 0, policy_version 51532 (0.0009) +[2023-10-09 06:16:51,712][60144] Updated weights for policy 1, policy_version 52142 (0.0009) +[2023-10-09 06:16:51,894][60143] Updated weights for policy 0, policy_version 51542 (0.0008) +[2023-10-09 06:16:52,068][60144] Updated weights for policy 1, policy_version 52152 (0.0007) +[2023-10-09 06:16:52,265][60143] Updated weights for policy 0, policy_version 51552 (0.0007) +[2023-10-09 06:16:56,051][60144] Updated weights for policy 1, policy_version 52162 (0.0009) +[2023-10-09 06:16:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106201088. Throughput: 0: 1724.0, 1: 1734.8. Samples: 26563640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:16:56,053][59242] Avg episode reward: [(0, '30.730'), (1, '32.740')] +[2023-10-09 06:16:56,237][60143] Updated weights for policy 0, policy_version 51562 (0.0009) +[2023-10-09 06:16:56,426][60144] Updated weights for policy 1, policy_version 52172 (0.0009) +[2023-10-09 06:16:56,605][60143] Updated weights for policy 0, policy_version 51572 (0.0008) +[2023-10-09 06:16:56,790][60144] Updated weights for policy 1, policy_version 52182 (0.0007) +[2023-10-09 06:16:56,968][60143] Updated weights for policy 0, policy_version 51582 (0.0007) +[2023-10-09 06:16:57,149][60144] Updated weights for policy 1, policy_version 52192 (0.0007) +[2023-10-09 06:17:01,036][60143] Updated weights for policy 0, policy_version 51592 (0.0009) +[2023-10-09 06:17:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106266624. Throughput: 0: 1716.8, 1: 1737.2. Samples: 26584820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:01,053][59242] Avg episode reward: [(0, '30.590'), (1, '33.160')] +[2023-10-09 06:17:01,137][60144] Updated weights for policy 1, policy_version 52202 (0.0009) +[2023-10-09 06:17:01,401][60143] Updated weights for policy 0, policy_version 51602 (0.0008) +[2023-10-09 06:17:01,509][60144] Updated weights for policy 1, policy_version 52212 (0.0007) +[2023-10-09 06:17:01,769][60143] Updated weights for policy 0, policy_version 51612 (0.0009) +[2023-10-09 06:17:01,867][60144] Updated weights for policy 1, policy_version 52222 (0.0008) +[2023-10-09 06:17:05,615][60143] Updated weights for policy 0, policy_version 51622 (0.0009) +[2023-10-09 06:17:05,943][60144] Updated weights for policy 1, policy_version 52232 (0.0008) +[2023-10-09 06:17:05,977][60143] Updated weights for policy 0, policy_version 51632 (0.0008) +[2023-10-09 06:17:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106332160. 
Throughput: 0: 1716.5, 1: 1723.2. Samples: 26594030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:06,053][59242] Avg episode reward: [(0, '29.520'), (1, '34.130')] +[2023-10-09 06:17:06,318][60144] Updated weights for policy 1, policy_version 52242 (0.0007) +[2023-10-09 06:17:06,342][60143] Updated weights for policy 0, policy_version 51642 (0.0007) +[2023-10-09 06:17:06,690][60144] Updated weights for policy 1, policy_version 52252 (0.0008) +[2023-10-09 06:17:10,452][60143] Updated weights for policy 0, policy_version 51652 (0.0009) +[2023-10-09 06:17:10,651][60144] Updated weights for policy 1, policy_version 52262 (0.0009) +[2023-10-09 06:17:10,818][60143] Updated weights for policy 0, policy_version 51662 (0.0008) +[2023-10-09 06:17:11,023][60144] Updated weights for policy 1, policy_version 52272 (0.0008) +[2023-10-09 06:17:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 106397696. Throughput: 0: 1713.3, 1: 1729.8. Samples: 26615072. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:11,053][59242] Avg episode reward: [(0, '29.290'), (1, '33.330')] +[2023-10-09 06:17:11,189][60143] Updated weights for policy 0, policy_version 51672 (0.0008) +[2023-10-09 06:17:11,384][60144] Updated weights for policy 1, policy_version 52282 (0.0007) +[2023-10-09 06:17:15,310][60144] Updated weights for policy 1, policy_version 52292 (0.0008) +[2023-10-09 06:17:15,332][60143] Updated weights for policy 0, policy_version 51682 (0.0008) +[2023-10-09 06:17:15,664][60144] Updated weights for policy 1, policy_version 52302 (0.0009) +[2023-10-09 06:17:15,692][60143] Updated weights for policy 0, policy_version 51692 (0.0008) +[2023-10-09 06:17:16,035][60144] Updated weights for policy 1, policy_version 52312 (0.0007) +[2023-10-09 06:17:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 106463232. Throughput: 0: 1696.1, 1: 1718.6. Samples: 26635234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:16,052][59242] Avg episode reward: [(0, '30.130'), (1, '32.950')] +[2023-10-09 06:17:16,061][60143] Updated weights for policy 0, policy_version 51702 (0.0008) +[2023-10-09 06:17:16,434][60143] Updated weights for policy 0, policy_version 51712 (0.0008) +[2023-10-09 06:17:19,816][60144] Updated weights for policy 1, policy_version 52322 (0.0008) +[2023-10-09 06:17:20,179][60144] Updated weights for policy 1, policy_version 52332 (0.0008) +[2023-10-09 06:17:20,497][60143] Updated weights for policy 0, policy_version 51722 (0.0008) +[2023-10-09 06:17:20,544][60144] Updated weights for policy 1, policy_version 52342 (0.0007) +[2023-10-09 06:17:20,861][60143] Updated weights for policy 0, policy_version 51732 (0.0008) +[2023-10-09 06:17:20,913][60144] Updated weights for policy 1, policy_version 52352 (0.0008) +[2023-10-09 06:17:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 106561536. Throughput: 0: 1698.0, 1: 1730.3. Samples: 26645334. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:21,053][59242] Avg episode reward: [(0, '31.020'), (1, '32.610')] +[2023-10-09 06:17:21,243][60143] Updated weights for policy 0, policy_version 51742 (0.0010) +[2023-10-09 06:17:25,047][60144] Updated weights for policy 1, policy_version 52362 (0.0007) +[2023-10-09 06:17:25,354][60143] Updated weights for policy 0, policy_version 51752 (0.0008) +[2023-10-09 06:17:25,413][60144] Updated weights for policy 1, policy_version 52372 (0.0009) +[2023-10-09 06:17:25,741][60143] Updated weights for policy 0, policy_version 51762 (0.0008) +[2023-10-09 06:17:25,777][60144] Updated weights for policy 1, policy_version 52382 (0.0008) +[2023-10-09 06:17:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 106627072. Throughput: 0: 1702.2, 1: 1725.4. Samples: 26666460. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:17:26,053][59242] Avg episode reward: [(0, '31.510'), (1, '33.090')] +[2023-10-09 06:17:26,112][60143] Updated weights for policy 0, policy_version 51772 (0.0008) +[2023-10-09 06:17:29,792][60144] Updated weights for policy 1, policy_version 52392 (0.0009) +[2023-10-09 06:17:29,857][60143] Updated weights for policy 0, policy_version 51782 (0.0009) +[2023-10-09 06:17:30,162][60144] Updated weights for policy 1, policy_version 52402 (0.0008) +[2023-10-09 06:17:30,224][60143] Updated weights for policy 0, policy_version 51792 (0.0009) +[2023-10-09 06:17:30,521][60144] Updated weights for policy 1, policy_version 52412 (0.0007) +[2023-10-09 06:17:30,590][60143] Updated weights for policy 0, policy_version 51802 (0.0008) +[2023-10-09 06:17:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 106725376. Throughput: 0: 1676.0, 1: 1699.2. Samples: 26685256. Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:31,053][59242] Avg episode reward: [(0, '30.700'), (1, '34.140')] +[2023-10-09 06:17:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000051808_53051392.pth... +[2023-10-09 06:17:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000052416_53673984.pth... 
+[2023-10-09 06:17:31,095][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000050784_52002816.pth +[2023-10-09 06:17:31,098][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000050208_51412992.pth +[2023-10-09 06:17:34,439][60144] Updated weights for policy 1, policy_version 52422 (0.0009) +[2023-10-09 06:17:34,709][60143] Updated weights for policy 0, policy_version 51812 (0.0009) +[2023-10-09 06:17:34,800][60144] Updated weights for policy 1, policy_version 52432 (0.0007) +[2023-10-09 06:17:35,083][60143] Updated weights for policy 0, policy_version 51822 (0.0008) +[2023-10-09 06:17:35,169][60144] Updated weights for policy 1, policy_version 52442 (0.0008) +[2023-10-09 06:17:35,462][60143] Updated weights for policy 0, policy_version 51832 (0.0009) +[2023-10-09 06:17:36,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 106790912. Throughput: 0: 1698.4, 1: 1721.4. Samples: 26696628. Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:36,053][59242] Avg episode reward: [(0, '30.260'), (1, '33.840')] +[2023-10-09 06:17:39,226][60144] Updated weights for policy 1, policy_version 52452 (0.0008) +[2023-10-09 06:17:39,343][60143] Updated weights for policy 0, policy_version 51842 (0.0010) +[2023-10-09 06:17:39,614][60144] Updated weights for policy 1, policy_version 52462 (0.0008) +[2023-10-09 06:17:39,725][60143] Updated weights for policy 0, policy_version 51852 (0.0007) +[2023-10-09 06:17:39,987][60144] Updated weights for policy 1, policy_version 52472 (0.0008) +[2023-10-09 06:17:40,089][60143] Updated weights for policy 0, policy_version 51862 (0.0009) +[2023-10-09 06:17:40,457][60143] Updated weights for policy 0, policy_version 51872 (0.0010) +[2023-10-09 06:17:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 106856448. Throughput: 0: 1695.6, 1: 1708.2. Samples: 26716810. 
Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:41,053][59242] Avg episode reward: [(0, '29.750'), (1, '32.960')] +[2023-10-09 06:17:43,948][60144] Updated weights for policy 1, policy_version 52482 (0.0008) +[2023-10-09 06:17:44,310][60144] Updated weights for policy 1, policy_version 52492 (0.0008) +[2023-10-09 06:17:44,592][60143] Updated weights for policy 0, policy_version 51882 (0.0009) +[2023-10-09 06:17:44,678][60144] Updated weights for policy 1, policy_version 52502 (0.0009) +[2023-10-09 06:17:44,957][60143] Updated weights for policy 0, policy_version 51892 (0.0008) +[2023-10-09 06:17:45,042][60144] Updated weights for policy 1, policy_version 52512 (0.0009) +[2023-10-09 06:17:45,329][60143] Updated weights for policy 0, policy_version 51902 (0.0009) +[2023-10-09 06:17:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 106921984. Throughput: 0: 1666.1, 1: 1689.7. Samples: 26735832. Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:46,053][59242] Avg episode reward: [(0, '30.380'), (1, '33.600')] +[2023-10-09 06:17:49,112][60144] Updated weights for policy 1, policy_version 52522 (0.0009) +[2023-10-09 06:17:49,367][60143] Updated weights for policy 0, policy_version 51912 (0.0007) +[2023-10-09 06:17:49,483][60144] Updated weights for policy 1, policy_version 52532 (0.0007) +[2023-10-09 06:17:49,733][60143] Updated weights for policy 0, policy_version 51922 (0.0009) +[2023-10-09 06:17:49,847][60144] Updated weights for policy 1, policy_version 52542 (0.0008) +[2023-10-09 06:17:50,114][60143] Updated weights for policy 0, policy_version 51932 (0.0008) +[2023-10-09 06:17:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 106987520. Throughput: 0: 1696.5, 1: 1718.9. Samples: 26747724. 
Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:51,052][59242] Avg episode reward: [(0, '28.960'), (1, '32.680')] +[2023-10-09 06:17:53,715][60144] Updated weights for policy 1, policy_version 52552 (0.0008) +[2023-10-09 06:17:54,080][60144] Updated weights for policy 1, policy_version 52562 (0.0009) +[2023-10-09 06:17:54,206][60143] Updated weights for policy 0, policy_version 51942 (0.0008) +[2023-10-09 06:17:54,443][60144] Updated weights for policy 1, policy_version 52572 (0.0008) +[2023-10-09 06:17:54,574][60143] Updated weights for policy 0, policy_version 51952 (0.0008) +[2023-10-09 06:17:54,940][60143] Updated weights for policy 0, policy_version 51962 (0.0009) +[2023-10-09 06:17:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 107053056. Throughput: 0: 1682.0, 1: 1694.8. Samples: 26767028. Policy #0 lag: (min: 14.0, avg: 16.7, max: 46.0) +[2023-10-09 06:17:56,052][59242] Avg episode reward: [(0, '30.080'), (1, '32.500')] +[2023-10-09 06:17:58,436][60144] Updated weights for policy 1, policy_version 52582 (0.0011) +[2023-10-09 06:17:58,803][60144] Updated weights for policy 1, policy_version 52592 (0.0009) +[2023-10-09 06:17:59,094][60143] Updated weights for policy 0, policy_version 51972 (0.0008) +[2023-10-09 06:17:59,169][60144] Updated weights for policy 1, policy_version 52602 (0.0009) +[2023-10-09 06:17:59,475][60143] Updated weights for policy 0, policy_version 51982 (0.0010) +[2023-10-09 06:17:59,841][60143] Updated weights for policy 0, policy_version 51992 (0.0008) +[2023-10-09 06:18:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 107118592. Throughput: 0: 1676.8, 1: 1707.9. Samples: 26787550. 
Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:01,053][59242] Avg episode reward: [(0, '31.300'), (1, '33.250')] +[2023-10-09 06:18:03,051][60144] Updated weights for policy 1, policy_version 52612 (0.0008) +[2023-10-09 06:18:03,416][60144] Updated weights for policy 1, policy_version 52622 (0.0008) +[2023-10-09 06:18:03,765][60143] Updated weights for policy 0, policy_version 52002 (0.0007) +[2023-10-09 06:18:03,780][60144] Updated weights for policy 1, policy_version 52632 (0.0008) +[2023-10-09 06:18:04,141][60143] Updated weights for policy 0, policy_version 52012 (0.0008) +[2023-10-09 06:18:04,515][60143] Updated weights for policy 0, policy_version 52022 (0.0009) +[2023-10-09 06:18:04,880][60143] Updated weights for policy 0, policy_version 52032 (0.0009) +[2023-10-09 06:18:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 107184128. Throughput: 0: 1700.8, 1: 1713.2. Samples: 26798966. Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:06,052][59242] Avg episode reward: [(0, '30.980'), (1, '32.340')] +[2023-10-09 06:18:07,846][60144] Updated weights for policy 1, policy_version 52642 (0.0009) +[2023-10-09 06:18:08,204][60144] Updated weights for policy 1, policy_version 52652 (0.0008) +[2023-10-09 06:18:08,576][60144] Updated weights for policy 1, policy_version 52662 (0.0008) +[2023-10-09 06:18:08,863][60143] Updated weights for policy 0, policy_version 52042 (0.0008) +[2023-10-09 06:18:08,942][60144] Updated weights for policy 1, policy_version 52672 (0.0007) +[2023-10-09 06:18:09,243][60143] Updated weights for policy 0, policy_version 52052 (0.0009) +[2023-10-09 06:18:09,616][60143] Updated weights for policy 0, policy_version 52062 (0.0009) +[2023-10-09 06:18:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 107249664. Throughput: 0: 1677.4, 1: 1698.7. Samples: 26818382. 
Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:11,052][59242] Avg episode reward: [(0, '31.860'), (1, '31.930')] +[2023-10-09 06:18:12,891][60144] Updated weights for policy 1, policy_version 52682 (0.0007) +[2023-10-09 06:18:13,260][60144] Updated weights for policy 1, policy_version 52692 (0.0007) +[2023-10-09 06:18:13,623][60144] Updated weights for policy 1, policy_version 52702 (0.0008) +[2023-10-09 06:18:13,725][60143] Updated weights for policy 0, policy_version 52072 (0.0008) +[2023-10-09 06:18:14,095][60143] Updated weights for policy 0, policy_version 52082 (0.0009) +[2023-10-09 06:18:14,466][60143] Updated weights for policy 0, policy_version 52092 (0.0008) +[2023-10-09 06:18:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 107315200. Throughput: 0: 1695.0, 1: 1726.5. Samples: 26839222. Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:16,053][59242] Avg episode reward: [(0, '32.070'), (1, '31.700')] +[2023-10-09 06:18:17,562][60144] Updated weights for policy 1, policy_version 52712 (0.0007) +[2023-10-09 06:18:17,932][60144] Updated weights for policy 1, policy_version 52722 (0.0008) +[2023-10-09 06:18:18,264][60143] Updated weights for policy 0, policy_version 52102 (0.0007) +[2023-10-09 06:18:18,301][60144] Updated weights for policy 1, policy_version 52732 (0.0007) +[2023-10-09 06:18:18,632][60143] Updated weights for policy 0, policy_version 52112 (0.0009) +[2023-10-09 06:18:19,002][60143] Updated weights for policy 0, policy_version 52122 (0.0008) +[2023-10-09 06:18:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 107380736. Throughput: 0: 1692.0, 1: 1703.7. Samples: 26849436. 
Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:21,053][59242] Avg episode reward: [(0, '32.210'), (1, '33.550')] +[2023-10-09 06:18:22,281][60144] Updated weights for policy 1, policy_version 52742 (0.0008) +[2023-10-09 06:18:22,644][60144] Updated weights for policy 1, policy_version 52752 (0.0008) +[2023-10-09 06:18:22,986][60143] Updated weights for policy 0, policy_version 52132 (0.0008) +[2023-10-09 06:18:23,013][60144] Updated weights for policy 1, policy_version 52762 (0.0008) +[2023-10-09 06:18:23,347][60143] Updated weights for policy 0, policy_version 52142 (0.0007) +[2023-10-09 06:18:23,722][60143] Updated weights for policy 0, policy_version 52152 (0.0008) +[2023-10-09 06:18:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 107446272. Throughput: 0: 1681.3, 1: 1718.1. Samples: 26869784. Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:26,053][59242] Avg episode reward: [(0, '31.920'), (1, '32.850')] +[2023-10-09 06:18:27,070][60144] Updated weights for policy 1, policy_version 52772 (0.0008) +[2023-10-09 06:18:27,465][60144] Updated weights for policy 1, policy_version 52782 (0.0010) +[2023-10-09 06:18:27,644][60143] Updated weights for policy 0, policy_version 52162 (0.0011) +[2023-10-09 06:18:27,832][60144] Updated weights for policy 1, policy_version 52792 (0.0009) +[2023-10-09 06:18:28,014][60143] Updated weights for policy 0, policy_version 52172 (0.0007) +[2023-10-09 06:18:28,382][60143] Updated weights for policy 0, policy_version 52182 (0.0009) +[2023-10-09 06:18:28,749][60143] Updated weights for policy 0, policy_version 52192 (0.0010) +[2023-10-09 06:18:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107511808. Throughput: 0: 1712.1, 1: 1729.3. Samples: 26890694. 
Policy #0 lag: (min: 18.0, avg: 28.7, max: 50.0) +[2023-10-09 06:18:31,053][59242] Avg episode reward: [(0, '32.260'), (1, '32.190')] +[2023-10-09 06:18:31,707][60144] Updated weights for policy 1, policy_version 52802 (0.0010) +[2023-10-09 06:18:32,071][60144] Updated weights for policy 1, policy_version 52812 (0.0011) +[2023-10-09 06:18:32,439][60144] Updated weights for policy 1, policy_version 52822 (0.0008) +[2023-10-09 06:18:32,672][60143] Updated weights for policy 0, policy_version 52202 (0.0007) +[2023-10-09 06:18:32,798][60144] Updated weights for policy 1, policy_version 52832 (0.0008) +[2023-10-09 06:18:33,043][60143] Updated weights for policy 0, policy_version 52212 (0.0007) +[2023-10-09 06:18:33,414][60143] Updated weights for policy 0, policy_version 52222 (0.0008) +[2023-10-09 06:18:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107577344. Throughput: 0: 1681.9, 1: 1700.0. Samples: 26899910. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:18:36,053][59242] Avg episode reward: [(0, '32.530'), (1, '31.410')] +[2023-10-09 06:18:36,784][60144] Updated weights for policy 1, policy_version 52842 (0.0008) +[2023-10-09 06:18:37,152][60144] Updated weights for policy 1, policy_version 52852 (0.0010) +[2023-10-09 06:18:37,393][60143] Updated weights for policy 0, policy_version 52232 (0.0008) +[2023-10-09 06:18:37,523][60144] Updated weights for policy 1, policy_version 52862 (0.0007) +[2023-10-09 06:18:37,771][60143] Updated weights for policy 0, policy_version 52242 (0.0007) +[2023-10-09 06:18:38,134][60143] Updated weights for policy 0, policy_version 52252 (0.0007) +[2023-10-09 06:18:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107642880. Throughput: 0: 1698.3, 1: 1728.6. Samples: 26921240. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:18:41,053][59242] Avg episode reward: [(0, '33.790'), (1, '32.660')] +[2023-10-09 06:18:41,464][60144] Updated weights for policy 1, policy_version 52872 (0.0007) +[2023-10-09 06:18:41,815][60144] Updated weights for policy 1, policy_version 52882 (0.0008) +[2023-10-09 06:18:42,182][60144] Updated weights for policy 1, policy_version 52892 (0.0008) +[2023-10-09 06:18:42,191][60143] Updated weights for policy 0, policy_version 52262 (0.0008) +[2023-10-09 06:18:42,563][60143] Updated weights for policy 0, policy_version 52272 (0.0008) +[2023-10-09 06:18:42,926][60143] Updated weights for policy 0, policy_version 52282 (0.0008) +[2023-10-09 06:18:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107708416. Throughput: 0: 1710.1, 1: 1727.0. Samples: 26942220. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:18:46,053][59242] Avg episode reward: [(0, '34.840'), (1, '31.370')] +[2023-10-09 06:18:46,062][59934] Saving new best policy, reward=34.840! +[2023-10-09 06:18:46,148][60144] Updated weights for policy 1, policy_version 52902 (0.0011) +[2023-10-09 06:18:46,514][60144] Updated weights for policy 1, policy_version 52912 (0.0011) +[2023-10-09 06:18:46,882][60144] Updated weights for policy 1, policy_version 52922 (0.0009) +[2023-10-09 06:18:46,978][60143] Updated weights for policy 0, policy_version 52292 (0.0009) +[2023-10-09 06:18:47,345][60143] Updated weights for policy 0, policy_version 52302 (0.0008) +[2023-10-09 06:18:47,719][60143] Updated weights for policy 0, policy_version 52312 (0.0009) +[2023-10-09 06:18:50,967][60144] Updated weights for policy 1, policy_version 52932 (0.0007) +[2023-10-09 06:18:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107773952. Throughput: 0: 1680.9, 1: 1709.7. Samples: 26951544. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:18:51,052][59242] Avg episode reward: [(0, '34.690'), (1, '32.500')] +[2023-10-09 06:18:51,338][60144] Updated weights for policy 1, policy_version 52942 (0.0008) +[2023-10-09 06:18:51,697][60144] Updated weights for policy 1, policy_version 52952 (0.0009) +[2023-10-09 06:18:51,748][60143] Updated weights for policy 0, policy_version 52322 (0.0009) +[2023-10-09 06:18:52,124][60143] Updated weights for policy 0, policy_version 52332 (0.0008) +[2023-10-09 06:18:52,497][60143] Updated weights for policy 0, policy_version 52342 (0.0008) +[2023-10-09 06:18:52,864][60143] Updated weights for policy 0, policy_version 52352 (0.0009) +[2023-10-09 06:18:55,717][60144] Updated weights for policy 1, policy_version 52962 (0.0008) +[2023-10-09 06:18:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107839488. Throughput: 0: 1701.4, 1: 1724.7. Samples: 26972556. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:18:56,053][59242] Avg episode reward: [(0, '34.410'), (1, '32.820')] +[2023-10-09 06:18:56,093][60144] Updated weights for policy 1, policy_version 52972 (0.0007) +[2023-10-09 06:18:56,461][60144] Updated weights for policy 1, policy_version 52982 (0.0007) +[2023-10-09 06:18:56,833][60144] Updated weights for policy 1, policy_version 52992 (0.0008) +[2023-10-09 06:18:56,836][60143] Updated weights for policy 0, policy_version 52362 (0.0007) +[2023-10-09 06:18:57,206][60143] Updated weights for policy 0, policy_version 52372 (0.0008) +[2023-10-09 06:18:57,575][60143] Updated weights for policy 0, policy_version 52382 (0.0008) +[2023-10-09 06:19:00,808][60144] Updated weights for policy 1, policy_version 53002 (0.0007) +[2023-10-09 06:19:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 107905024. Throughput: 0: 1711.2, 1: 1717.1. Samples: 26993496. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:19:01,053][59242] Avg episode reward: [(0, '35.390'), (1, '33.020')] +[2023-10-09 06:19:01,061][59934] Saving new best policy, reward=35.390! +[2023-10-09 06:19:01,175][60144] Updated weights for policy 1, policy_version 53012 (0.0010) +[2023-10-09 06:19:01,543][60144] Updated weights for policy 1, policy_version 53022 (0.0009) +[2023-10-09 06:19:01,641][60143] Updated weights for policy 0, policy_version 52392 (0.0008) +[2023-10-09 06:19:02,012][60143] Updated weights for policy 0, policy_version 52402 (0.0009) +[2023-10-09 06:19:02,380][60143] Updated weights for policy 0, policy_version 52412 (0.0009) +[2023-10-09 06:19:05,559][60144] Updated weights for policy 1, policy_version 53032 (0.0009) +[2023-10-09 06:19:05,928][60144] Updated weights for policy 1, policy_version 53042 (0.0008) +[2023-10-09 06:19:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 107970560. Throughput: 0: 1688.9, 1: 1717.4. Samples: 27002716. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:19:06,052][59242] Avg episode reward: [(0, '34.810'), (1, '33.630')] +[2023-10-09 06:19:06,216][60143] Updated weights for policy 0, policy_version 52422 (0.0008) +[2023-10-09 06:19:06,287][60144] Updated weights for policy 1, policy_version 53052 (0.0008) +[2023-10-09 06:19:06,591][60143] Updated weights for policy 0, policy_version 52432 (0.0007) +[2023-10-09 06:19:06,970][60143] Updated weights for policy 0, policy_version 52442 (0.0009) +[2023-10-09 06:19:10,393][60144] Updated weights for policy 1, policy_version 53062 (0.0009) +[2023-10-09 06:19:10,763][60144] Updated weights for policy 1, policy_version 53072 (0.0009) +[2023-10-09 06:19:10,811][60143] Updated weights for policy 0, policy_version 52452 (0.0010) +[2023-10-09 06:19:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 108036096. Throughput: 0: 1710.5, 1: 1712.9. 
Samples: 27023834. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:11,053][59242] Avg episode reward: [(0, '34.610'), (1, '33.660')] +[2023-10-09 06:19:11,131][60144] Updated weights for policy 1, policy_version 53082 (0.0008) +[2023-10-09 06:19:11,183][60143] Updated weights for policy 0, policy_version 52462 (0.0009) +[2023-10-09 06:19:11,547][60143] Updated weights for policy 0, policy_version 52472 (0.0008) +[2023-10-09 06:19:15,093][60144] Updated weights for policy 1, policy_version 53092 (0.0008) +[2023-10-09 06:19:15,432][60143] Updated weights for policy 0, policy_version 52482 (0.0009) +[2023-10-09 06:19:15,487][60144] Updated weights for policy 1, policy_version 53102 (0.0010) +[2023-10-09 06:19:15,789][60143] Updated weights for policy 0, policy_version 52492 (0.0008) +[2023-10-09 06:19:15,861][60144] Updated weights for policy 1, policy_version 53112 (0.0008) +[2023-10-09 06:19:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 108101632. Throughput: 0: 1713.9, 1: 1700.0. Samples: 27044322. 
Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:16,053][59242] Avg episode reward: [(0, '34.510'), (1, '33.300')] +[2023-10-09 06:19:16,153][60143] Updated weights for policy 0, policy_version 52502 (0.0007) +[2023-10-09 06:19:16,528][60143] Updated weights for policy 0, policy_version 52512 (0.0009) +[2023-10-09 06:19:19,843][60144] Updated weights for policy 1, policy_version 53122 (0.0009) +[2023-10-09 06:19:20,210][60144] Updated weights for policy 1, policy_version 53132 (0.0008) +[2023-10-09 06:19:20,546][60143] Updated weights for policy 0, policy_version 52522 (0.0008) +[2023-10-09 06:19:20,579][60144] Updated weights for policy 1, policy_version 53142 (0.0008) +[2023-10-09 06:19:20,910][60143] Updated weights for policy 0, policy_version 52532 (0.0007) +[2023-10-09 06:19:20,943][60144] Updated weights for policy 1, policy_version 53152 (0.0008) +[2023-10-09 06:19:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 108199936. Throughput: 0: 1715.8, 1: 1718.5. Samples: 27054454. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:21,052][59242] Avg episode reward: [(0, '34.570'), (1, '32.860')] +[2023-10-09 06:19:21,278][60143] Updated weights for policy 0, policy_version 52542 (0.0010) +[2023-10-09 06:19:24,777][60144] Updated weights for policy 1, policy_version 53162 (0.0007) +[2023-10-09 06:19:25,136][60144] Updated weights for policy 1, policy_version 53172 (0.0008) +[2023-10-09 06:19:25,306][60143] Updated weights for policy 0, policy_version 52552 (0.0010) +[2023-10-09 06:19:25,510][60144] Updated weights for policy 1, policy_version 53182 (0.0008) +[2023-10-09 06:19:25,673][60143] Updated weights for policy 0, policy_version 52562 (0.0009) +[2023-10-09 06:19:26,044][60143] Updated weights for policy 0, policy_version 52572 (0.0009) +[2023-10-09 06:19:26,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 108265472. 
Throughput: 0: 1717.6, 1: 1713.7. Samples: 27075650. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:26,052][59242] Avg episode reward: [(0, '33.280'), (1, '33.400')] +[2023-10-09 06:19:29,393][60144] Updated weights for policy 1, policy_version 53192 (0.0010) +[2023-10-09 06:19:29,759][60144] Updated weights for policy 1, policy_version 53202 (0.0009) +[2023-10-09 06:19:29,972][60143] Updated weights for policy 0, policy_version 52582 (0.0008) +[2023-10-09 06:19:30,124][60144] Updated weights for policy 1, policy_version 53212 (0.0008) +[2023-10-09 06:19:30,341][60143] Updated weights for policy 0, policy_version 52592 (0.0010) +[2023-10-09 06:19:30,714][60143] Updated weights for policy 0, policy_version 52602 (0.0008) +[2023-10-09 06:19:31,052][59242] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 108363776. Throughput: 0: 1707.9, 1: 1688.4. Samples: 27095056. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:31,053][59242] Avg episode reward: [(0, '32.440'), (1, '32.850')] +[2023-10-09 06:19:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000052608_53870592.pth... +[2023-10-09 06:19:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000053216_54493184.pth... 
+[2023-10-09 06:19:31,093][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000051008_52232192.pth +[2023-10-09 06:19:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000051616_52854784.pth +[2023-10-09 06:19:33,932][60144] Updated weights for policy 1, policy_version 53222 (0.0010) +[2023-10-09 06:19:34,304][60144] Updated weights for policy 1, policy_version 53232 (0.0008) +[2023-10-09 06:19:34,563][60143] Updated weights for policy 0, policy_version 52612 (0.0008) +[2023-10-09 06:19:34,665][60144] Updated weights for policy 1, policy_version 53242 (0.0008) +[2023-10-09 06:19:34,925][60143] Updated weights for policy 0, policy_version 52622 (0.0007) +[2023-10-09 06:19:35,296][60143] Updated weights for policy 0, policy_version 52632 (0.0007) +[2023-10-09 06:19:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 108429312. Throughput: 0: 1728.4, 1: 1719.0. Samples: 27106678. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:36,052][59242] Avg episode reward: [(0, '32.480'), (1, '33.240')] +[2023-10-09 06:19:38,722][60144] Updated weights for policy 1, policy_version 53252 (0.0008) +[2023-10-09 06:19:39,087][60144] Updated weights for policy 1, policy_version 53262 (0.0009) +[2023-10-09 06:19:39,265][60143] Updated weights for policy 0, policy_version 52642 (0.0009) +[2023-10-09 06:19:39,445][60144] Updated weights for policy 1, policy_version 53272 (0.0008) +[2023-10-09 06:19:39,644][60143] Updated weights for policy 0, policy_version 52652 (0.0007) +[2023-10-09 06:19:40,018][60143] Updated weights for policy 0, policy_version 52662 (0.0007) +[2023-10-09 06:19:40,396][60143] Updated weights for policy 0, policy_version 52672 (0.0008) +[2023-10-09 06:19:41,052][59242] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 108494848. Throughput: 0: 1726.3, 1: 1699.6. Samples: 27126720. 
Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) +[2023-10-09 06:19:41,052][59242] Avg episode reward: [(0, '32.950'), (1, '32.200')] +[2023-10-09 06:19:43,457][60144] Updated weights for policy 1, policy_version 53282 (0.0009) +[2023-10-09 06:19:43,818][60144] Updated weights for policy 1, policy_version 53292 (0.0010) +[2023-10-09 06:19:44,183][60144] Updated weights for policy 1, policy_version 53302 (0.0009) +[2023-10-09 06:19:44,401][60143] Updated weights for policy 0, policy_version 52682 (0.0010) +[2023-10-09 06:19:44,554][60144] Updated weights for policy 1, policy_version 53312 (0.0010) +[2023-10-09 06:19:44,769][60143] Updated weights for policy 0, policy_version 52692 (0.0008) +[2023-10-09 06:19:45,139][60143] Updated weights for policy 0, policy_version 52702 (0.0007) +[2023-10-09 06:19:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 108560384. Throughput: 0: 1705.1, 1: 1702.3. Samples: 27146828. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:19:46,053][59242] Avg episode reward: [(0, '32.970'), (1, '31.430')] +[2023-10-09 06:19:48,313][60144] Updated weights for policy 1, policy_version 53322 (0.0007) +[2023-10-09 06:19:48,681][60144] Updated weights for policy 1, policy_version 53332 (0.0007) +[2023-10-09 06:19:49,053][60144] Updated weights for policy 1, policy_version 53342 (0.0008) +[2023-10-09 06:19:49,167][60143] Updated weights for policy 0, policy_version 52712 (0.0009) +[2023-10-09 06:19:49,539][60143] Updated weights for policy 0, policy_version 52722 (0.0008) +[2023-10-09 06:19:49,904][60143] Updated weights for policy 0, policy_version 52732 (0.0008) +[2023-10-09 06:19:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 108625920. Throughput: 0: 1739.3, 1: 1720.5. Samples: 27158408. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:19:51,052][59242] Avg episode reward: [(0, '32.430'), (1, '31.190')] +[2023-10-09 06:19:52,873][60144] Updated weights for policy 1, policy_version 53352 (0.0010) +[2023-10-09 06:19:53,229][60144] Updated weights for policy 1, policy_version 53362 (0.0009) +[2023-10-09 06:19:53,609][60144] Updated weights for policy 1, policy_version 53372 (0.0007) +[2023-10-09 06:19:53,762][60143] Updated weights for policy 0, policy_version 52742 (0.0007) +[2023-10-09 06:19:54,139][60143] Updated weights for policy 0, policy_version 52752 (0.0011) +[2023-10-09 06:19:54,515][60143] Updated weights for policy 0, policy_version 52762 (0.0008) +[2023-10-09 06:19:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 108691456. Throughput: 0: 1709.9, 1: 1719.9. Samples: 27178178. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:19:56,053][59242] Avg episode reward: [(0, '33.280'), (1, '32.060')] +[2023-10-09 06:19:57,478][60144] Updated weights for policy 1, policy_version 53382 (0.0008) +[2023-10-09 06:19:57,835][60144] Updated weights for policy 1, policy_version 53392 (0.0009) +[2023-10-09 06:19:58,196][60144] Updated weights for policy 1, policy_version 53402 (0.0009) +[2023-10-09 06:19:58,330][60143] Updated weights for policy 0, policy_version 52772 (0.0007) +[2023-10-09 06:19:58,699][60143] Updated weights for policy 0, policy_version 52782 (0.0007) +[2023-10-09 06:19:59,067][60143] Updated weights for policy 0, policy_version 52792 (0.0011) +[2023-10-09 06:20:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 108756992. Throughput: 0: 1702.2, 1: 1738.2. Samples: 27199138. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:20:01,053][59242] Avg episode reward: [(0, '32.260'), (1, '30.810')] +[2023-10-09 06:20:02,239][60144] Updated weights for policy 1, policy_version 53412 (0.0008) +[2023-10-09 06:20:02,628][60144] Updated weights for policy 1, policy_version 53422 (0.0007) +[2023-10-09 06:20:02,993][60144] Updated weights for policy 1, policy_version 53432 (0.0008) +[2023-10-09 06:20:03,211][60143] Updated weights for policy 0, policy_version 52802 (0.0008) +[2023-10-09 06:20:03,575][60143] Updated weights for policy 0, policy_version 52812 (0.0008) +[2023-10-09 06:20:03,949][60143] Updated weights for policy 0, policy_version 52822 (0.0007) +[2023-10-09 06:20:04,325][60143] Updated weights for policy 0, policy_version 52832 (0.0007) +[2023-10-09 06:20:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 108822528. Throughput: 0: 1719.9, 1: 1718.3. Samples: 27209172. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:20:06,053][59242] Avg episode reward: [(0, '29.980'), (1, '31.110')] +[2023-10-09 06:20:06,998][60144] Updated weights for policy 1, policy_version 53442 (0.0008) +[2023-10-09 06:20:07,369][60144] Updated weights for policy 1, policy_version 53452 (0.0007) +[2023-10-09 06:20:07,735][60144] Updated weights for policy 1, policy_version 53462 (0.0008) +[2023-10-09 06:20:08,104][60144] Updated weights for policy 1, policy_version 53472 (0.0009) +[2023-10-09 06:20:08,325][60143] Updated weights for policy 0, policy_version 52842 (0.0008) +[2023-10-09 06:20:08,693][60143] Updated weights for policy 0, policy_version 52852 (0.0007) +[2023-10-09 06:20:09,058][60143] Updated weights for policy 0, policy_version 52862 (0.0008) +[2023-10-09 06:20:11,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 108888064. Throughput: 0: 1697.4, 1: 1724.4. Samples: 27229634. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:20:11,053][59242] Avg episode reward: [(0, '30.140'), (1, '30.090')] +[2023-10-09 06:20:11,972][60144] Updated weights for policy 1, policy_version 53482 (0.0008) +[2023-10-09 06:20:12,347][60144] Updated weights for policy 1, policy_version 53492 (0.0007) +[2023-10-09 06:20:12,714][60144] Updated weights for policy 1, policy_version 53502 (0.0007) +[2023-10-09 06:20:13,177][60143] Updated weights for policy 0, policy_version 52872 (0.0010) +[2023-10-09 06:20:13,550][60143] Updated weights for policy 0, policy_version 52882 (0.0008) +[2023-10-09 06:20:13,909][60143] Updated weights for policy 0, policy_version 52892 (0.0009) +[2023-10-09 06:20:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 108953600. Throughput: 0: 1711.9, 1: 1746.8. Samples: 27250694. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:20:16,053][59242] Avg episode reward: [(0, '29.760'), (1, '30.220')] +[2023-10-09 06:20:16,711][60144] Updated weights for policy 1, policy_version 53512 (0.0007) +[2023-10-09 06:20:17,078][60144] Updated weights for policy 1, policy_version 53522 (0.0007) +[2023-10-09 06:20:17,452][60144] Updated weights for policy 1, policy_version 53532 (0.0007) +[2023-10-09 06:20:17,949][60143] Updated weights for policy 0, policy_version 52902 (0.0010) +[2023-10-09 06:20:18,313][60143] Updated weights for policy 0, policy_version 52912 (0.0009) +[2023-10-09 06:20:18,674][60143] Updated weights for policy 0, policy_version 52922 (0.0010) +[2023-10-09 06:20:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 109019136. Throughput: 0: 1701.2, 1: 1717.3. Samples: 27260512. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-09 06:20:21,053][59242] Avg episode reward: [(0, '28.160'), (1, '30.170')] +[2023-10-09 06:20:21,422][60144] Updated weights for policy 1, policy_version 53542 (0.0009) +[2023-10-09 06:20:21,793][60144] Updated weights for policy 1, policy_version 53552 (0.0008) +[2023-10-09 06:20:22,164][60144] Updated weights for policy 1, policy_version 53562 (0.0010) +[2023-10-09 06:20:22,611][60143] Updated weights for policy 0, policy_version 52932 (0.0008) +[2023-10-09 06:20:22,980][60143] Updated weights for policy 0, policy_version 52942 (0.0009) +[2023-10-09 06:20:23,360][60143] Updated weights for policy 0, policy_version 52952 (0.0008) +[2023-10-09 06:20:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 109084672. Throughput: 0: 1697.0, 1: 1733.4. Samples: 27281088. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:26,053][59242] Avg episode reward: [(0, '28.290'), (1, '29.750')] +[2023-10-09 06:20:26,191][60144] Updated weights for policy 1, policy_version 53572 (0.0010) +[2023-10-09 06:20:26,554][60144] Updated weights for policy 1, policy_version 53582 (0.0008) +[2023-10-09 06:20:26,927][60144] Updated weights for policy 1, policy_version 53592 (0.0008) +[2023-10-09 06:20:27,574][60143] Updated weights for policy 0, policy_version 52962 (0.0007) +[2023-10-09 06:20:27,946][60143] Updated weights for policy 0, policy_version 52972 (0.0009) +[2023-10-09 06:20:28,320][60143] Updated weights for policy 0, policy_version 52982 (0.0010) +[2023-10-09 06:20:28,692][60143] Updated weights for policy 0, policy_version 52992 (0.0009) +[2023-10-09 06:20:30,888][60144] Updated weights for policy 1, policy_version 53602 (0.0008) +[2023-10-09 06:20:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 109150208. Throughput: 0: 1718.1, 1: 1737.9. Samples: 27302352. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:31,053][59242] Avg episode reward: [(0, '27.670'), (1, '29.840')] +[2023-10-09 06:20:31,258][60144] Updated weights for policy 1, policy_version 53612 (0.0009) +[2023-10-09 06:20:31,633][60144] Updated weights for policy 1, policy_version 53622 (0.0010) +[2023-10-09 06:20:32,004][60144] Updated weights for policy 1, policy_version 53632 (0.0009) +[2023-10-09 06:20:32,500][60143] Updated weights for policy 0, policy_version 53002 (0.0007) +[2023-10-09 06:20:32,874][60143] Updated weights for policy 0, policy_version 53012 (0.0007) +[2023-10-09 06:20:33,241][60143] Updated weights for policy 0, policy_version 53022 (0.0010) +[2023-10-09 06:20:35,817][60144] Updated weights for policy 1, policy_version 53642 (0.0009) +[2023-10-09 06:20:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 109215744. Throughput: 0: 1686.7, 1: 1719.8. Samples: 27311702. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:36,053][59242] Avg episode reward: [(0, '27.260'), (1, '29.690')] +[2023-10-09 06:20:36,183][60144] Updated weights for policy 1, policy_version 53652 (0.0009) +[2023-10-09 06:20:36,559][60144] Updated weights for policy 1, policy_version 53662 (0.0007) +[2023-10-09 06:20:37,182][60143] Updated weights for policy 0, policy_version 53032 (0.0008) +[2023-10-09 06:20:37,548][60143] Updated weights for policy 0, policy_version 53042 (0.0008) +[2023-10-09 06:20:37,920][60143] Updated weights for policy 0, policy_version 53052 (0.0010) +[2023-10-09 06:20:40,622][60144] Updated weights for policy 1, policy_version 53672 (0.0011) +[2023-10-09 06:20:40,988][60144] Updated weights for policy 1, policy_version 53682 (0.0008) +[2023-10-09 06:20:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 109281280. Throughput: 0: 1711.9, 1: 1726.0. Samples: 27332880. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:41,053][59242] Avg episode reward: [(0, '28.460'), (1, '31.370')] +[2023-10-09 06:20:41,352][60144] Updated weights for policy 1, policy_version 53692 (0.0008) +[2023-10-09 06:20:41,906][60143] Updated weights for policy 0, policy_version 53062 (0.0011) +[2023-10-09 06:20:42,279][60143] Updated weights for policy 0, policy_version 53072 (0.0009) +[2023-10-09 06:20:42,642][60143] Updated weights for policy 0, policy_version 53082 (0.0009) +[2023-10-09 06:20:45,374][60144] Updated weights for policy 1, policy_version 53702 (0.0008) +[2023-10-09 06:20:45,737][60144] Updated weights for policy 1, policy_version 53712 (0.0007) +[2023-10-09 06:20:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 109346816. Throughput: 0: 1713.8, 1: 1719.1. Samples: 27353616. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:46,053][59242] Avg episode reward: [(0, '28.880'), (1, '31.360')] +[2023-10-09 06:20:46,103][60144] Updated weights for policy 1, policy_version 53722 (0.0008) +[2023-10-09 06:20:46,664][60143] Updated weights for policy 0, policy_version 53092 (0.0010) +[2023-10-09 06:20:47,040][60143] Updated weights for policy 0, policy_version 53102 (0.0010) +[2023-10-09 06:20:47,416][60143] Updated weights for policy 0, policy_version 53112 (0.0010) +[2023-10-09 06:20:49,990][60144] Updated weights for policy 1, policy_version 53732 (0.0009) +[2023-10-09 06:20:50,393][60144] Updated weights for policy 1, policy_version 53742 (0.0010) +[2023-10-09 06:20:50,766][60144] Updated weights for policy 1, policy_version 53752 (0.0009) +[2023-10-09 06:20:51,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 109445120. Throughput: 0: 1694.6, 1: 1735.7. Samples: 27363538. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:51,053][59242] Avg episode reward: [(0, '29.180'), (1, '31.750')] +[2023-10-09 06:20:51,486][60143] Updated weights for policy 0, policy_version 53122 (0.0009) +[2023-10-09 06:20:51,862][60143] Updated weights for policy 0, policy_version 53132 (0.0010) +[2023-10-09 06:20:52,237][60143] Updated weights for policy 0, policy_version 53142 (0.0008) +[2023-10-09 06:20:52,603][60143] Updated weights for policy 0, policy_version 53152 (0.0007) +[2023-10-09 06:20:54,570][60144] Updated weights for policy 1, policy_version 53762 (0.0007) +[2023-10-09 06:20:54,935][60144] Updated weights for policy 1, policy_version 53772 (0.0007) +[2023-10-09 06:20:55,314][60144] Updated weights for policy 1, policy_version 53782 (0.0007) +[2023-10-09 06:20:55,676][60144] Updated weights for policy 1, policy_version 53792 (0.0009) +[2023-10-09 06:20:56,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 109510656. Throughput: 0: 1713.6, 1: 1730.7. Samples: 27384628. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:20:56,053][59242] Avg episode reward: [(0, '28.410'), (1, '30.750')] +[2023-10-09 06:20:56,442][60143] Updated weights for policy 0, policy_version 53162 (0.0010) +[2023-10-09 06:20:56,813][60143] Updated weights for policy 0, policy_version 53172 (0.0007) +[2023-10-09 06:20:57,184][60143] Updated weights for policy 0, policy_version 53182 (0.0008) +[2023-10-09 06:20:59,708][60144] Updated weights for policy 1, policy_version 53802 (0.0007) +[2023-10-09 06:21:00,077][60144] Updated weights for policy 1, policy_version 53812 (0.0007) +[2023-10-09 06:21:00,436][60144] Updated weights for policy 1, policy_version 53822 (0.0008) +[2023-10-09 06:21:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 109576192. Throughput: 0: 1721.1, 1: 1705.9. Samples: 27404908. 
Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) +[2023-10-09 06:21:01,053][59242] Avg episode reward: [(0, '27.170'), (1, '32.270')] +[2023-10-09 06:21:01,206][60143] Updated weights for policy 0, policy_version 53192 (0.0009) +[2023-10-09 06:21:01,576][60143] Updated weights for policy 0, policy_version 53202 (0.0010) +[2023-10-09 06:21:01,945][60143] Updated weights for policy 0, policy_version 53212 (0.0010) +[2023-10-09 06:21:04,336][60144] Updated weights for policy 1, policy_version 53832 (0.0009) +[2023-10-09 06:21:04,709][60144] Updated weights for policy 1, policy_version 53842 (0.0009) +[2023-10-09 06:21:05,069][60144] Updated weights for policy 1, policy_version 53852 (0.0010) +[2023-10-09 06:21:05,997][60143] Updated weights for policy 0, policy_version 53222 (0.0008) +[2023-10-09 06:21:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 109641728. Throughput: 0: 1709.6, 1: 1736.0. Samples: 27415566. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:06,053][59242] Avg episode reward: [(0, '27.530'), (1, '31.250')] +[2023-10-09 06:21:06,361][60143] Updated weights for policy 0, policy_version 53232 (0.0009) +[2023-10-09 06:21:06,736][60143] Updated weights for policy 0, policy_version 53242 (0.0007) +[2023-10-09 06:21:09,008][60144] Updated weights for policy 1, policy_version 53862 (0.0010) +[2023-10-09 06:21:09,379][60144] Updated weights for policy 1, policy_version 53872 (0.0010) +[2023-10-09 06:21:09,739][60144] Updated weights for policy 1, policy_version 53882 (0.0008) +[2023-10-09 06:21:10,698][60143] Updated weights for policy 0, policy_version 53252 (0.0008) +[2023-10-09 06:21:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 109707264. Throughput: 0: 1720.6, 1: 1721.9. Samples: 27436000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:11,053][59242] Avg episode reward: [(0, '27.270'), (1, '32.690')] +[2023-10-09 06:21:11,074][60143] Updated weights for policy 0, policy_version 53262 (0.0007) +[2023-10-09 06:21:11,447][60143] Updated weights for policy 0, policy_version 53272 (0.0008) +[2023-10-09 06:21:13,705][60144] Updated weights for policy 1, policy_version 53892 (0.0008) +[2023-10-09 06:21:14,066][60144] Updated weights for policy 1, policy_version 53902 (0.0010) +[2023-10-09 06:21:14,430][60144] Updated weights for policy 1, policy_version 53912 (0.0009) +[2023-10-09 06:21:15,150][60143] Updated weights for policy 0, policy_version 53282 (0.0008) +[2023-10-09 06:21:15,519][60143] Updated weights for policy 0, policy_version 53292 (0.0010) +[2023-10-09 06:21:15,897][60143] Updated weights for policy 0, policy_version 53302 (0.0011) +[2023-10-09 06:21:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 109772800. Throughput: 0: 1714.7, 1: 1711.7. Samples: 27456538. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:16,053][59242] Avg episode reward: [(0, '27.230'), (1, '31.200')] +[2023-10-09 06:21:16,263][60143] Updated weights for policy 0, policy_version 53312 (0.0009) +[2023-10-09 06:21:18,268][60144] Updated weights for policy 1, policy_version 53922 (0.0010) +[2023-10-09 06:21:18,642][60144] Updated weights for policy 1, policy_version 53932 (0.0010) +[2023-10-09 06:21:19,017][60144] Updated weights for policy 1, policy_version 53942 (0.0008) +[2023-10-09 06:21:19,380][60144] Updated weights for policy 1, policy_version 53952 (0.0008) +[2023-10-09 06:21:20,202][60143] Updated weights for policy 0, policy_version 53322 (0.0009) +[2023-10-09 06:21:20,574][60143] Updated weights for policy 0, policy_version 53332 (0.0010) +[2023-10-09 06:21:20,940][60143] Updated weights for policy 0, policy_version 53342 (0.0009) +[2023-10-09 06:21:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 109871104. Throughput: 0: 1723.9, 1: 1735.6. Samples: 27467376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:21,052][59242] Avg episode reward: [(0, '27.030'), (1, '30.620')] +[2023-10-09 06:21:23,380][60144] Updated weights for policy 1, policy_version 53962 (0.0008) +[2023-10-09 06:21:23,749][60144] Updated weights for policy 1, policy_version 53972 (0.0008) +[2023-10-09 06:21:24,124][60144] Updated weights for policy 1, policy_version 53982 (0.0009) +[2023-10-09 06:21:25,034][60143] Updated weights for policy 0, policy_version 53352 (0.0008) +[2023-10-09 06:21:25,407][60143] Updated weights for policy 0, policy_version 53362 (0.0010) +[2023-10-09 06:21:25,776][60143] Updated weights for policy 0, policy_version 53372 (0.0008) +[2023-10-09 06:21:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 109936640. Throughput: 0: 1730.2, 1: 1719.0. Samples: 27488094. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:26,053][59242] Avg episode reward: [(0, '27.480'), (1, '30.510')] +[2023-10-09 06:21:27,929][60144] Updated weights for policy 1, policy_version 53992 (0.0007) +[2023-10-09 06:21:28,304][60144] Updated weights for policy 1, policy_version 54002 (0.0007) +[2023-10-09 06:21:28,658][60144] Updated weights for policy 1, policy_version 54012 (0.0007) +[2023-10-09 06:21:29,713][60143] Updated weights for policy 0, policy_version 53382 (0.0008) +[2023-10-09 06:21:30,098][60143] Updated weights for policy 0, policy_version 53392 (0.0007) +[2023-10-09 06:21:30,460][60143] Updated weights for policy 0, policy_version 53402 (0.0008) +[2023-10-09 06:21:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 110002176. Throughput: 0: 1713.3, 1: 1730.6. Samples: 27508592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:31,053][59242] Avg episode reward: [(0, '26.740'), (1, '31.280')] +[2023-10-09 06:21:31,066][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000054016_55312384.pth... +[2023-10-09 06:21:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000053408_54689792.pth... 
+[2023-10-09 06:21:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000051808_53051392.pth +[2023-10-09 06:21:31,106][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000052416_53673984.pth +[2023-10-09 06:21:32,604][60144] Updated weights for policy 1, policy_version 54022 (0.0007) +[2023-10-09 06:21:32,973][60144] Updated weights for policy 1, policy_version 54032 (0.0008) +[2023-10-09 06:21:33,352][60144] Updated weights for policy 1, policy_version 54042 (0.0009) +[2023-10-09 06:21:34,353][60143] Updated weights for policy 0, policy_version 53412 (0.0007) +[2023-10-09 06:21:34,724][60143] Updated weights for policy 0, policy_version 53422 (0.0007) +[2023-10-09 06:21:35,089][60143] Updated weights for policy 0, policy_version 53432 (0.0007) +[2023-10-09 06:21:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 110067712. Throughput: 0: 1739.1, 1: 1717.9. Samples: 27519104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:21:36,053][59242] Avg episode reward: [(0, '28.920'), (1, '31.540')] +[2023-10-09 06:21:37,314][60144] Updated weights for policy 1, policy_version 54052 (0.0008) +[2023-10-09 06:21:37,688][60144] Updated weights for policy 1, policy_version 54062 (0.0008) +[2023-10-09 06:21:38,044][60144] Updated weights for policy 1, policy_version 54072 (0.0008) +[2023-10-09 06:21:39,089][60143] Updated weights for policy 0, policy_version 53442 (0.0009) +[2023-10-09 06:21:39,456][60143] Updated weights for policy 0, policy_version 53452 (0.0008) +[2023-10-09 06:21:39,818][60143] Updated weights for policy 0, policy_version 53462 (0.0008) +[2023-10-09 06:21:40,184][60143] Updated weights for policy 0, policy_version 53472 (0.0009) +[2023-10-09 06:21:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 110133248. Throughput: 0: 1728.5, 1: 1719.3. Samples: 27539780. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:21:41,053][59242] Avg episode reward: [(0, '29.030'), (1, '32.910')] +[2023-10-09 06:21:41,877][60144] Updated weights for policy 1, policy_version 54082 (0.0008) +[2023-10-09 06:21:42,282][60144] Updated weights for policy 1, policy_version 54092 (0.0007) +[2023-10-09 06:21:42,644][60144] Updated weights for policy 1, policy_version 54102 (0.0007) +[2023-10-09 06:21:43,012][60144] Updated weights for policy 1, policy_version 54112 (0.0007) +[2023-10-09 06:21:44,272][60143] Updated weights for policy 0, policy_version 53482 (0.0009) +[2023-10-09 06:21:44,644][60143] Updated weights for policy 0, policy_version 53492 (0.0008) +[2023-10-09 06:21:45,011][60143] Updated weights for policy 0, policy_version 53502 (0.0008) +[2023-10-09 06:21:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 110198784. Throughput: 0: 1703.7, 1: 1747.5. Samples: 27560212. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:21:46,052][59242] Avg episode reward: [(0, '29.940'), (1, '32.590')] +[2023-10-09 06:21:46,847][60144] Updated weights for policy 1, policy_version 54122 (0.0009) +[2023-10-09 06:21:47,220][60144] Updated weights for policy 1, policy_version 54132 (0.0008) +[2023-10-09 06:21:47,586][60144] Updated weights for policy 1, policy_version 54142 (0.0009) +[2023-10-09 06:21:49,083][60143] Updated weights for policy 0, policy_version 53512 (0.0007) +[2023-10-09 06:21:49,463][60143] Updated weights for policy 0, policy_version 53522 (0.0011) +[2023-10-09 06:21:49,840][60143] Updated weights for policy 0, policy_version 53532 (0.0011) +[2023-10-09 06:21:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 110264320. Throughput: 0: 1732.8, 1: 1713.6. Samples: 27570652. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:21:51,052][59242] Avg episode reward: [(0, '29.770'), (1, '32.500')] +[2023-10-09 06:21:51,755][60144] Updated weights for policy 1, policy_version 54152 (0.0007) +[2023-10-09 06:21:52,117][60144] Updated weights for policy 1, policy_version 54162 (0.0009) +[2023-10-09 06:21:52,477][60144] Updated weights for policy 1, policy_version 54172 (0.0007) +[2023-10-09 06:21:53,537][60143] Updated weights for policy 0, policy_version 53542 (0.0008) +[2023-10-09 06:21:53,906][60143] Updated weights for policy 0, policy_version 53552 (0.0008) +[2023-10-09 06:21:54,276][60143] Updated weights for policy 0, policy_version 53562 (0.0011) +[2023-10-09 06:21:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 110329856. Throughput: 0: 1706.4, 1: 1734.6. Samples: 27590846. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:21:56,053][59242] Avg episode reward: [(0, '30.530'), (1, '32.830')] +[2023-10-09 06:21:56,373][60144] Updated weights for policy 1, policy_version 54182 (0.0008) +[2023-10-09 06:21:56,746][60144] Updated weights for policy 1, policy_version 54192 (0.0007) +[2023-10-09 06:21:57,118][60144] Updated weights for policy 1, policy_version 54202 (0.0007) +[2023-10-09 06:21:58,172][60143] Updated weights for policy 0, policy_version 53572 (0.0011) +[2023-10-09 06:21:58,529][60143] Updated weights for policy 0, policy_version 53582 (0.0010) +[2023-10-09 06:21:58,904][60143] Updated weights for policy 0, policy_version 53592 (0.0010) +[2023-10-09 06:22:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 110395392. Throughput: 0: 1709.9, 1: 1747.9. Samples: 27612136. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:22:01,052][59242] Avg episode reward: [(0, '30.160'), (1, '32.680')] +[2023-10-09 06:22:01,069][60144] Updated weights for policy 1, policy_version 54212 (0.0007) +[2023-10-09 06:22:01,433][60144] Updated weights for policy 1, policy_version 54222 (0.0008) +[2023-10-09 06:22:01,797][60144] Updated weights for policy 1, policy_version 54232 (0.0008) +[2023-10-09 06:22:02,848][60143] Updated weights for policy 0, policy_version 53602 (0.0011) +[2023-10-09 06:22:03,211][60143] Updated weights for policy 0, policy_version 53612 (0.0011) +[2023-10-09 06:22:03,588][60143] Updated weights for policy 0, policy_version 53622 (0.0011) +[2023-10-09 06:22:03,952][60143] Updated weights for policy 0, policy_version 53632 (0.0008) +[2023-10-09 06:22:05,776][60144] Updated weights for policy 1, policy_version 54242 (0.0007) +[2023-10-09 06:22:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 110460928. Throughput: 0: 1716.5, 1: 1723.9. Samples: 27622194. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:22:06,053][59242] Avg episode reward: [(0, '30.750'), (1, '33.640')] +[2023-10-09 06:22:06,155][60144] Updated weights for policy 1, policy_version 54252 (0.0008) +[2023-10-09 06:22:06,519][60144] Updated weights for policy 1, policy_version 54262 (0.0008) +[2023-10-09 06:22:06,887][60144] Updated weights for policy 1, policy_version 54272 (0.0010) +[2023-10-09 06:22:07,937][60143] Updated weights for policy 0, policy_version 53642 (0.0008) +[2023-10-09 06:22:08,300][60143] Updated weights for policy 0, policy_version 53652 (0.0010) +[2023-10-09 06:22:08,669][60143] Updated weights for policy 0, policy_version 53662 (0.0008) +[2023-10-09 06:22:10,778][60144] Updated weights for policy 1, policy_version 54282 (0.0008) +[2023-10-09 06:22:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 110526464. 
Throughput: 0: 1699.2, 1: 1736.4. Samples: 27642696. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:22:11,053][59242] Avg episode reward: [(0, '30.570'), (1, '31.840')] +[2023-10-09 06:22:11,155][60144] Updated weights for policy 1, policy_version 54292 (0.0010) +[2023-10-09 06:22:11,523][60144] Updated weights for policy 1, policy_version 54302 (0.0010) +[2023-10-09 06:22:12,675][60143] Updated weights for policy 0, policy_version 53672 (0.0010) +[2023-10-09 06:22:13,044][60143] Updated weights for policy 0, policy_version 53682 (0.0008) +[2023-10-09 06:22:13,407][60143] Updated weights for policy 0, policy_version 53692 (0.0007) +[2023-10-09 06:22:15,480][60144] Updated weights for policy 1, policy_version 54312 (0.0008) +[2023-10-09 06:22:15,846][60144] Updated weights for policy 1, policy_version 54322 (0.0007) +[2023-10-09 06:22:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 110592000. Throughput: 0: 1713.8, 1: 1724.3. Samples: 27663306. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 06:22:16,053][59242] Avg episode reward: [(0, '31.480'), (1, '31.230')] +[2023-10-09 06:22:16,209][60144] Updated weights for policy 1, policy_version 54332 (0.0007) +[2023-10-09 06:22:17,547][60143] Updated weights for policy 0, policy_version 53702 (0.0008) +[2023-10-09 06:22:17,940][60143] Updated weights for policy 0, policy_version 53712 (0.0009) +[2023-10-09 06:22:18,309][60143] Updated weights for policy 0, policy_version 53722 (0.0008) +[2023-10-09 06:22:20,187][60144] Updated weights for policy 1, policy_version 54342 (0.0007) +[2023-10-09 06:22:20,547][60144] Updated weights for policy 1, policy_version 54352 (0.0008) +[2023-10-09 06:22:20,920][60144] Updated weights for policy 1, policy_version 54362 (0.0008) +[2023-10-09 06:22:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 110657536. Throughput: 0: 1688.0, 1: 1736.2. Samples: 27673190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:21,053][59242] Avg episode reward: [(0, '31.290'), (1, '30.750')] +[2023-10-09 06:22:22,224][60143] Updated weights for policy 0, policy_version 53732 (0.0009) +[2023-10-09 06:22:22,598][60143] Updated weights for policy 0, policy_version 53742 (0.0009) +[2023-10-09 06:22:22,956][60143] Updated weights for policy 0, policy_version 53752 (0.0007) +[2023-10-09 06:22:24,639][60144] Updated weights for policy 1, policy_version 54372 (0.0008) +[2023-10-09 06:22:25,004][60144] Updated weights for policy 1, policy_version 54382 (0.0008) +[2023-10-09 06:22:25,377][60144] Updated weights for policy 1, policy_version 54392 (0.0009) +[2023-10-09 06:22:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 110755840. Throughput: 0: 1699.7, 1: 1736.4. Samples: 27694402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:26,053][59242] Avg episode reward: [(0, '32.210'), (1, '31.870')] +[2023-10-09 06:22:26,895][60143] Updated weights for policy 0, policy_version 53762 (0.0009) +[2023-10-09 06:22:27,258][60143] Updated weights for policy 0, policy_version 53772 (0.0010) +[2023-10-09 06:22:27,637][60143] Updated weights for policy 0, policy_version 53782 (0.0009) +[2023-10-09 06:22:27,999][60143] Updated weights for policy 0, policy_version 53792 (0.0008) +[2023-10-09 06:22:29,457][60144] Updated weights for policy 1, policy_version 54402 (0.0010) +[2023-10-09 06:22:29,851][60144] Updated weights for policy 1, policy_version 54412 (0.0008) +[2023-10-09 06:22:30,228][60144] Updated weights for policy 1, policy_version 54422 (0.0007) +[2023-10-09 06:22:30,600][60144] Updated weights for policy 1, policy_version 54432 (0.0007) +[2023-10-09 06:22:31,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 110821376. Throughput: 0: 1721.8, 1: 1706.3. Samples: 27714476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:31,052][59242] Avg episode reward: [(0, '32.340'), (1, '32.720')] +[2023-10-09 06:22:31,979][60143] Updated weights for policy 0, policy_version 53802 (0.0008) +[2023-10-09 06:22:32,343][60143] Updated weights for policy 0, policy_version 53812 (0.0007) +[2023-10-09 06:22:32,725][60143] Updated weights for policy 0, policy_version 53822 (0.0010) +[2023-10-09 06:22:34,487][60144] Updated weights for policy 1, policy_version 54442 (0.0011) +[2023-10-09 06:22:34,858][60144] Updated weights for policy 1, policy_version 54452 (0.0007) +[2023-10-09 06:22:35,223][60144] Updated weights for policy 1, policy_version 54462 (0.0007) +[2023-10-09 06:22:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 110886912. Throughput: 0: 1692.8, 1: 1731.6. Samples: 27724750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:36,053][59242] Avg episode reward: [(0, '31.150'), (1, '30.850')] +[2023-10-09 06:22:36,768][60143] Updated weights for policy 0, policy_version 53832 (0.0008) +[2023-10-09 06:22:37,141][60143] Updated weights for policy 0, policy_version 53842 (0.0009) +[2023-10-09 06:22:37,519][60143] Updated weights for policy 0, policy_version 53852 (0.0009) +[2023-10-09 06:22:39,151][60144] Updated weights for policy 1, policy_version 54472 (0.0010) +[2023-10-09 06:22:39,515][60144] Updated weights for policy 1, policy_version 54482 (0.0009) +[2023-10-09 06:22:39,889][60144] Updated weights for policy 1, policy_version 54492 (0.0008) +[2023-10-09 06:22:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 110952448. Throughput: 0: 1721.6, 1: 1712.8. Samples: 27745394. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:41,052][59242] Avg episode reward: [(0, '30.230'), (1, '32.110')] +[2023-10-09 06:22:41,429][60143] Updated weights for policy 0, policy_version 53862 (0.0009) +[2023-10-09 06:22:41,797][60143] Updated weights for policy 0, policy_version 53872 (0.0010) +[2023-10-09 06:22:42,168][60143] Updated weights for policy 0, policy_version 53882 (0.0008) +[2023-10-09 06:22:43,891][60144] Updated weights for policy 1, policy_version 54502 (0.0008) +[2023-10-09 06:22:44,266][60144] Updated weights for policy 1, policy_version 54512 (0.0007) +[2023-10-09 06:22:44,625][60144] Updated weights for policy 1, policy_version 54522 (0.0009) +[2023-10-09 06:22:46,042][60143] Updated weights for policy 0, policy_version 53892 (0.0009) +[2023-10-09 06:22:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111017984. Throughput: 0: 1724.2, 1: 1699.7. Samples: 27766212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:46,052][59242] Avg episode reward: [(0, '29.830'), (1, '34.220')] +[2023-10-09 06:22:46,418][60143] Updated weights for policy 0, policy_version 53902 (0.0008) +[2023-10-09 06:22:46,786][60143] Updated weights for policy 0, policy_version 53912 (0.0007) +[2023-10-09 06:22:48,738][60144] Updated weights for policy 1, policy_version 54532 (0.0008) +[2023-10-09 06:22:49,098][60144] Updated weights for policy 1, policy_version 54542 (0.0010) +[2023-10-09 06:22:49,467][60144] Updated weights for policy 1, policy_version 54552 (0.0009) +[2023-10-09 06:22:50,729][60143] Updated weights for policy 0, policy_version 53922 (0.0009) +[2023-10-09 06:22:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111083520. Throughput: 0: 1706.0, 1: 1727.5. Samples: 27776704. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:51,053][59242] Avg episode reward: [(0, '29.930'), (1, '33.580')] +[2023-10-09 06:22:51,114][60143] Updated weights for policy 0, policy_version 53932 (0.0008) +[2023-10-09 06:22:51,483][60143] Updated weights for policy 0, policy_version 53942 (0.0008) +[2023-10-09 06:22:51,861][60143] Updated weights for policy 0, policy_version 53952 (0.0009) +[2023-10-09 06:22:53,396][60144] Updated weights for policy 1, policy_version 54562 (0.0007) +[2023-10-09 06:22:53,764][60144] Updated weights for policy 1, policy_version 54572 (0.0008) +[2023-10-09 06:22:54,129][60144] Updated weights for policy 1, policy_version 54582 (0.0009) +[2023-10-09 06:22:54,489][60144] Updated weights for policy 1, policy_version 54592 (0.0010) +[2023-10-09 06:22:55,767][60143] Updated weights for policy 0, policy_version 53962 (0.0008) +[2023-10-09 06:22:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111149056. Throughput: 0: 1717.5, 1: 1703.7. Samples: 27796650. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:22:56,053][59242] Avg episode reward: [(0, '29.420'), (1, '34.820')] +[2023-10-09 06:22:56,141][60143] Updated weights for policy 0, policy_version 53972 (0.0007) +[2023-10-09 06:22:56,524][60143] Updated weights for policy 0, policy_version 53982 (0.0007) +[2023-10-09 06:22:58,404][60144] Updated weights for policy 1, policy_version 54602 (0.0007) +[2023-10-09 06:22:58,777][60144] Updated weights for policy 1, policy_version 54612 (0.0010) +[2023-10-09 06:22:59,146][60144] Updated weights for policy 1, policy_version 54622 (0.0008) +[2023-10-09 06:23:00,502][60143] Updated weights for policy 0, policy_version 53992 (0.0008) +[2023-10-09 06:23:00,875][60143] Updated weights for policy 0, policy_version 54002 (0.0008) +[2023-10-09 06:23:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111214592. 
Throughput: 0: 1717.5, 1: 1715.0. Samples: 27817768. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:01,053][59242] Avg episode reward: [(0, '29.080'), (1, '33.370')] +[2023-10-09 06:23:01,238][60143] Updated weights for policy 0, policy_version 54012 (0.0008) +[2023-10-09 06:23:03,057][60144] Updated weights for policy 1, policy_version 54632 (0.0009) +[2023-10-09 06:23:03,421][60144] Updated weights for policy 1, policy_version 54642 (0.0008) +[2023-10-09 06:23:03,793][60144] Updated weights for policy 1, policy_version 54652 (0.0008) +[2023-10-09 06:23:05,482][60143] Updated weights for policy 0, policy_version 54022 (0.0008) +[2023-10-09 06:23:05,862][60143] Updated weights for policy 0, policy_version 54032 (0.0009) +[2023-10-09 06:23:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111280128. Throughput: 0: 1722.0, 1: 1712.6. Samples: 27827748. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:06,053][59242] Avg episode reward: [(0, '29.970'), (1, '33.020')] +[2023-10-09 06:23:06,228][60143] Updated weights for policy 0, policy_version 54042 (0.0007) +[2023-10-09 06:23:07,735][60144] Updated weights for policy 1, policy_version 54662 (0.0010) +[2023-10-09 06:23:08,096][60144] Updated weights for policy 1, policy_version 54672 (0.0008) +[2023-10-09 06:23:08,458][60144] Updated weights for policy 1, policy_version 54682 (0.0011) +[2023-10-09 06:23:10,064][60143] Updated weights for policy 0, policy_version 54052 (0.0009) +[2023-10-09 06:23:10,435][60143] Updated weights for policy 0, policy_version 54062 (0.0009) +[2023-10-09 06:23:10,794][60143] Updated weights for policy 0, policy_version 54072 (0.0011) +[2023-10-09 06:23:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 111345664. Throughput: 0: 1723.9, 1: 1701.3. Samples: 27848534. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:11,053][59242] Avg episode reward: [(0, '29.740'), (1, '32.620')] +[2023-10-09 06:23:12,272][60144] Updated weights for policy 1, policy_version 54692 (0.0009) +[2023-10-09 06:23:12,641][60144] Updated weights for policy 1, policy_version 54702 (0.0008) +[2023-10-09 06:23:13,010][60144] Updated weights for policy 1, policy_version 54712 (0.0008) +[2023-10-09 06:23:14,765][60143] Updated weights for policy 0, policy_version 54082 (0.0011) +[2023-10-09 06:23:15,136][60143] Updated weights for policy 0, policy_version 54092 (0.0009) +[2023-10-09 06:23:15,495][60143] Updated weights for policy 0, policy_version 54102 (0.0009) +[2023-10-09 06:23:15,867][60143] Updated weights for policy 0, policy_version 54112 (0.0009) +[2023-10-09 06:23:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 111443968. Throughput: 0: 1703.2, 1: 1732.4. Samples: 27869080. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:16,053][59242] Avg episode reward: [(0, '30.090'), (1, '32.490')] +[2023-10-09 06:23:17,148][60144] Updated weights for policy 1, policy_version 54722 (0.0009) +[2023-10-09 06:23:17,555][60144] Updated weights for policy 1, policy_version 54732 (0.0008) +[2023-10-09 06:23:17,922][60144] Updated weights for policy 1, policy_version 54742 (0.0008) +[2023-10-09 06:23:18,288][60144] Updated weights for policy 1, policy_version 54752 (0.0009) +[2023-10-09 06:23:19,788][60143] Updated weights for policy 0, policy_version 54122 (0.0007) +[2023-10-09 06:23:20,165][60143] Updated weights for policy 0, policy_version 54132 (0.0009) +[2023-10-09 06:23:20,533][60143] Updated weights for policy 0, policy_version 54142 (0.0009) +[2023-10-09 06:23:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 111509504. Throughput: 0: 1722.7, 1: 1707.5. Samples: 27879106. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:21,053][59242] Avg episode reward: [(0, '30.610'), (1, '32.080')] +[2023-10-09 06:23:21,874][60144] Updated weights for policy 1, policy_version 54762 (0.0009) +[2023-10-09 06:23:22,248][60144] Updated weights for policy 1, policy_version 54772 (0.0008) +[2023-10-09 06:23:22,611][60144] Updated weights for policy 1, policy_version 54782 (0.0007) +[2023-10-09 06:23:24,636][60143] Updated weights for policy 0, policy_version 54152 (0.0008) +[2023-10-09 06:23:25,005][60143] Updated weights for policy 0, policy_version 54162 (0.0010) +[2023-10-09 06:23:25,372][60143] Updated weights for policy 0, policy_version 54172 (0.0009) +[2023-10-09 06:23:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 111575040. Throughput: 0: 1717.2, 1: 1733.3. Samples: 27900666. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:26,053][59242] Avg episode reward: [(0, '29.730'), (1, '34.200')] +[2023-10-09 06:23:26,425][60144] Updated weights for policy 1, policy_version 54792 (0.0009) +[2023-10-09 06:23:26,797][60144] Updated weights for policy 1, policy_version 54802 (0.0007) +[2023-10-09 06:23:27,153][60144] Updated weights for policy 1, policy_version 54812 (0.0008) +[2023-10-09 06:23:29,359][60143] Updated weights for policy 0, policy_version 54182 (0.0008) +[2023-10-09 06:23:29,719][60143] Updated weights for policy 0, policy_version 54192 (0.0007) +[2023-10-09 06:23:30,085][60143] Updated weights for policy 0, policy_version 54202 (0.0011) +[2023-10-09 06:23:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 111640576. Throughput: 0: 1688.8, 1: 1744.1. Samples: 27920696. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:23:31,053][59242] Avg episode reward: [(0, '29.030'), (1, '33.510')] +[2023-10-09 06:23:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000054208_55508992.pth... +[2023-10-09 06:23:31,094][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000052608_53870592.pth +[2023-10-09 06:23:31,204][60144] Updated weights for policy 1, policy_version 54822 (0.0010) +[2023-10-09 06:23:31,558][60144] Updated weights for policy 1, policy_version 54832 (0.0007) +[2023-10-09 06:23:31,921][60144] Updated weights for policy 1, policy_version 54842 (0.0009) +[2023-10-09 06:23:32,142][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000054848_56164352.pth... +[2023-10-09 06:23:32,172][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000053216_54493184.pth +[2023-10-09 06:23:34,191][60143] Updated weights for policy 0, policy_version 54212 (0.0008) +[2023-10-09 06:23:34,556][60143] Updated weights for policy 0, policy_version 54222 (0.0007) +[2023-10-09 06:23:34,934][60143] Updated weights for policy 0, policy_version 54232 (0.0008) +[2023-10-09 06:23:35,829][60144] Updated weights for policy 1, policy_version 54852 (0.0007) +[2023-10-09 06:23:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 111706112. Throughput: 0: 1719.1, 1: 1716.5. Samples: 27931302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:23:36,052][59242] Avg episode reward: [(0, '27.960'), (1, '33.950')] +[2023-10-09 06:23:36,193][60144] Updated weights for policy 1, policy_version 54862 (0.0007) +[2023-10-09 06:23:36,564][60144] Updated weights for policy 1, policy_version 54872 (0.0007) +[2023-10-09 06:23:39,111][60143] Updated weights for policy 0, policy_version 54242 (0.0008) +[2023-10-09 06:23:39,479][60143] Updated weights for policy 0, policy_version 54252 (0.0011) +[2023-10-09 06:23:39,844][60143] Updated weights for policy 0, policy_version 54262 (0.0010) +[2023-10-09 06:23:40,210][60143] Updated weights for policy 0, policy_version 54272 (0.0010) +[2023-10-09 06:23:40,409][60144] Updated weights for policy 1, policy_version 54882 (0.0010) +[2023-10-09 06:23:40,773][60144] Updated weights for policy 1, policy_version 54892 (0.0009) +[2023-10-09 06:23:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.2, 300 sec: 13773.7). Total num frames: 111771648. Throughput: 0: 1704.2, 1: 1750.3. Samples: 27952104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:23:41,053][59242] Avg episode reward: [(0, '28.920'), (1, '35.040')] +[2023-10-09 06:23:41,147][60144] Updated weights for policy 1, policy_version 54902 (0.0009) +[2023-10-09 06:23:41,514][60144] Updated weights for policy 1, policy_version 54912 (0.0008) +[2023-10-09 06:23:44,030][60143] Updated weights for policy 0, policy_version 54282 (0.0008) +[2023-10-09 06:23:44,393][60143] Updated weights for policy 0, policy_version 54292 (0.0008) +[2023-10-09 06:23:44,752][60143] Updated weights for policy 0, policy_version 54302 (0.0007) +[2023-10-09 06:23:45,705][60144] Updated weights for policy 1, policy_version 54922 (0.0011) +[2023-10-09 06:23:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 111837184. Throughput: 0: 1691.9, 1: 1741.0. Samples: 27972248. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:23:46,052][59242] Avg episode reward: [(0, '27.850'), (1, '34.020')] +[2023-10-09 06:23:46,059][60144] Updated weights for policy 1, policy_version 54932 (0.0010) +[2023-10-09 06:23:46,423][60144] Updated weights for policy 1, policy_version 54942 (0.0007) +[2023-10-09 06:23:48,607][60143] Updated weights for policy 0, policy_version 54312 (0.0009) +[2023-10-09 06:23:48,974][60143] Updated weights for policy 0, policy_version 54322 (0.0009) +[2023-10-09 06:23:49,337][60143] Updated weights for policy 0, policy_version 54332 (0.0010) +[2023-10-09 06:23:50,365][60144] Updated weights for policy 1, policy_version 54952 (0.0008) +[2023-10-09 06:23:50,735][60144] Updated weights for policy 1, policy_version 54962 (0.0010) +[2023-10-09 06:23:51,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 111902720. Throughput: 0: 1714.3, 1: 1734.6. Samples: 27982948. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:23:51,053][59242] Avg episode reward: [(0, '27.300'), (1, '33.610')] +[2023-10-09 06:23:51,112][60144] Updated weights for policy 1, policy_version 54972 (0.0010) +[2023-10-09 06:23:53,286][60143] Updated weights for policy 0, policy_version 54342 (0.0009) +[2023-10-09 06:23:53,673][60143] Updated weights for policy 0, policy_version 54352 (0.0007) +[2023-10-09 06:23:54,043][60143] Updated weights for policy 0, policy_version 54362 (0.0009) +[2023-10-09 06:23:54,946][60144] Updated weights for policy 1, policy_version 54982 (0.0008) +[2023-10-09 06:23:55,322][60144] Updated weights for policy 1, policy_version 54992 (0.0009) +[2023-10-09 06:23:55,680][60144] Updated weights for policy 1, policy_version 55002 (0.0009) +[2023-10-09 06:23:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 112001024. Throughput: 0: 1684.8, 1: 1752.5. Samples: 28003214. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:23:56,052][59242] Avg episode reward: [(0, '27.060'), (1, '32.970')] +[2023-10-09 06:23:58,051][60143] Updated weights for policy 0, policy_version 54372 (0.0008) +[2023-10-09 06:23:58,427][60143] Updated weights for policy 0, policy_version 54382 (0.0008) +[2023-10-09 06:23:58,788][60143] Updated weights for policy 0, policy_version 54392 (0.0008) +[2023-10-09 06:23:59,454][60144] Updated weights for policy 1, policy_version 55012 (0.0009) +[2023-10-09 06:23:59,818][60144] Updated weights for policy 1, policy_version 55022 (0.0010) +[2023-10-09 06:24:00,186][60144] Updated weights for policy 1, policy_version 55032 (0.0009) +[2023-10-09 06:24:01,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 112066560. Throughput: 0: 1705.9, 1: 1722.2. Samples: 28023342. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:24:01,053][59242] Avg episode reward: [(0, '28.870'), (1, '32.800')] +[2023-10-09 06:24:02,895][60143] Updated weights for policy 0, policy_version 54402 (0.0009) +[2023-10-09 06:24:03,253][60143] Updated weights for policy 0, policy_version 54412 (0.0008) +[2023-10-09 06:24:03,622][60143] Updated weights for policy 0, policy_version 54422 (0.0009) +[2023-10-09 06:24:03,977][60143] Updated weights for policy 0, policy_version 54432 (0.0008) +[2023-10-09 06:24:04,239][60144] Updated weights for policy 1, policy_version 55042 (0.0010) +[2023-10-09 06:24:04,615][60144] Updated weights for policy 1, policy_version 55052 (0.0010) +[2023-10-09 06:24:04,989][60144] Updated weights for policy 1, policy_version 55062 (0.0009) +[2023-10-09 06:24:05,348][60144] Updated weights for policy 1, policy_version 55072 (0.0009) +[2023-10-09 06:24:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 112132096. Throughput: 0: 1698.5, 1: 1752.8. Samples: 28034414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:24:06,053][59242] Avg episode reward: [(0, '29.580'), (1, '32.830')] +[2023-10-09 06:24:07,964][60143] Updated weights for policy 0, policy_version 54442 (0.0008) +[2023-10-09 06:24:08,342][60143] Updated weights for policy 0, policy_version 54452 (0.0007) +[2023-10-09 06:24:08,707][60143] Updated weights for policy 0, policy_version 54462 (0.0007) +[2023-10-09 06:24:09,259][60144] Updated weights for policy 1, policy_version 55082 (0.0009) +[2023-10-09 06:24:09,626][60144] Updated weights for policy 1, policy_version 55092 (0.0008) +[2023-10-09 06:24:09,991][60144] Updated weights for policy 1, policy_version 55102 (0.0007) +[2023-10-09 06:24:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 112197632. Throughput: 0: 1689.3, 1: 1726.5. Samples: 28054378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-09 06:24:11,053][59242] Avg episode reward: [(0, '30.280'), (1, '31.630')] +[2023-10-09 06:24:12,783][60143] Updated weights for policy 0, policy_version 54472 (0.0009) +[2023-10-09 06:24:13,149][60143] Updated weights for policy 0, policy_version 54482 (0.0008) +[2023-10-09 06:24:13,514][60143] Updated weights for policy 0, policy_version 54492 (0.0009) +[2023-10-09 06:24:13,800][60144] Updated weights for policy 1, policy_version 55112 (0.0007) +[2023-10-09 06:24:14,174][60144] Updated weights for policy 1, policy_version 55122 (0.0009) +[2023-10-09 06:24:14,540][60144] Updated weights for policy 1, policy_version 55132 (0.0009) +[2023-10-09 06:24:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 112263168. Throughput: 0: 1721.2, 1: 1714.0. Samples: 28075282. 
Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:16,053][59242] Avg episode reward: [(0, '30.310'), (1, '31.250')] +[2023-10-09 06:24:17,534][60143] Updated weights for policy 0, policy_version 54502 (0.0007) +[2023-10-09 06:24:17,903][60143] Updated weights for policy 0, policy_version 54512 (0.0007) +[2023-10-09 06:24:18,275][60143] Updated weights for policy 0, policy_version 54522 (0.0011) +[2023-10-09 06:24:18,610][60144] Updated weights for policy 1, policy_version 55142 (0.0008) +[2023-10-09 06:24:18,984][60144] Updated weights for policy 1, policy_version 55152 (0.0008) +[2023-10-09 06:24:19,347][60144] Updated weights for policy 1, policy_version 55162 (0.0008) +[2023-10-09 06:24:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 112328704. Throughput: 0: 1695.3, 1: 1737.8. Samples: 28085790. Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:21,053][59242] Avg episode reward: [(0, '29.680'), (1, '33.170')] +[2023-10-09 06:24:22,198][60143] Updated weights for policy 0, policy_version 54532 (0.0008) +[2023-10-09 06:24:22,557][60143] Updated weights for policy 0, policy_version 54542 (0.0008) +[2023-10-09 06:24:22,936][60143] Updated weights for policy 0, policy_version 54552 (0.0009) +[2023-10-09 06:24:23,331][60144] Updated weights for policy 1, policy_version 55172 (0.0008) +[2023-10-09 06:24:23,691][60144] Updated weights for policy 1, policy_version 55182 (0.0008) +[2023-10-09 06:24:24,056][60144] Updated weights for policy 1, policy_version 55192 (0.0010) +[2023-10-09 06:24:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 112394240. Throughput: 0: 1714.8, 1: 1708.0. Samples: 28106130. 
Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:26,053][59242] Avg episode reward: [(0, '29.620'), (1, '32.740')] +[2023-10-09 06:24:26,691][60143] Updated weights for policy 0, policy_version 54562 (0.0008) +[2023-10-09 06:24:27,064][60143] Updated weights for policy 0, policy_version 54572 (0.0010) +[2023-10-09 06:24:27,443][60143] Updated weights for policy 0, policy_version 54582 (0.0008) +[2023-10-09 06:24:27,821][60143] Updated weights for policy 0, policy_version 54592 (0.0009) +[2023-10-09 06:24:28,031][60144] Updated weights for policy 1, policy_version 55202 (0.0010) +[2023-10-09 06:24:28,393][60144] Updated weights for policy 1, policy_version 55212 (0.0009) +[2023-10-09 06:24:28,754][60144] Updated weights for policy 1, policy_version 55222 (0.0009) +[2023-10-09 06:24:29,121][60144] Updated weights for policy 1, policy_version 55232 (0.0010) +[2023-10-09 06:24:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 112459776. Throughput: 0: 1736.6, 1: 1716.2. Samples: 28127626. Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:31,052][59242] Avg episode reward: [(0, '28.390'), (1, '32.830')] +[2023-10-09 06:24:31,683][60143] Updated weights for policy 0, policy_version 54602 (0.0008) +[2023-10-09 06:24:32,045][60143] Updated weights for policy 0, policy_version 54612 (0.0008) +[2023-10-09 06:24:32,415][60143] Updated weights for policy 0, policy_version 54622 (0.0007) +[2023-10-09 06:24:33,225][60144] Updated weights for policy 1, policy_version 55242 (0.0008) +[2023-10-09 06:24:33,593][60144] Updated weights for policy 1, policy_version 55252 (0.0008) +[2023-10-09 06:24:33,967][60144] Updated weights for policy 1, policy_version 55262 (0.0008) +[2023-10-09 06:24:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 112525312. Throughput: 0: 1709.9, 1: 1723.8. Samples: 28137466. 
Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:36,052][59242] Avg episode reward: [(0, '28.050'), (1, '32.640')] +[2023-10-09 06:24:36,162][60143] Updated weights for policy 0, policy_version 54632 (0.0009) +[2023-10-09 06:24:36,533][60143] Updated weights for policy 0, policy_version 54642 (0.0009) +[2023-10-09 06:24:36,898][60143] Updated weights for policy 0, policy_version 54652 (0.0009) +[2023-10-09 06:24:37,997][60144] Updated weights for policy 1, policy_version 55272 (0.0008) +[2023-10-09 06:24:38,370][60144] Updated weights for policy 1, policy_version 55282 (0.0008) +[2023-10-09 06:24:38,731][60144] Updated weights for policy 1, policy_version 55292 (0.0007) +[2023-10-09 06:24:40,921][60143] Updated weights for policy 0, policy_version 54662 (0.0009) +[2023-10-09 06:24:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 112590848. Throughput: 0: 1743.8, 1: 1703.9. Samples: 28158362. Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:41,053][59242] Avg episode reward: [(0, '27.830'), (1, '33.670')] +[2023-10-09 06:24:41,302][60143] Updated weights for policy 0, policy_version 54672 (0.0007) +[2023-10-09 06:24:41,668][60143] Updated weights for policy 0, policy_version 54682 (0.0009) +[2023-10-09 06:24:42,650][60144] Updated weights for policy 1, policy_version 55302 (0.0008) +[2023-10-09 06:24:43,020][60144] Updated weights for policy 1, policy_version 55312 (0.0007) +[2023-10-09 06:24:43,384][60144] Updated weights for policy 1, policy_version 55322 (0.0007) +[2023-10-09 06:24:45,699][60143] Updated weights for policy 0, policy_version 54692 (0.0010) +[2023-10-09 06:24:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 112656384. Throughput: 0: 1738.9, 1: 1733.7. Samples: 28179610. 
Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:46,053][59242] Avg episode reward: [(0, '29.070'), (1, '33.490')] +[2023-10-09 06:24:46,074][60143] Updated weights for policy 0, policy_version 54702 (0.0011) +[2023-10-09 06:24:46,444][60143] Updated weights for policy 0, policy_version 54712 (0.0011) +[2023-10-09 06:24:47,147][60144] Updated weights for policy 1, policy_version 55332 (0.0008) +[2023-10-09 06:24:47,520][60144] Updated weights for policy 1, policy_version 55342 (0.0007) +[2023-10-09 06:24:47,879][60144] Updated weights for policy 1, policy_version 55352 (0.0011) +[2023-10-09 06:24:50,554][60143] Updated weights for policy 0, policy_version 54722 (0.0010) +[2023-10-09 06:24:50,918][60143] Updated weights for policy 0, policy_version 54732 (0.0009) +[2023-10-09 06:24:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 112721920. Throughput: 0: 1722.3, 1: 1707.9. Samples: 28188772. Policy #0 lag: (min: 30.0, avg: 35.6, max: 62.0) +[2023-10-09 06:24:51,052][59242] Avg episode reward: [(0, '28.790'), (1, '33.620')] +[2023-10-09 06:24:51,293][60143] Updated weights for policy 0, policy_version 54742 (0.0008) +[2023-10-09 06:24:51,659][60143] Updated weights for policy 0, policy_version 54752 (0.0008) +[2023-10-09 06:24:51,920][60144] Updated weights for policy 1, policy_version 55362 (0.0011) +[2023-10-09 06:24:52,327][60144] Updated weights for policy 1, policy_version 55372 (0.0009) +[2023-10-09 06:24:52,697][60144] Updated weights for policy 1, policy_version 55382 (0.0008) +[2023-10-09 06:24:53,073][60144] Updated weights for policy 1, policy_version 55392 (0.0009) +[2023-10-09 06:24:55,689][60143] Updated weights for policy 0, policy_version 54762 (0.0007) +[2023-10-09 06:24:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 112787456. Throughput: 0: 1730.5, 1: 1721.9. Samples: 28209736. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:24:56,052][59242] Avg episode reward: [(0, '29.130'), (1, '31.840')] +[2023-10-09 06:24:56,062][60143] Updated weights for policy 0, policy_version 54772 (0.0010) +[2023-10-09 06:24:56,441][60143] Updated weights for policy 0, policy_version 54782 (0.0009) +[2023-10-09 06:24:56,795][60144] Updated weights for policy 1, policy_version 55402 (0.0011) +[2023-10-09 06:24:57,165][60144] Updated weights for policy 1, policy_version 55412 (0.0008) +[2023-10-09 06:24:57,524][60144] Updated weights for policy 1, policy_version 55422 (0.0008) +[2023-10-09 06:25:00,499][60143] Updated weights for policy 0, policy_version 54792 (0.0009) +[2023-10-09 06:25:00,874][60143] Updated weights for policy 0, policy_version 54802 (0.0008) +[2023-10-09 06:25:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 112852992. Throughput: 0: 1718.2, 1: 1734.7. Samples: 28230660. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:01,053][59242] Avg episode reward: [(0, '29.180'), (1, '31.810')] +[2023-10-09 06:25:01,243][60143] Updated weights for policy 0, policy_version 54812 (0.0009) +[2023-10-09 06:25:01,541][60144] Updated weights for policy 1, policy_version 55432 (0.0007) +[2023-10-09 06:25:01,904][60144] Updated weights for policy 1, policy_version 55442 (0.0008) +[2023-10-09 06:25:02,277][60144] Updated weights for policy 1, policy_version 55452 (0.0009) +[2023-10-09 06:25:05,032][60143] Updated weights for policy 0, policy_version 54822 (0.0007) +[2023-10-09 06:25:05,400][60143] Updated weights for policy 0, policy_version 54832 (0.0008) +[2023-10-09 06:25:05,774][60143] Updated weights for policy 0, policy_version 54842 (0.0008) +[2023-10-09 06:25:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 112951296. Throughput: 0: 1722.5, 1: 1708.7. Samples: 28240196. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:06,052][59242] Avg episode reward: [(0, '28.470'), (1, '30.560')] +[2023-10-09 06:25:06,273][60144] Updated weights for policy 1, policy_version 55462 (0.0010) +[2023-10-09 06:25:06,650][60144] Updated weights for policy 1, policy_version 55472 (0.0008) +[2023-10-09 06:25:07,016][60144] Updated weights for policy 1, policy_version 55482 (0.0008) +[2023-10-09 06:25:09,628][60143] Updated weights for policy 0, policy_version 54852 (0.0009) +[2023-10-09 06:25:09,997][60143] Updated weights for policy 0, policy_version 54862 (0.0011) +[2023-10-09 06:25:10,371][60143] Updated weights for policy 0, policy_version 54872 (0.0008) +[2023-10-09 06:25:10,975][60144] Updated weights for policy 1, policy_version 55492 (0.0009) +[2023-10-09 06:25:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 113016832. Throughput: 0: 1720.3, 1: 1732.5. Samples: 28261506. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:11,053][59242] Avg episode reward: [(0, '28.560'), (1, '31.770')] +[2023-10-09 06:25:11,340][60144] Updated weights for policy 1, policy_version 55502 (0.0008) +[2023-10-09 06:25:11,709][60144] Updated weights for policy 1, policy_version 55512 (0.0010) +[2023-10-09 06:25:14,351][60143] Updated weights for policy 0, policy_version 54882 (0.0010) +[2023-10-09 06:25:14,721][60143] Updated weights for policy 0, policy_version 54892 (0.0008) +[2023-10-09 06:25:15,083][60143] Updated weights for policy 0, policy_version 54902 (0.0009) +[2023-10-09 06:25:15,453][60143] Updated weights for policy 0, policy_version 54912 (0.0010) +[2023-10-09 06:25:15,698][60144] Updated weights for policy 1, policy_version 55522 (0.0010) +[2023-10-09 06:25:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 113082368. Throughput: 0: 1690.9, 1: 1736.5. Samples: 28281858. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:16,053][59242] Avg episode reward: [(0, '30.180'), (1, '30.550')] +[2023-10-09 06:25:16,070][60144] Updated weights for policy 1, policy_version 55532 (0.0007) +[2023-10-09 06:25:16,442][60144] Updated weights for policy 1, policy_version 55542 (0.0008) +[2023-10-09 06:25:16,799][60144] Updated weights for policy 1, policy_version 55552 (0.0007) +[2023-10-09 06:25:19,584][60143] Updated weights for policy 0, policy_version 54922 (0.0008) +[2023-10-09 06:25:19,949][60143] Updated weights for policy 0, policy_version 54932 (0.0007) +[2023-10-09 06:25:20,316][60143] Updated weights for policy 0, policy_version 54942 (0.0009) +[2023-10-09 06:25:20,648][60144] Updated weights for policy 1, policy_version 55562 (0.0007) +[2023-10-09 06:25:21,006][60144] Updated weights for policy 1, policy_version 55572 (0.0007) +[2023-10-09 06:25:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 113147904. Throughput: 0: 1718.9, 1: 1722.4. Samples: 28292328. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:21,053][59242] Avg episode reward: [(0, '30.510'), (1, '30.540')] +[2023-10-09 06:25:21,377][60144] Updated weights for policy 1, policy_version 55582 (0.0010) +[2023-10-09 06:25:24,392][60143] Updated weights for policy 0, policy_version 54952 (0.0009) +[2023-10-09 06:25:24,766][60143] Updated weights for policy 0, policy_version 54962 (0.0010) +[2023-10-09 06:25:25,140][60143] Updated weights for policy 0, policy_version 54972 (0.0009) +[2023-10-09 06:25:25,292][60144] Updated weights for policy 1, policy_version 55592 (0.0009) +[2023-10-09 06:25:25,656][60144] Updated weights for policy 1, policy_version 55602 (0.0009) +[2023-10-09 06:25:26,016][60144] Updated weights for policy 1, policy_version 55612 (0.0009) +[2023-10-09 06:25:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 113213440. 
Throughput: 0: 1704.6, 1: 1741.9. Samples: 28313452. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-09 06:25:26,053][59242] Avg episode reward: [(0, '29.620'), (1, '31.010')] +[2023-10-09 06:25:28,938][60143] Updated weights for policy 0, policy_version 54982 (0.0008) +[2023-10-09 06:25:29,316][60143] Updated weights for policy 0, policy_version 54992 (0.0008) +[2023-10-09 06:25:29,689][60143] Updated weights for policy 0, policy_version 55002 (0.0007) +[2023-10-09 06:25:29,823][60144] Updated weights for policy 1, policy_version 55622 (0.0009) +[2023-10-09 06:25:30,181][60144] Updated weights for policy 1, policy_version 55632 (0.0007) +[2023-10-09 06:25:30,547][60144] Updated weights for policy 1, policy_version 55642 (0.0007) +[2023-10-09 06:25:31,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 113311744. Throughput: 0: 1685.6, 1: 1723.2. Samples: 28333006. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:31,053][59242] Avg episode reward: [(0, '28.770'), (1, '30.250')] +[2023-10-09 06:25:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000055648_56983552.pth... +[2023-10-09 06:25:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000055008_56328192.pth... 
+[2023-10-09 06:25:31,092][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000054016_55312384.pth +[2023-10-09 06:25:31,094][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000053408_54689792.pth +[2023-10-09 06:25:33,737][60143] Updated weights for policy 0, policy_version 55012 (0.0008) +[2023-10-09 06:25:34,098][60143] Updated weights for policy 0, policy_version 55022 (0.0007) +[2023-10-09 06:25:34,328][60144] Updated weights for policy 1, policy_version 55652 (0.0008) +[2023-10-09 06:25:34,469][60143] Updated weights for policy 0, policy_version 55032 (0.0007) +[2023-10-09 06:25:34,706][60144] Updated weights for policy 1, policy_version 55662 (0.0008) +[2023-10-09 06:25:35,071][60144] Updated weights for policy 1, policy_version 55672 (0.0010) +[2023-10-09 06:25:36,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 113377280. Throughput: 0: 1716.3, 1: 1751.5. Samples: 28344824. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:36,053][59242] Avg episode reward: [(0, '29.660'), (1, '31.140')] +[2023-10-09 06:25:38,413][60143] Updated weights for policy 0, policy_version 55042 (0.0007) +[2023-10-09 06:25:38,783][60143] Updated weights for policy 0, policy_version 55052 (0.0007) +[2023-10-09 06:25:38,889][60144] Updated weights for policy 1, policy_version 55682 (0.0008) +[2023-10-09 06:25:39,139][60143] Updated weights for policy 0, policy_version 55062 (0.0010) +[2023-10-09 06:25:39,307][60144] Updated weights for policy 1, policy_version 55692 (0.0008) +[2023-10-09 06:25:39,511][60143] Updated weights for policy 0, policy_version 55072 (0.0007) +[2023-10-09 06:25:39,680][60144] Updated weights for policy 1, policy_version 55702 (0.0008) +[2023-10-09 06:25:40,039][60144] Updated weights for policy 1, policy_version 55712 (0.0008) +[2023-10-09 06:25:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13884.8). 
Total num frames: 113442816. Throughput: 0: 1695.2, 1: 1735.5. Samples: 28364116. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:41,052][59242] Avg episode reward: [(0, '28.740'), (1, '30.810')] +[2023-10-09 06:25:43,532][60143] Updated weights for policy 0, policy_version 55082 (0.0007) +[2023-10-09 06:25:43,890][60143] Updated weights for policy 0, policy_version 55092 (0.0008) +[2023-10-09 06:25:43,982][60144] Updated weights for policy 1, policy_version 55722 (0.0008) +[2023-10-09 06:25:44,267][60143] Updated weights for policy 0, policy_version 55102 (0.0010) +[2023-10-09 06:25:44,342][60144] Updated weights for policy 1, policy_version 55732 (0.0007) +[2023-10-09 06:25:44,702][60144] Updated weights for policy 1, policy_version 55742 (0.0009) +[2023-10-09 06:25:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 113508352. Throughput: 0: 1699.9, 1: 1724.0. Samples: 28384732. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:46,053][59242] Avg episode reward: [(0, '29.620'), (1, '29.960')] +[2023-10-09 06:25:48,259][60143] Updated weights for policy 0, policy_version 55112 (0.0008) +[2023-10-09 06:25:48,605][60144] Updated weights for policy 1, policy_version 55752 (0.0008) +[2023-10-09 06:25:48,639][60143] Updated weights for policy 0, policy_version 55122 (0.0008) +[2023-10-09 06:25:48,981][60144] Updated weights for policy 1, policy_version 55762 (0.0007) +[2023-10-09 06:25:49,007][60143] Updated weights for policy 0, policy_version 55132 (0.0008) +[2023-10-09 06:25:49,348][60144] Updated weights for policy 1, policy_version 55772 (0.0009) +[2023-10-09 06:25:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 113573888. Throughput: 0: 1707.9, 1: 1751.1. Samples: 28395854. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:51,053][59242] Avg episode reward: [(0, '29.530'), (1, '30.620')] +[2023-10-09 06:25:53,170][60143] Updated weights for policy 0, policy_version 55142 (0.0008) +[2023-10-09 06:25:53,340][60144] Updated weights for policy 1, policy_version 55782 (0.0009) +[2023-10-09 06:25:53,544][60143] Updated weights for policy 0, policy_version 55152 (0.0008) +[2023-10-09 06:25:53,711][60144] Updated weights for policy 1, policy_version 55792 (0.0009) +[2023-10-09 06:25:53,908][60143] Updated weights for policy 0, policy_version 55162 (0.0009) +[2023-10-09 06:25:54,082][60144] Updated weights for policy 1, policy_version 55802 (0.0008) +[2023-10-09 06:25:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 113639424. Throughput: 0: 1685.1, 1: 1732.4. Samples: 28415294. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:25:56,053][59242] Avg episode reward: [(0, '30.710'), (1, '30.770')] +[2023-10-09 06:25:57,795][60143] Updated weights for policy 0, policy_version 55172 (0.0008) +[2023-10-09 06:25:58,026][60144] Updated weights for policy 1, policy_version 55812 (0.0008) +[2023-10-09 06:25:58,167][60143] Updated weights for policy 0, policy_version 55182 (0.0008) +[2023-10-09 06:25:58,388][60144] Updated weights for policy 1, policy_version 55822 (0.0008) +[2023-10-09 06:25:58,539][60143] Updated weights for policy 0, policy_version 55192 (0.0008) +[2023-10-09 06:25:58,759][60144] Updated weights for policy 1, policy_version 55832 (0.0007) +[2023-10-09 06:26:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 113704960. Throughput: 0: 1715.6, 1: 1731.3. Samples: 28436970. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:26:01,053][59242] Avg episode reward: [(0, '30.710'), (1, '29.930')] +[2023-10-09 06:26:02,574][60143] Updated weights for policy 0, policy_version 55202 (0.0008) +[2023-10-09 06:26:02,681][60144] Updated weights for policy 1, policy_version 55842 (0.0008) +[2023-10-09 06:26:02,938][60143] Updated weights for policy 0, policy_version 55212 (0.0008) +[2023-10-09 06:26:03,046][60144] Updated weights for policy 1, policy_version 55852 (0.0009) +[2023-10-09 06:26:03,302][60143] Updated weights for policy 0, policy_version 55222 (0.0009) +[2023-10-09 06:26:03,410][60144] Updated weights for policy 1, policy_version 55862 (0.0008) +[2023-10-09 06:26:03,669][60143] Updated weights for policy 0, policy_version 55232 (0.0008) +[2023-10-09 06:26:03,780][60144] Updated weights for policy 1, policy_version 55872 (0.0009) +[2023-10-09 06:26:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 113770496. Throughput: 0: 1690.6, 1: 1742.0. Samples: 28446796. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:26:06,053][59242] Avg episode reward: [(0, '29.960'), (1, '30.050')] +[2023-10-09 06:26:07,645][60143] Updated weights for policy 0, policy_version 55242 (0.0009) +[2023-10-09 06:26:07,678][60144] Updated weights for policy 1, policy_version 55882 (0.0008) +[2023-10-09 06:26:08,018][60143] Updated weights for policy 0, policy_version 55252 (0.0010) +[2023-10-09 06:26:08,053][60144] Updated weights for policy 1, policy_version 55892 (0.0007) +[2023-10-09 06:26:08,383][60143] Updated weights for policy 0, policy_version 55262 (0.0008) +[2023-10-09 06:26:08,413][60144] Updated weights for policy 1, policy_version 55902 (0.0009) +[2023-10-09 06:26:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 113836032. Throughput: 0: 1694.3, 1: 1730.8. Samples: 28467582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:11,053][59242] Avg episode reward: [(0, '32.060'), (1, '30.180')] +[2023-10-09 06:26:12,265][60144] Updated weights for policy 1, policy_version 55912 (0.0009) +[2023-10-09 06:26:12,511][60143] Updated weights for policy 0, policy_version 55272 (0.0008) +[2023-10-09 06:26:12,642][60144] Updated weights for policy 1, policy_version 55922 (0.0007) +[2023-10-09 06:26:12,874][60143] Updated weights for policy 0, policy_version 55282 (0.0007) +[2023-10-09 06:26:13,003][60144] Updated weights for policy 1, policy_version 55932 (0.0008) +[2023-10-09 06:26:13,245][60143] Updated weights for policy 0, policy_version 55292 (0.0008) +[2023-10-09 06:26:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 113901568. Throughput: 0: 1714.8, 1: 1748.5. Samples: 28488852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:16,052][59242] Avg episode reward: [(0, '31.000'), (1, '28.310')] +[2023-10-09 06:26:16,958][60144] Updated weights for policy 1, policy_version 55942 (0.0010) +[2023-10-09 06:26:17,121][60143] Updated weights for policy 0, policy_version 55302 (0.0009) +[2023-10-09 06:26:17,329][60144] Updated weights for policy 1, policy_version 55952 (0.0007) +[2023-10-09 06:26:17,515][60143] Updated weights for policy 0, policy_version 55312 (0.0008) +[2023-10-09 06:26:17,695][60144] Updated weights for policy 1, policy_version 55962 (0.0011) +[2023-10-09 06:26:17,880][60143] Updated weights for policy 0, policy_version 55322 (0.0010) +[2023-10-09 06:26:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 113967104. Throughput: 0: 1685.8, 1: 1717.0. Samples: 28497950. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:21,053][59242] Avg episode reward: [(0, '29.100'), (1, '29.940')] +[2023-10-09 06:26:21,691][60144] Updated weights for policy 1, policy_version 55972 (0.0008) +[2023-10-09 06:26:21,926][60143] Updated weights for policy 0, policy_version 55332 (0.0009) +[2023-10-09 06:26:22,061][60144] Updated weights for policy 1, policy_version 55982 (0.0007) +[2023-10-09 06:26:22,289][60143] Updated weights for policy 0, policy_version 55342 (0.0009) +[2023-10-09 06:26:22,430][60144] Updated weights for policy 1, policy_version 55992 (0.0009) +[2023-10-09 06:26:22,663][60143] Updated weights for policy 0, policy_version 55352 (0.0008) +[2023-10-09 06:26:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 114032640. Throughput: 0: 1705.8, 1: 1740.3. Samples: 28519192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:26,052][59242] Avg episode reward: [(0, '30.800'), (1, '31.300')] +[2023-10-09 06:26:26,239][60144] Updated weights for policy 1, policy_version 56002 (0.0007) +[2023-10-09 06:26:26,667][60144] Updated weights for policy 1, policy_version 56012 (0.0010) +[2023-10-09 06:26:26,788][60143] Updated weights for policy 0, policy_version 55362 (0.0008) +[2023-10-09 06:26:27,032][60144] Updated weights for policy 1, policy_version 56022 (0.0007) +[2023-10-09 06:26:27,157][60143] Updated weights for policy 0, policy_version 55372 (0.0010) +[2023-10-09 06:26:27,397][60144] Updated weights for policy 1, policy_version 56032 (0.0008) +[2023-10-09 06:26:27,540][60143] Updated weights for policy 0, policy_version 55382 (0.0011) +[2023-10-09 06:26:27,914][60143] Updated weights for policy 0, policy_version 55392 (0.0009) +[2023-10-09 06:26:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 114098176. Throughput: 0: 1707.6, 1: 1754.8. Samples: 28540536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:31,053][59242] Avg episode reward: [(0, '31.170'), (1, '29.890')] +[2023-10-09 06:26:31,270][60144] Updated weights for policy 1, policy_version 56042 (0.0007) +[2023-10-09 06:26:31,637][60144] Updated weights for policy 1, policy_version 56052 (0.0007) +[2023-10-09 06:26:31,936][60143] Updated weights for policy 0, policy_version 55402 (0.0008) +[2023-10-09 06:26:32,003][60144] Updated weights for policy 1, policy_version 56062 (0.0009) +[2023-10-09 06:26:32,301][60143] Updated weights for policy 0, policy_version 55412 (0.0008) +[2023-10-09 06:26:32,669][60143] Updated weights for policy 0, policy_version 55422 (0.0011) +[2023-10-09 06:26:35,949][60144] Updated weights for policy 1, policy_version 56072 (0.0010) +[2023-10-09 06:26:36,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 114163712. Throughput: 0: 1691.4, 1: 1728.0. Samples: 28549728. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:36,054][59242] Avg episode reward: [(0, '31.650'), (1, '29.540')] +[2023-10-09 06:26:36,318][60144] Updated weights for policy 1, policy_version 56082 (0.0009) +[2023-10-09 06:26:36,679][60144] Updated weights for policy 1, policy_version 56092 (0.0007) +[2023-10-09 06:26:36,702][60143] Updated weights for policy 0, policy_version 55432 (0.0008) +[2023-10-09 06:26:37,081][60143] Updated weights for policy 0, policy_version 55442 (0.0008) +[2023-10-09 06:26:37,440][60143] Updated weights for policy 0, policy_version 55452 (0.0010) +[2023-10-09 06:26:40,750][60144] Updated weights for policy 1, policy_version 56102 (0.0007) +[2023-10-09 06:26:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 114229248. Throughput: 0: 1709.8, 1: 1749.7. Samples: 28570970. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:41,053][59242] Avg episode reward: [(0, '30.300'), (1, '30.980')] +[2023-10-09 06:26:41,118][60144] Updated weights for policy 1, policy_version 56112 (0.0009) +[2023-10-09 06:26:41,259][60143] Updated weights for policy 0, policy_version 55462 (0.0009) +[2023-10-09 06:26:41,485][60144] Updated weights for policy 1, policy_version 56122 (0.0009) +[2023-10-09 06:26:41,623][60143] Updated weights for policy 0, policy_version 55472 (0.0009) +[2023-10-09 06:26:41,995][60143] Updated weights for policy 0, policy_version 55482 (0.0009) +[2023-10-09 06:26:45,185][60144] Updated weights for policy 1, policy_version 56132 (0.0010) +[2023-10-09 06:26:45,552][60144] Updated weights for policy 1, policy_version 56142 (0.0010) +[2023-10-09 06:26:45,922][60144] Updated weights for policy 1, policy_version 56152 (0.0008) +[2023-10-09 06:26:46,013][60143] Updated weights for policy 0, policy_version 55492 (0.0009) +[2023-10-09 06:26:46,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 114294784. Throughput: 0: 1705.6, 1: 1734.9. Samples: 28591794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:26:46,052][59242] Avg episode reward: [(0, '29.730'), (1, '31.410')] +[2023-10-09 06:26:46,373][60143] Updated weights for policy 0, policy_version 55502 (0.0007) +[2023-10-09 06:26:46,742][60143] Updated weights for policy 0, policy_version 55512 (0.0007) +[2023-10-09 06:26:49,772][60144] Updated weights for policy 1, policy_version 56162 (0.0007) +[2023-10-09 06:26:50,139][60144] Updated weights for policy 1, policy_version 56172 (0.0007) +[2023-10-09 06:26:50,505][60144] Updated weights for policy 1, policy_version 56182 (0.0008) +[2023-10-09 06:26:50,857][60143] Updated weights for policy 0, policy_version 55522 (0.0008) +[2023-10-09 06:26:50,872][60144] Updated weights for policy 1, policy_version 56192 (0.0009) +[2023-10-09 06:26:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 114393088. Throughput: 0: 1701.7, 1: 1742.0. Samples: 28601766. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:26:51,053][59242] Avg episode reward: [(0, '29.550'), (1, '31.900')] +[2023-10-09 06:26:51,221][60143] Updated weights for policy 0, policy_version 55532 (0.0007) +[2023-10-09 06:26:51,577][60143] Updated weights for policy 0, policy_version 55542 (0.0010) +[2023-10-09 06:26:51,938][60143] Updated weights for policy 0, policy_version 55552 (0.0007) +[2023-10-09 06:26:54,907][60144] Updated weights for policy 1, policy_version 56202 (0.0010) +[2023-10-09 06:26:55,270][60144] Updated weights for policy 1, policy_version 56212 (0.0010) +[2023-10-09 06:26:55,643][60144] Updated weights for policy 1, policy_version 56222 (0.0010) +[2023-10-09 06:26:55,971][60143] Updated weights for policy 0, policy_version 55562 (0.0007) +[2023-10-09 06:26:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 114458624. Throughput: 0: 1706.9, 1: 1739.0. Samples: 28622648. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:26:56,053][59242] Avg episode reward: [(0, '29.910'), (1, '31.910')] +[2023-10-09 06:26:56,348][60143] Updated weights for policy 0, policy_version 55572 (0.0007) +[2023-10-09 06:26:56,716][60143] Updated weights for policy 0, policy_version 55582 (0.0007) +[2023-10-09 06:26:59,653][60144] Updated weights for policy 1, policy_version 56232 (0.0007) +[2023-10-09 06:27:00,025][60144] Updated weights for policy 1, policy_version 56242 (0.0007) +[2023-10-09 06:27:00,389][60144] Updated weights for policy 1, policy_version 56252 (0.0008) +[2023-10-09 06:27:00,749][60143] Updated weights for policy 0, policy_version 55592 (0.0010) +[2023-10-09 06:27:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 114524160. Throughput: 0: 1706.1, 1: 1710.5. Samples: 28642598. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:27:01,053][59242] Avg episode reward: [(0, '30.870'), (1, '31.100')] +[2023-10-09 06:27:01,113][60143] Updated weights for policy 0, policy_version 55602 (0.0008) +[2023-10-09 06:27:01,483][60143] Updated weights for policy 0, policy_version 55612 (0.0009) +[2023-10-09 06:27:04,332][60144] Updated weights for policy 1, policy_version 56262 (0.0008) +[2023-10-09 06:27:04,703][60144] Updated weights for policy 1, policy_version 56272 (0.0011) +[2023-10-09 06:27:05,085][60144] Updated weights for policy 1, policy_version 56282 (0.0010) +[2023-10-09 06:27:05,386][60143] Updated weights for policy 0, policy_version 55622 (0.0009) +[2023-10-09 06:27:05,761][60143] Updated weights for policy 0, policy_version 55632 (0.0007) +[2023-10-09 06:27:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 114589696. Throughput: 0: 1710.7, 1: 1744.4. Samples: 28653430. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:27:06,052][59242] Avg episode reward: [(0, '31.920'), (1, '32.090')] +[2023-10-09 06:27:06,130][60143] Updated weights for policy 0, policy_version 55642 (0.0007) +[2023-10-09 06:27:09,034][60144] Updated weights for policy 1, policy_version 56292 (0.0009) +[2023-10-09 06:27:09,406][60144] Updated weights for policy 1, policy_version 56302 (0.0011) +[2023-10-09 06:27:09,769][60144] Updated weights for policy 1, policy_version 56312 (0.0011) +[2023-10-09 06:27:10,165][60143] Updated weights for policy 0, policy_version 55652 (0.0008) +[2023-10-09 06:27:10,536][60143] Updated weights for policy 0, policy_version 55662 (0.0011) +[2023-10-09 06:27:10,901][60143] Updated weights for policy 0, policy_version 55672 (0.0012) +[2023-10-09 06:27:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 114655232. Throughput: 0: 1710.1, 1: 1726.2. Samples: 28673824. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:27:11,053][59242] Avg episode reward: [(0, '31.310'), (1, '32.540')] +[2023-10-09 06:27:13,798][60144] Updated weights for policy 1, policy_version 56322 (0.0009) +[2023-10-09 06:27:14,222][60144] Updated weights for policy 1, policy_version 56332 (0.0009) +[2023-10-09 06:27:14,597][60144] Updated weights for policy 1, policy_version 56342 (0.0008) +[2023-10-09 06:27:14,724][60143] Updated weights for policy 0, policy_version 55682 (0.0009) +[2023-10-09 06:27:14,960][60144] Updated weights for policy 1, policy_version 56352 (0.0008) +[2023-10-09 06:27:15,099][60143] Updated weights for policy 0, policy_version 55692 (0.0007) +[2023-10-09 06:27:15,463][60143] Updated weights for policy 0, policy_version 55702 (0.0010) +[2023-10-09 06:27:15,834][60143] Updated weights for policy 0, policy_version 55712 (0.0011) +[2023-10-09 06:27:16,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 114753536. 
Throughput: 0: 1697.9, 1: 1704.3. Samples: 28693634. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:27:16,053][59242] Avg episode reward: [(0, '30.080'), (1, '32.270')] +[2023-10-09 06:27:18,917][60144] Updated weights for policy 1, policy_version 56362 (0.0009) +[2023-10-09 06:27:19,277][60144] Updated weights for policy 1, policy_version 56372 (0.0010) +[2023-10-09 06:27:19,644][60144] Updated weights for policy 1, policy_version 56382 (0.0008) +[2023-10-09 06:27:19,846][60143] Updated weights for policy 0, policy_version 55722 (0.0010) +[2023-10-09 06:27:20,212][60143] Updated weights for policy 0, policy_version 55732 (0.0008) +[2023-10-09 06:27:20,577][60143] Updated weights for policy 0, policy_version 55742 (0.0010) +[2023-10-09 06:27:21,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 114819072. Throughput: 0: 1720.4, 1: 1734.7. Samples: 28705210. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 06:27:21,053][59242] Avg episode reward: [(0, '28.960'), (1, '30.120')] +[2023-10-09 06:27:23,510][60144] Updated weights for policy 1, policy_version 56392 (0.0007) +[2023-10-09 06:27:23,870][60144] Updated weights for policy 1, policy_version 56402 (0.0009) +[2023-10-09 06:27:24,237][60144] Updated weights for policy 1, policy_version 56412 (0.0010) +[2023-10-09 06:27:24,580][60143] Updated weights for policy 0, policy_version 55752 (0.0007) +[2023-10-09 06:27:24,955][60143] Updated weights for policy 0, policy_version 55762 (0.0007) +[2023-10-09 06:27:25,317][60143] Updated weights for policy 0, policy_version 55772 (0.0008) +[2023-10-09 06:27:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 114884608. Throughput: 0: 1716.8, 1: 1706.9. Samples: 28725036. 
Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:26,052][59242] Avg episode reward: [(0, '28.820'), (1, '32.170')] +[2023-10-09 06:27:28,011][60144] Updated weights for policy 1, policy_version 56422 (0.0008) +[2023-10-09 06:27:28,376][60144] Updated weights for policy 1, policy_version 56432 (0.0008) +[2023-10-09 06:27:28,735][60144] Updated weights for policy 1, policy_version 56442 (0.0008) +[2023-10-09 06:27:29,202][60143] Updated weights for policy 0, policy_version 55782 (0.0008) +[2023-10-09 06:27:29,571][60143] Updated weights for policy 0, policy_version 55792 (0.0008) +[2023-10-09 06:27:29,945][60143] Updated weights for policy 0, policy_version 55802 (0.0009) +[2023-10-09 06:27:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.6). Total num frames: 114950144. Throughput: 0: 1688.3, 1: 1724.1. Samples: 28745352. Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:31,053][59242] Avg episode reward: [(0, '27.910'), (1, '31.940')] +[2023-10-09 06:27:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000055808_57147392.pth... +[2023-10-09 06:27:31,066][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000056448_57802752.pth... 
+[2023-10-09 06:27:31,096][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000054208_55508992.pth +[2023-10-09 06:27:31,100][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000055808_57147392.pth +[2023-10-09 06:27:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000054848_56164352.pth +[2023-10-09 06:27:31,105][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000056448_57802752.pth +[2023-10-09 06:27:32,654][60144] Updated weights for policy 1, policy_version 56452 (0.0008) +[2023-10-09 06:27:33,028][60144] Updated weights for policy 1, policy_version 56462 (0.0008) +[2023-10-09 06:27:33,400][60144] Updated weights for policy 1, policy_version 56472 (0.0009) +[2023-10-09 06:27:33,872][60143] Updated weights for policy 0, policy_version 55812 (0.0008) +[2023-10-09 06:27:34,233][60143] Updated weights for policy 0, policy_version 55822 (0.0008) +[2023-10-09 06:27:34,606][60143] Updated weights for policy 0, policy_version 55832 (0.0007) +[2023-10-09 06:27:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 115015680. Throughput: 0: 1718.8, 1: 1713.4. Samples: 28756216. 
Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:36,053][59242] Avg episode reward: [(0, '27.910'), (1, '32.200')] +[2023-10-09 06:27:37,281][60144] Updated weights for policy 1, policy_version 56482 (0.0009) +[2023-10-09 06:27:37,650][60144] Updated weights for policy 1, policy_version 56492 (0.0007) +[2023-10-09 06:27:38,015][60144] Updated weights for policy 1, policy_version 56502 (0.0007) +[2023-10-09 06:27:38,379][60144] Updated weights for policy 1, policy_version 56512 (0.0008) +[2023-10-09 06:27:38,667][60143] Updated weights for policy 0, policy_version 55842 (0.0007) +[2023-10-09 06:27:39,036][60143] Updated weights for policy 0, policy_version 55852 (0.0009) +[2023-10-09 06:27:39,401][60143] Updated weights for policy 0, policy_version 55862 (0.0009) +[2023-10-09 06:27:39,765][60143] Updated weights for policy 0, policy_version 55872 (0.0009) +[2023-10-09 06:27:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 115081216. Throughput: 0: 1697.6, 1: 1721.4. Samples: 28776502. Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:41,053][59242] Avg episode reward: [(0, '29.830'), (1, '31.440')] +[2023-10-09 06:27:42,344][60144] Updated weights for policy 1, policy_version 56522 (0.0010) +[2023-10-09 06:27:42,710][60144] Updated weights for policy 1, policy_version 56532 (0.0011) +[2023-10-09 06:27:43,085][60144] Updated weights for policy 1, policy_version 56542 (0.0011) +[2023-10-09 06:27:43,638][60143] Updated weights for policy 0, policy_version 55882 (0.0011) +[2023-10-09 06:27:43,997][60143] Updated weights for policy 0, policy_version 55892 (0.0010) +[2023-10-09 06:27:44,372][60143] Updated weights for policy 0, policy_version 55902 (0.0010) +[2023-10-09 06:27:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 115146752. Throughput: 0: 1690.4, 1: 1752.7. Samples: 28797538. 
Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:46,053][59242] Avg episode reward: [(0, '29.370'), (1, '31.370')] +[2023-10-09 06:27:46,778][60144] Updated weights for policy 1, policy_version 56552 (0.0008) +[2023-10-09 06:27:47,140][60144] Updated weights for policy 1, policy_version 56562 (0.0009) +[2023-10-09 06:27:47,521][60144] Updated weights for policy 1, policy_version 56572 (0.0009) +[2023-10-09 06:27:48,521][60143] Updated weights for policy 0, policy_version 55912 (0.0009) +[2023-10-09 06:27:48,897][60143] Updated weights for policy 0, policy_version 55922 (0.0007) +[2023-10-09 06:27:49,257][60143] Updated weights for policy 0, policy_version 55932 (0.0010) +[2023-10-09 06:27:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115212288. Throughput: 0: 1709.9, 1: 1719.9. Samples: 28807768. Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:51,052][59242] Avg episode reward: [(0, '29.050'), (1, '32.380')] +[2023-10-09 06:27:51,550][60144] Updated weights for policy 1, policy_version 56582 (0.0008) +[2023-10-09 06:27:51,921][60144] Updated weights for policy 1, policy_version 56592 (0.0007) +[2023-10-09 06:27:52,284][60144] Updated weights for policy 1, policy_version 56602 (0.0007) +[2023-10-09 06:27:53,345][60143] Updated weights for policy 0, policy_version 55942 (0.0008) +[2023-10-09 06:27:53,734][60143] Updated weights for policy 0, policy_version 55952 (0.0009) +[2023-10-09 06:27:54,104][60143] Updated weights for policy 0, policy_version 55962 (0.0011) +[2023-10-09 06:27:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115277824. Throughput: 0: 1688.2, 1: 1734.5. Samples: 28827848. 
Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:27:56,053][59242] Avg episode reward: [(0, '30.100'), (1, '32.570')] +[2023-10-09 06:27:56,225][60144] Updated weights for policy 1, policy_version 56612 (0.0008) +[2023-10-09 06:27:56,596][60144] Updated weights for policy 1, policy_version 56622 (0.0011) +[2023-10-09 06:27:56,956][60144] Updated weights for policy 1, policy_version 56632 (0.0008) +[2023-10-09 06:27:57,901][60143] Updated weights for policy 0, policy_version 55972 (0.0010) +[2023-10-09 06:27:58,275][60143] Updated weights for policy 0, policy_version 55982 (0.0010) +[2023-10-09 06:27:58,642][60143] Updated weights for policy 0, policy_version 55992 (0.0008) +[2023-10-09 06:28:00,914][60144] Updated weights for policy 1, policy_version 56642 (0.0010) +[2023-10-09 06:28:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 115343360. Throughput: 0: 1704.9, 1: 1755.0. Samples: 28849330. Policy #0 lag: (min: 23.0, avg: 25.2, max: 52.0) +[2023-10-09 06:28:01,053][59242] Avg episode reward: [(0, '31.210'), (1, '31.180')] +[2023-10-09 06:28:01,329][60144] Updated weights for policy 1, policy_version 56652 (0.0007) +[2023-10-09 06:28:01,702][60144] Updated weights for policy 1, policy_version 56662 (0.0009) +[2023-10-09 06:28:02,071][60144] Updated weights for policy 1, policy_version 56672 (0.0008) +[2023-10-09 06:28:02,803][60143] Updated weights for policy 0, policy_version 56002 (0.0008) +[2023-10-09 06:28:03,167][60143] Updated weights for policy 0, policy_version 56012 (0.0009) +[2023-10-09 06:28:03,538][60143] Updated weights for policy 0, policy_version 56022 (0.0010) +[2023-10-09 06:28:03,916][60143] Updated weights for policy 0, policy_version 56032 (0.0009) +[2023-10-09 06:28:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 115408896. Throughput: 0: 1693.1, 1: 1720.2. Samples: 28858810. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:06,052][59242] Avg episode reward: [(0, '31.560'), (1, '32.140')] +[2023-10-09 06:28:06,188][60144] Updated weights for policy 1, policy_version 56682 (0.0009) +[2023-10-09 06:28:06,547][60144] Updated weights for policy 1, policy_version 56692 (0.0007) +[2023-10-09 06:28:06,923][60144] Updated weights for policy 1, policy_version 56702 (0.0007) +[2023-10-09 06:28:07,935][60143] Updated weights for policy 0, policy_version 56042 (0.0007) +[2023-10-09 06:28:08,301][60143] Updated weights for policy 0, policy_version 56052 (0.0007) +[2023-10-09 06:28:08,669][60143] Updated weights for policy 0, policy_version 56062 (0.0009) +[2023-10-09 06:28:10,835][60144] Updated weights for policy 1, policy_version 56712 (0.0008) +[2023-10-09 06:28:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 115474432. Throughput: 0: 1684.6, 1: 1745.1. Samples: 28879372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:11,052][59242] Avg episode reward: [(0, '30.290'), (1, '32.160')] +[2023-10-09 06:28:11,208][60144] Updated weights for policy 1, policy_version 56722 (0.0007) +[2023-10-09 06:28:11,586][60144] Updated weights for policy 1, policy_version 56732 (0.0009) +[2023-10-09 06:28:12,708][60143] Updated weights for policy 0, policy_version 56072 (0.0008) +[2023-10-09 06:28:13,072][60143] Updated weights for policy 0, policy_version 56082 (0.0008) +[2023-10-09 06:28:13,448][60143] Updated weights for policy 0, policy_version 56092 (0.0007) +[2023-10-09 06:28:15,693][60144] Updated weights for policy 1, policy_version 56742 (0.0008) +[2023-10-09 06:28:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 115539968. Throughput: 0: 1712.7, 1: 1735.5. Samples: 28900522. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:16,052][59242] Avg episode reward: [(0, '29.730'), (1, '32.110')] +[2023-10-09 06:28:16,063][60144] Updated weights for policy 1, policy_version 56752 (0.0009) +[2023-10-09 06:28:16,432][60144] Updated weights for policy 1, policy_version 56762 (0.0009) +[2023-10-09 06:28:17,458][60143] Updated weights for policy 0, policy_version 56102 (0.0009) +[2023-10-09 06:28:17,828][60143] Updated weights for policy 0, policy_version 56112 (0.0009) +[2023-10-09 06:28:18,192][60143] Updated weights for policy 0, policy_version 56122 (0.0007) +[2023-10-09 06:28:20,349][60144] Updated weights for policy 1, policy_version 56772 (0.0009) +[2023-10-09 06:28:20,722][60144] Updated weights for policy 1, policy_version 56782 (0.0008) +[2023-10-09 06:28:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 115605504. Throughput: 0: 1680.7, 1: 1737.9. Samples: 28910052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:21,053][59242] Avg episode reward: [(0, '29.820'), (1, '32.210')] +[2023-10-09 06:28:21,080][60144] Updated weights for policy 1, policy_version 56792 (0.0007) +[2023-10-09 06:28:22,251][60143] Updated weights for policy 0, policy_version 56132 (0.0008) +[2023-10-09 06:28:22,618][60143] Updated weights for policy 0, policy_version 56142 (0.0008) +[2023-10-09 06:28:22,984][60143] Updated weights for policy 0, policy_version 56152 (0.0007) +[2023-10-09 06:28:24,976][60144] Updated weights for policy 1, policy_version 56802 (0.0009) +[2023-10-09 06:28:25,342][60144] Updated weights for policy 1, policy_version 56812 (0.0007) +[2023-10-09 06:28:25,715][60144] Updated weights for policy 1, policy_version 56822 (0.0007) +[2023-10-09 06:28:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 115671040. Throughput: 0: 1702.1, 1: 1739.3. Samples: 28931368. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:26,053][59242] Avg episode reward: [(0, '28.760'), (1, '32.990')] +[2023-10-09 06:28:26,077][60144] Updated weights for policy 1, policy_version 56832 (0.0009) +[2023-10-09 06:28:26,886][60143] Updated weights for policy 0, policy_version 56162 (0.0007) +[2023-10-09 06:28:27,263][60143] Updated weights for policy 0, policy_version 56172 (0.0007) +[2023-10-09 06:28:27,636][60143] Updated weights for policy 0, policy_version 56182 (0.0007) +[2023-10-09 06:28:28,002][60143] Updated weights for policy 0, policy_version 56192 (0.0007) +[2023-10-09 06:28:30,111][60144] Updated weights for policy 1, policy_version 56842 (0.0007) +[2023-10-09 06:28:30,489][60144] Updated weights for policy 1, policy_version 56852 (0.0007) +[2023-10-09 06:28:30,864][60144] Updated weights for policy 1, policy_version 56862 (0.0008) +[2023-10-09 06:28:31,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115769344. Throughput: 0: 1710.0, 1: 1713.8. Samples: 28951610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:31,053][59242] Avg episode reward: [(0, '28.730'), (1, '32.050')] +[2023-10-09 06:28:31,915][60143] Updated weights for policy 0, policy_version 56202 (0.0010) +[2023-10-09 06:28:32,277][60143] Updated weights for policy 0, policy_version 56212 (0.0009) +[2023-10-09 06:28:32,648][60143] Updated weights for policy 0, policy_version 56222 (0.0008) +[2023-10-09 06:28:34,874][60144] Updated weights for policy 1, policy_version 56872 (0.0007) +[2023-10-09 06:28:35,241][60144] Updated weights for policy 1, policy_version 56882 (0.0009) +[2023-10-09 06:28:35,608][60144] Updated weights for policy 1, policy_version 56892 (0.0010) +[2023-10-09 06:28:36,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115834880. Throughput: 0: 1687.4, 1: 1733.7. Samples: 28961718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:36,052][59242] Avg episode reward: [(0, '28.840'), (1, '31.050')] +[2023-10-09 06:28:36,661][60143] Updated weights for policy 0, policy_version 56232 (0.0010) +[2023-10-09 06:28:37,031][60143] Updated weights for policy 0, policy_version 56242 (0.0011) +[2023-10-09 06:28:37,397][60143] Updated weights for policy 0, policy_version 56252 (0.0011) +[2023-10-09 06:28:39,382][60144] Updated weights for policy 1, policy_version 56902 (0.0009) +[2023-10-09 06:28:39,746][60144] Updated weights for policy 1, policy_version 56912 (0.0009) +[2023-10-09 06:28:40,108][60144] Updated weights for policy 1, policy_version 56922 (0.0008) +[2023-10-09 06:28:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115900416. Throughput: 0: 1712.0, 1: 1722.7. Samples: 28982408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 06:28:41,052][59242] Avg episode reward: [(0, '29.300'), (1, '30.730')] +[2023-10-09 06:28:41,423][60143] Updated weights for policy 0, policy_version 56262 (0.0009) +[2023-10-09 06:28:41,801][60143] Updated weights for policy 0, policy_version 56272 (0.0009) +[2023-10-09 06:28:42,175][60143] Updated weights for policy 0, policy_version 56282 (0.0008) +[2023-10-09 06:28:44,144][60144] Updated weights for policy 1, policy_version 56932 (0.0009) +[2023-10-09 06:28:44,501][60144] Updated weights for policy 1, policy_version 56942 (0.0007) +[2023-10-09 06:28:44,856][60144] Updated weights for policy 1, policy_version 56952 (0.0007) +[2023-10-09 06:28:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 115965952. Throughput: 0: 1709.8, 1: 1700.6. Samples: 29002798. 
Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:28:46,053][59242] Avg episode reward: [(0, '30.520'), (1, '32.300')] +[2023-10-09 06:28:46,066][60143] Updated weights for policy 0, policy_version 56292 (0.0007) +[2023-10-09 06:28:46,438][60143] Updated weights for policy 0, policy_version 56302 (0.0009) +[2023-10-09 06:28:46,811][60143] Updated weights for policy 0, policy_version 56312 (0.0009) +[2023-10-09 06:28:48,777][60144] Updated weights for policy 1, policy_version 56962 (0.0008) +[2023-10-09 06:28:49,163][60144] Updated weights for policy 1, policy_version 56972 (0.0008) +[2023-10-09 06:28:49,540][60144] Updated weights for policy 1, policy_version 56982 (0.0008) +[2023-10-09 06:28:49,909][60144] Updated weights for policy 1, policy_version 56992 (0.0009) +[2023-10-09 06:28:50,750][60143] Updated weights for policy 0, policy_version 56322 (0.0008) +[2023-10-09 06:28:51,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13662.6). Total num frames: 116031488. Throughput: 0: 1703.3, 1: 1735.7. Samples: 29013568. Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:28:51,054][59242] Avg episode reward: [(0, '30.450'), (1, '32.830')] +[2023-10-09 06:28:51,125][60143] Updated weights for policy 0, policy_version 56332 (0.0009) +[2023-10-09 06:28:51,494][60143] Updated weights for policy 0, policy_version 56342 (0.0008) +[2023-10-09 06:28:51,866][60143] Updated weights for policy 0, policy_version 56352 (0.0009) +[2023-10-09 06:28:53,771][60144] Updated weights for policy 1, policy_version 57002 (0.0009) +[2023-10-09 06:28:54,133][60144] Updated weights for policy 1, policy_version 57012 (0.0010) +[2023-10-09 06:28:54,504][60144] Updated weights for policy 1, policy_version 57022 (0.0011) +[2023-10-09 06:28:55,804][60143] Updated weights for policy 0, policy_version 56362 (0.0008) +[2023-10-09 06:28:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 116097024. 
Throughput: 0: 1721.8, 1: 1706.5. Samples: 29033648. Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:28:56,052][59242] Avg episode reward: [(0, '30.140'), (1, '32.500')] +[2023-10-09 06:28:56,175][60143] Updated weights for policy 0, policy_version 56372 (0.0008) +[2023-10-09 06:28:56,543][60143] Updated weights for policy 0, policy_version 56382 (0.0009) +[2023-10-09 06:28:58,405][60144] Updated weights for policy 1, policy_version 57032 (0.0009) +[2023-10-09 06:28:58,770][60144] Updated weights for policy 1, policy_version 57042 (0.0009) +[2023-10-09 06:28:59,139][60144] Updated weights for policy 1, policy_version 57052 (0.0008) +[2023-10-09 06:29:00,623][60143] Updated weights for policy 0, policy_version 56392 (0.0009) +[2023-10-09 06:29:00,989][60143] Updated weights for policy 0, policy_version 56402 (0.0010) +[2023-10-09 06:29:01,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 116162560. Throughput: 0: 1719.0, 1: 1707.7. Samples: 29054724. Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:29:01,052][59242] Avg episode reward: [(0, '29.040'), (1, '32.320')] +[2023-10-09 06:29:01,364][60143] Updated weights for policy 0, policy_version 56412 (0.0008) +[2023-10-09 06:29:03,161][60144] Updated weights for policy 1, policy_version 57062 (0.0009) +[2023-10-09 06:29:03,537][60144] Updated weights for policy 1, policy_version 57072 (0.0008) +[2023-10-09 06:29:03,914][60144] Updated weights for policy 1, policy_version 57082 (0.0009) +[2023-10-09 06:29:05,506][60143] Updated weights for policy 0, policy_version 56422 (0.0010) +[2023-10-09 06:29:05,868][60143] Updated weights for policy 0, policy_version 56432 (0.0010) +[2023-10-09 06:29:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 116228096. Throughput: 0: 1723.9, 1: 1711.8. Samples: 29064656. 
Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:29:06,053][59242] Avg episode reward: [(0, '28.810'), (1, '32.370')] +[2023-10-09 06:29:06,238][60143] Updated weights for policy 0, policy_version 56442 (0.0012) +[2023-10-09 06:29:08,001][60144] Updated weights for policy 1, policy_version 57092 (0.0009) +[2023-10-09 06:29:08,366][60144] Updated weights for policy 1, policy_version 57102 (0.0009) +[2023-10-09 06:29:08,735][60144] Updated weights for policy 1, policy_version 57112 (0.0007) +[2023-10-09 06:29:10,234][60143] Updated weights for policy 0, policy_version 56452 (0.0009) +[2023-10-09 06:29:10,605][60143] Updated weights for policy 0, policy_version 56462 (0.0007) +[2023-10-09 06:29:10,977][60143] Updated weights for policy 0, policy_version 56472 (0.0007) +[2023-10-09 06:29:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 116293632. Throughput: 0: 1717.9, 1: 1694.1. Samples: 29084906. Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:29:11,052][59242] Avg episode reward: [(0, '29.760'), (1, '33.070')] +[2023-10-09 06:29:12,629][60144] Updated weights for policy 1, policy_version 57122 (0.0012) +[2023-10-09 06:29:13,003][60144] Updated weights for policy 1, policy_version 57132 (0.0012) +[2023-10-09 06:29:13,369][60144] Updated weights for policy 1, policy_version 57142 (0.0007) +[2023-10-09 06:29:13,727][60144] Updated weights for policy 1, policy_version 57152 (0.0008) +[2023-10-09 06:29:14,881][60143] Updated weights for policy 0, policy_version 56482 (0.0007) +[2023-10-09 06:29:15,249][60143] Updated weights for policy 0, policy_version 56492 (0.0008) +[2023-10-09 06:29:15,623][60143] Updated weights for policy 0, policy_version 56502 (0.0010) +[2023-10-09 06:29:15,989][60143] Updated weights for policy 0, policy_version 56512 (0.0009) +[2023-10-09 06:29:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 116391936. 
Throughput: 0: 1705.1, 1: 1716.3. Samples: 29105574. Policy #0 lag: (min: 11.0, avg: 15.5, max: 43.0) +[2023-10-09 06:29:16,053][59242] Avg episode reward: [(0, '30.350'), (1, '30.630')] +[2023-10-09 06:29:17,559][60144] Updated weights for policy 1, policy_version 57162 (0.0009) +[2023-10-09 06:29:17,924][60144] Updated weights for policy 1, policy_version 57172 (0.0010) +[2023-10-09 06:29:18,298][60144] Updated weights for policy 1, policy_version 57182 (0.0010) +[2023-10-09 06:29:20,003][60143] Updated weights for policy 0, policy_version 56522 (0.0007) +[2023-10-09 06:29:20,375][60143] Updated weights for policy 0, policy_version 56532 (0.0007) +[2023-10-09 06:29:20,738][60143] Updated weights for policy 0, policy_version 56542 (0.0007) +[2023-10-09 06:29:21,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 116457472. Throughput: 0: 1722.7, 1: 1702.8. Samples: 29115866. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:21,053][59242] Avg episode reward: [(0, '30.970'), (1, '30.850')] +[2023-10-09 06:29:22,240][60144] Updated weights for policy 1, policy_version 57192 (0.0008) +[2023-10-09 06:29:22,607][60144] Updated weights for policy 1, policy_version 57202 (0.0008) +[2023-10-09 06:29:22,974][60144] Updated weights for policy 1, policy_version 57212 (0.0009) +[2023-10-09 06:29:24,529][60143] Updated weights for policy 0, policy_version 56552 (0.0009) +[2023-10-09 06:29:24,904][60143] Updated weights for policy 0, policy_version 56562 (0.0007) +[2023-10-09 06:29:25,271][60143] Updated weights for policy 0, policy_version 56572 (0.0008) +[2023-10-09 06:29:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 116523008. Throughput: 0: 1723.7, 1: 1713.6. Samples: 29137090. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:26,053][59242] Avg episode reward: [(0, '31.220'), (1, '31.140')] +[2023-10-09 06:29:26,918][60144] Updated weights for policy 1, policy_version 57222 (0.0009) +[2023-10-09 06:29:27,286][60144] Updated weights for policy 1, policy_version 57232 (0.0010) +[2023-10-09 06:29:27,657][60144] Updated weights for policy 1, policy_version 57242 (0.0008) +[2023-10-09 06:29:29,201][60143] Updated weights for policy 0, policy_version 56582 (0.0010) +[2023-10-09 06:29:29,585][60143] Updated weights for policy 0, policy_version 56592 (0.0008) +[2023-10-09 06:29:29,957][60143] Updated weights for policy 0, policy_version 56602 (0.0011) +[2023-10-09 06:29:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 116588544. Throughput: 0: 1700.4, 1: 1733.4. Samples: 29157318. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:31,053][59242] Avg episode reward: [(0, '30.400'), (1, '32.620')] +[2023-10-09 06:29:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000057248_58621952.pth... +[2023-10-09 06:29:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000056608_57966592.pth... 
+[2023-10-09 06:29:31,097][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000055008_56328192.pth +[2023-10-09 06:29:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000055648_56983552.pth +[2023-10-09 06:29:31,657][60144] Updated weights for policy 1, policy_version 57252 (0.0008) +[2023-10-09 06:29:32,021][60144] Updated weights for policy 1, policy_version 57262 (0.0009) +[2023-10-09 06:29:32,394][60144] Updated weights for policy 1, policy_version 57272 (0.0009) +[2023-10-09 06:29:33,889][60143] Updated weights for policy 0, policy_version 56612 (0.0010) +[2023-10-09 06:29:34,261][60143] Updated weights for policy 0, policy_version 56622 (0.0009) +[2023-10-09 06:29:34,633][60143] Updated weights for policy 0, policy_version 56632 (0.0008) +[2023-10-09 06:29:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 116654080. Throughput: 0: 1726.9, 1: 1699.9. Samples: 29167772. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:36,053][59242] Avg episode reward: [(0, '30.290'), (1, '32.330')] +[2023-10-09 06:29:36,541][60144] Updated weights for policy 1, policy_version 57282 (0.0008) +[2023-10-09 06:29:36,930][60144] Updated weights for policy 1, policy_version 57292 (0.0010) +[2023-10-09 06:29:37,285][60144] Updated weights for policy 1, policy_version 57302 (0.0011) +[2023-10-09 06:29:37,648][60144] Updated weights for policy 1, policy_version 57312 (0.0010) +[2023-10-09 06:29:38,737][60143] Updated weights for policy 0, policy_version 56642 (0.0007) +[2023-10-09 06:29:39,110][60143] Updated weights for policy 0, policy_version 56652 (0.0008) +[2023-10-09 06:29:39,474][60143] Updated weights for policy 0, policy_version 56662 (0.0008) +[2023-10-09 06:29:39,847][60143] Updated weights for policy 0, policy_version 56672 (0.0007) +[2023-10-09 06:29:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). 
Total num frames: 116719616. Throughput: 0: 1700.8, 1: 1728.0. Samples: 29187944. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:41,053][59242] Avg episode reward: [(0, '28.140'), (1, '32.640')] +[2023-10-09 06:29:41,393][60144] Updated weights for policy 1, policy_version 57322 (0.0009) +[2023-10-09 06:29:41,764][60144] Updated weights for policy 1, policy_version 57332 (0.0008) +[2023-10-09 06:29:42,132][60144] Updated weights for policy 1, policy_version 57342 (0.0007) +[2023-10-09 06:29:43,656][60143] Updated weights for policy 0, policy_version 56682 (0.0008) +[2023-10-09 06:29:44,029][60143] Updated weights for policy 0, policy_version 56692 (0.0008) +[2023-10-09 06:29:44,413][60143] Updated weights for policy 0, policy_version 56702 (0.0007) +[2023-10-09 06:29:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 116785152. Throughput: 0: 1698.0, 1: 1732.3. Samples: 29209086. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:46,052][59242] Avg episode reward: [(0, '28.910'), (1, '31.880')] +[2023-10-09 06:29:46,147][60144] Updated weights for policy 1, policy_version 57352 (0.0008) +[2023-10-09 06:29:46,505][60144] Updated weights for policy 1, policy_version 57362 (0.0008) +[2023-10-09 06:29:46,880][60144] Updated weights for policy 1, policy_version 57372 (0.0009) +[2023-10-09 06:29:48,332][60143] Updated weights for policy 0, policy_version 56712 (0.0009) +[2023-10-09 06:29:48,705][60143] Updated weights for policy 0, policy_version 56722 (0.0007) +[2023-10-09 06:29:49,075][60143] Updated weights for policy 0, policy_version 56732 (0.0010) +[2023-10-09 06:29:50,794][60144] Updated weights for policy 1, policy_version 57382 (0.0009) +[2023-10-09 06:29:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 116850688. Throughput: 0: 1715.7, 1: 1721.2. Samples: 29219314. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:51,053][59242] Avg episode reward: [(0, '28.900'), (1, '31.360')] +[2023-10-09 06:29:51,159][60144] Updated weights for policy 1, policy_version 57392 (0.0011) +[2023-10-09 06:29:51,526][60144] Updated weights for policy 1, policy_version 57402 (0.0010) +[2023-10-09 06:29:53,089][60143] Updated weights for policy 0, policy_version 56742 (0.0009) +[2023-10-09 06:29:53,465][60143] Updated weights for policy 0, policy_version 56752 (0.0009) +[2023-10-09 06:29:53,823][60143] Updated weights for policy 0, policy_version 56762 (0.0008) +[2023-10-09 06:29:55,513][60144] Updated weights for policy 1, policy_version 57412 (0.0011) +[2023-10-09 06:29:55,886][60144] Updated weights for policy 1, policy_version 57422 (0.0010) +[2023-10-09 06:29:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 116916224. Throughput: 0: 1705.8, 1: 1742.3. Samples: 29240070. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-09 06:29:56,053][59242] Avg episode reward: [(0, '28.890'), (1, '31.420')] +[2023-10-09 06:29:56,245][60144] Updated weights for policy 1, policy_version 57432 (0.0011) +[2023-10-09 06:29:57,802][60143] Updated weights for policy 0, policy_version 56772 (0.0008) +[2023-10-09 06:29:58,182][60143] Updated weights for policy 0, policy_version 56782 (0.0008) +[2023-10-09 06:29:58,546][60143] Updated weights for policy 0, policy_version 56792 (0.0008) +[2023-10-09 06:30:00,204][60144] Updated weights for policy 1, policy_version 57442 (0.0008) +[2023-10-09 06:30:00,572][60144] Updated weights for policy 1, policy_version 57452 (0.0008) +[2023-10-09 06:30:00,943][60144] Updated weights for policy 1, policy_version 57462 (0.0008) +[2023-10-09 06:30:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 116981760. Throughput: 0: 1720.4, 1: 1728.9. Samples: 29260792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:01,053][59242] Avg episode reward: [(0, '28.540'), (1, '32.800')] +[2023-10-09 06:30:01,314][60144] Updated weights for policy 1, policy_version 57472 (0.0007) +[2023-10-09 06:30:02,525][60143] Updated weights for policy 0, policy_version 56802 (0.0008) +[2023-10-09 06:30:02,884][60143] Updated weights for policy 0, policy_version 56812 (0.0009) +[2023-10-09 06:30:03,260][60143] Updated weights for policy 0, policy_version 56822 (0.0007) +[2023-10-09 06:30:03,633][60143] Updated weights for policy 0, policy_version 56832 (0.0009) +[2023-10-09 06:30:05,144][60144] Updated weights for policy 1, policy_version 57482 (0.0009) +[2023-10-09 06:30:05,515][60144] Updated weights for policy 1, policy_version 57492 (0.0009) +[2023-10-09 06:30:05,884][60144] Updated weights for policy 1, policy_version 57502 (0.0008) +[2023-10-09 06:30:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 117080064. Throughput: 0: 1709.1, 1: 1734.7. Samples: 29270836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:06,053][59242] Avg episode reward: [(0, '30.180'), (1, '32.700')] +[2023-10-09 06:30:07,722][60143] Updated weights for policy 0, policy_version 56842 (0.0007) +[2023-10-09 06:30:08,092][60143] Updated weights for policy 0, policy_version 56852 (0.0008) +[2023-10-09 06:30:08,460][60143] Updated weights for policy 0, policy_version 56862 (0.0009) +[2023-10-09 06:30:09,848][60144] Updated weights for policy 1, policy_version 57512 (0.0010) +[2023-10-09 06:30:10,221][60144] Updated weights for policy 1, policy_version 57522 (0.0007) +[2023-10-09 06:30:10,575][60144] Updated weights for policy 1, policy_version 57532 (0.0009) +[2023-10-09 06:30:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 117145600. Throughput: 0: 1699.3, 1: 1732.7. Samples: 29291532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:11,053][59242] Avg episode reward: [(0, '29.850'), (1, '33.510')] +[2023-10-09 06:30:12,340][60143] Updated weights for policy 0, policy_version 56872 (0.0008) +[2023-10-09 06:30:12,705][60143] Updated weights for policy 0, policy_version 56882 (0.0009) +[2023-10-09 06:30:13,074][60143] Updated weights for policy 0, policy_version 56892 (0.0007) +[2023-10-09 06:30:14,698][60144] Updated weights for policy 1, policy_version 57542 (0.0007) +[2023-10-09 06:30:15,073][60144] Updated weights for policy 1, policy_version 57552 (0.0008) +[2023-10-09 06:30:15,438][60144] Updated weights for policy 1, policy_version 57562 (0.0009) +[2023-10-09 06:30:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 117211136. Throughput: 0: 1726.2, 1: 1702.9. Samples: 29311626. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:16,053][59242] Avg episode reward: [(0, '29.720'), (1, '34.120')] +[2023-10-09 06:30:17,212][60143] Updated weights for policy 0, policy_version 56902 (0.0009) +[2023-10-09 06:30:17,600][60143] Updated weights for policy 0, policy_version 56912 (0.0010) +[2023-10-09 06:30:17,970][60143] Updated weights for policy 0, policy_version 56922 (0.0009) +[2023-10-09 06:30:19,339][60144] Updated weights for policy 1, policy_version 57572 (0.0009) +[2023-10-09 06:30:19,700][60144] Updated weights for policy 1, policy_version 57582 (0.0008) +[2023-10-09 06:30:20,073][60144] Updated weights for policy 1, policy_version 57592 (0.0007) +[2023-10-09 06:30:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 117276672. Throughput: 0: 1694.1, 1: 1734.6. Samples: 29322060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:21,053][59242] Avg episode reward: [(0, '27.820'), (1, '34.270')] +[2023-10-09 06:30:21,991][60143] Updated weights for policy 0, policy_version 56932 (0.0007) +[2023-10-09 06:30:22,364][60143] Updated weights for policy 0, policy_version 56942 (0.0008) +[2023-10-09 06:30:22,727][60143] Updated weights for policy 0, policy_version 56952 (0.0009) +[2023-10-09 06:30:24,122][60144] Updated weights for policy 1, policy_version 57602 (0.0007) +[2023-10-09 06:30:24,559][60144] Updated weights for policy 1, policy_version 57612 (0.0009) +[2023-10-09 06:30:24,930][60144] Updated weights for policy 1, policy_version 57622 (0.0008) +[2023-10-09 06:30:25,297][60144] Updated weights for policy 1, policy_version 57632 (0.0007) +[2023-10-09 06:30:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 117342208. Throughput: 0: 1714.3, 1: 1721.4. Samples: 29342552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:26,053][59242] Avg episode reward: [(0, '27.130'), (1, '33.590')] +[2023-10-09 06:30:26,661][60143] Updated weights for policy 0, policy_version 56962 (0.0009) +[2023-10-09 06:30:27,036][60143] Updated weights for policy 0, policy_version 56972 (0.0010) +[2023-10-09 06:30:27,399][60143] Updated weights for policy 0, policy_version 56982 (0.0009) +[2023-10-09 06:30:27,769][60143] Updated weights for policy 0, policy_version 56992 (0.0008) +[2023-10-09 06:30:29,121][60144] Updated weights for policy 1, policy_version 57642 (0.0009) +[2023-10-09 06:30:29,490][60144] Updated weights for policy 1, policy_version 57652 (0.0008) +[2023-10-09 06:30:29,855][60144] Updated weights for policy 1, policy_version 57662 (0.0008) +[2023-10-09 06:30:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 117407744. Throughput: 0: 1719.5, 1: 1701.5. Samples: 29363032. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:31,053][59242] Avg episode reward: [(0, '27.040'), (1, '33.870')] +[2023-10-09 06:30:31,818][60143] Updated weights for policy 0, policy_version 57002 (0.0009) +[2023-10-09 06:30:32,190][60143] Updated weights for policy 0, policy_version 57012 (0.0008) +[2023-10-09 06:30:32,553][60143] Updated weights for policy 0, policy_version 57022 (0.0007) +[2023-10-09 06:30:33,782][60144] Updated weights for policy 1, policy_version 57672 (0.0010) +[2023-10-09 06:30:34,144][60144] Updated weights for policy 1, policy_version 57682 (0.0010) +[2023-10-09 06:30:34,519][60144] Updated weights for policy 1, policy_version 57692 (0.0009) +[2023-10-09 06:30:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 117473280. Throughput: 0: 1694.3, 1: 1729.6. Samples: 29373388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:30:36,053][59242] Avg episode reward: [(0, '26.620'), (1, '33.000')] +[2023-10-09 06:30:36,576][60143] Updated weights for policy 0, policy_version 57032 (0.0009) +[2023-10-09 06:30:36,935][60143] Updated weights for policy 0, policy_version 57042 (0.0010) +[2023-10-09 06:30:37,309][60143] Updated weights for policy 0, policy_version 57052 (0.0009) +[2023-10-09 06:30:38,355][60144] Updated weights for policy 1, policy_version 57702 (0.0008) +[2023-10-09 06:30:38,728][60144] Updated weights for policy 1, policy_version 57712 (0.0007) +[2023-10-09 06:30:39,100][60144] Updated weights for policy 1, policy_version 57722 (0.0009) +[2023-10-09 06:30:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 117538816. Throughput: 0: 1713.1, 1: 1699.6. Samples: 29393642. 
Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:30:41,052][59242] Avg episode reward: [(0, '24.790'), (1, '34.770')] +[2023-10-09 06:30:41,180][60143] Updated weights for policy 0, policy_version 57062 (0.0009) +[2023-10-09 06:30:41,559][60143] Updated weights for policy 0, policy_version 57072 (0.0010) +[2023-10-09 06:30:41,927][60143] Updated weights for policy 0, policy_version 57082 (0.0009) +[2023-10-09 06:30:42,985][60144] Updated weights for policy 1, policy_version 57732 (0.0007) +[2023-10-09 06:30:43,359][60144] Updated weights for policy 1, policy_version 57742 (0.0009) +[2023-10-09 06:30:43,722][60144] Updated weights for policy 1, policy_version 57752 (0.0009) +[2023-10-09 06:30:45,994][60143] Updated weights for policy 0, policy_version 57092 (0.0009) +[2023-10-09 06:30:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 117604352. Throughput: 0: 1710.5, 1: 1712.3. Samples: 29414816. Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:30:46,052][59242] Avg episode reward: [(0, '26.390'), (1, '34.700')] +[2023-10-09 06:30:46,376][60143] Updated weights for policy 0, policy_version 57102 (0.0008) +[2023-10-09 06:30:46,746][60143] Updated weights for policy 0, policy_version 57112 (0.0009) +[2023-10-09 06:30:47,639][60144] Updated weights for policy 1, policy_version 57762 (0.0009) +[2023-10-09 06:30:48,002][60144] Updated weights for policy 1, policy_version 57772 (0.0008) +[2023-10-09 06:30:48,375][60144] Updated weights for policy 1, policy_version 57782 (0.0007) +[2023-10-09 06:30:48,737][60144] Updated weights for policy 1, policy_version 57792 (0.0007) +[2023-10-09 06:30:50,670][60143] Updated weights for policy 0, policy_version 57122 (0.0009) +[2023-10-09 06:30:51,047][60143] Updated weights for policy 0, policy_version 57132 (0.0009) +[2023-10-09 06:30:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 117669888. 
Throughput: 0: 1704.4, 1: 1705.6. Samples: 29424286. Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:30:51,052][59242] Avg episode reward: [(0, '27.540'), (1, '34.590')] +[2023-10-09 06:30:51,432][60143] Updated weights for policy 0, policy_version 57142 (0.0009) +[2023-10-09 06:30:51,801][60143] Updated weights for policy 0, policy_version 57152 (0.0009) +[2023-10-09 06:30:52,806][60144] Updated weights for policy 1, policy_version 57802 (0.0009) +[2023-10-09 06:30:53,168][60144] Updated weights for policy 1, policy_version 57812 (0.0008) +[2023-10-09 06:30:53,535][60144] Updated weights for policy 1, policy_version 57822 (0.0007) +[2023-10-09 06:30:55,621][60143] Updated weights for policy 0, policy_version 57162 (0.0008) +[2023-10-09 06:30:55,986][60143] Updated weights for policy 0, policy_version 57172 (0.0008) +[2023-10-09 06:30:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 117735424. Throughput: 0: 1718.5, 1: 1700.6. Samples: 29445392. Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:30:56,053][59242] Avg episode reward: [(0, '29.080'), (1, '35.190')] +[2023-10-09 06:30:56,360][60143] Updated weights for policy 0, policy_version 57182 (0.0008) +[2023-10-09 06:30:57,418][60144] Updated weights for policy 1, policy_version 57832 (0.0007) +[2023-10-09 06:30:57,786][60144] Updated weights for policy 1, policy_version 57842 (0.0007) +[2023-10-09 06:30:58,148][60144] Updated weights for policy 1, policy_version 57852 (0.0009) +[2023-10-09 06:31:00,403][60143] Updated weights for policy 0, policy_version 57192 (0.0008) +[2023-10-09 06:31:00,768][60143] Updated weights for policy 0, policy_version 57202 (0.0007) +[2023-10-09 06:31:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 117800960. Throughput: 0: 1703.4, 1: 1735.1. Samples: 29466358. 
Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:31:01,052][59242] Avg episode reward: [(0, '28.800'), (1, '35.340')] +[2023-10-09 06:31:01,126][60143] Updated weights for policy 0, policy_version 57212 (0.0010) +[2023-10-09 06:31:01,927][60144] Updated weights for policy 1, policy_version 57862 (0.0011) +[2023-10-09 06:31:02,300][60144] Updated weights for policy 1, policy_version 57872 (0.0009) +[2023-10-09 06:31:02,676][60144] Updated weights for policy 1, policy_version 57882 (0.0009) +[2023-10-09 06:31:05,307][60143] Updated weights for policy 0, policy_version 57222 (0.0012) +[2023-10-09 06:31:05,684][60143] Updated weights for policy 0, policy_version 57232 (0.0010) +[2023-10-09 06:31:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 117866496. Throughput: 0: 1717.9, 1: 1706.3. Samples: 29476150. Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:31:06,053][59242] Avg episode reward: [(0, '30.510'), (1, '34.160')] +[2023-10-09 06:31:06,059][60143] Updated weights for policy 0, policy_version 57242 (0.0011) +[2023-10-09 06:31:06,648][60144] Updated weights for policy 1, policy_version 57892 (0.0009) +[2023-10-09 06:31:07,017][60144] Updated weights for policy 1, policy_version 57902 (0.0007) +[2023-10-09 06:31:07,388][60144] Updated weights for policy 1, policy_version 57912 (0.0008) +[2023-10-09 06:31:09,887][60143] Updated weights for policy 0, policy_version 57252 (0.0009) +[2023-10-09 06:31:10,266][60143] Updated weights for policy 0, policy_version 57262 (0.0009) +[2023-10-09 06:31:10,639][60143] Updated weights for policy 0, policy_version 57272 (0.0010) +[2023-10-09 06:31:11,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 117964800. Throughput: 0: 1717.7, 1: 1722.0. Samples: 29497340. 
Policy #0 lag: (min: 38.0, avg: 47.7, max: 48.0) +[2023-10-09 06:31:11,053][59242] Avg episode reward: [(0, '31.210'), (1, '32.940')] +[2023-10-09 06:31:11,428][60144] Updated weights for policy 1, policy_version 57922 (0.0008) +[2023-10-09 06:31:11,847][60144] Updated weights for policy 1, policy_version 57932 (0.0009) +[2023-10-09 06:31:12,225][60144] Updated weights for policy 1, policy_version 57942 (0.0010) +[2023-10-09 06:31:12,582][60144] Updated weights for policy 1, policy_version 57952 (0.0010) +[2023-10-09 06:31:14,625][60143] Updated weights for policy 0, policy_version 57282 (0.0009) +[2023-10-09 06:31:14,997][60143] Updated weights for policy 0, policy_version 57292 (0.0007) +[2023-10-09 06:31:15,362][60143] Updated weights for policy 0, policy_version 57302 (0.0009) +[2023-10-09 06:31:15,738][60143] Updated weights for policy 0, policy_version 57312 (0.0008) +[2023-10-09 06:31:16,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 118030336. Throughput: 0: 1695.5, 1: 1734.5. Samples: 29517378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:16,053][59242] Avg episode reward: [(0, '31.180'), (1, '32.590')] +[2023-10-09 06:31:16,495][60144] Updated weights for policy 1, policy_version 57962 (0.0007) +[2023-10-09 06:31:16,870][60144] Updated weights for policy 1, policy_version 57972 (0.0007) +[2023-10-09 06:31:17,235][60144] Updated weights for policy 1, policy_version 57982 (0.0007) +[2023-10-09 06:31:19,695][60143] Updated weights for policy 0, policy_version 57322 (0.0007) +[2023-10-09 06:31:20,070][60143] Updated weights for policy 0, policy_version 57332 (0.0007) +[2023-10-09 06:31:20,438][60143] Updated weights for policy 0, policy_version 57342 (0.0008) +[2023-10-09 06:31:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 118095872. Throughput: 0: 1721.8, 1: 1708.0. Samples: 29527728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:21,052][59242] Avg episode reward: [(0, '30.430'), (1, '33.510')] +[2023-10-09 06:31:21,074][60144] Updated weights for policy 1, policy_version 57992 (0.0007) +[2023-10-09 06:31:21,440][60144] Updated weights for policy 1, policy_version 58002 (0.0008) +[2023-10-09 06:31:21,804][60144] Updated weights for policy 1, policy_version 58012 (0.0008) +[2023-10-09 06:31:24,307][60143] Updated weights for policy 0, policy_version 57352 (0.0010) +[2023-10-09 06:31:24,688][60143] Updated weights for policy 0, policy_version 57362 (0.0010) +[2023-10-09 06:31:25,055][60143] Updated weights for policy 0, policy_version 57372 (0.0011) +[2023-10-09 06:31:25,831][60144] Updated weights for policy 1, policy_version 58022 (0.0008) +[2023-10-09 06:31:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 118161408. Throughput: 0: 1707.8, 1: 1738.1. Samples: 29548708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:26,052][59242] Avg episode reward: [(0, '29.400'), (1, '33.500')] +[2023-10-09 06:31:26,201][60144] Updated weights for policy 1, policy_version 58032 (0.0007) +[2023-10-09 06:31:26,563][60144] Updated weights for policy 1, policy_version 58042 (0.0010) +[2023-10-09 06:31:29,120][60143] Updated weights for policy 0, policy_version 57382 (0.0010) +[2023-10-09 06:31:29,500][60143] Updated weights for policy 0, policy_version 57392 (0.0007) +[2023-10-09 06:31:29,865][60143] Updated weights for policy 0, policy_version 57402 (0.0007) +[2023-10-09 06:31:30,621][60144] Updated weights for policy 1, policy_version 58052 (0.0010) +[2023-10-09 06:31:30,979][60144] Updated weights for policy 1, policy_version 58062 (0.0010) +[2023-10-09 06:31:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 118226944. Throughput: 0: 1692.8, 1: 1730.0. Samples: 29568842. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:31,053][59242] Avg episode reward: [(0, '29.700'), (1, '32.680')] +[2023-10-09 06:31:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000057408_58785792.pth... +[2023-10-09 06:31:31,091][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000055808_57147392.pth +[2023-10-09 06:31:31,353][60144] Updated weights for policy 1, policy_version 58072 (0.0010) +[2023-10-09 06:31:31,634][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000058080_59473920.pth... +[2023-10-09 06:31:31,681][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000056448_57802752.pth +[2023-10-09 06:31:33,720][60143] Updated weights for policy 0, policy_version 57412 (0.0007) +[2023-10-09 06:31:34,096][60143] Updated weights for policy 0, policy_version 57422 (0.0008) +[2023-10-09 06:31:34,457][60143] Updated weights for policy 0, policy_version 57432 (0.0008) +[2023-10-09 06:31:35,289][60144] Updated weights for policy 1, policy_version 58082 (0.0010) +[2023-10-09 06:31:35,657][60144] Updated weights for policy 1, policy_version 58092 (0.0008) +[2023-10-09 06:31:36,023][60144] Updated weights for policy 1, policy_version 58102 (0.0011) +[2023-10-09 06:31:36,052][59242] Fps is (10 sec: 13106.6, 60 sec: 13653.3, 300 sec: 13773.6). Total num frames: 118292480. Throughput: 0: 1723.6, 1: 1726.4. Samples: 29579536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:36,053][59242] Avg episode reward: [(0, '28.930'), (1, '32.500')] +[2023-10-09 06:31:36,395][60144] Updated weights for policy 1, policy_version 58112 (0.0010) +[2023-10-09 06:31:38,428][60143] Updated weights for policy 0, policy_version 57442 (0.0008) +[2023-10-09 06:31:38,800][60143] Updated weights for policy 0, policy_version 57452 (0.0008) +[2023-10-09 06:31:39,172][60143] Updated weights for policy 0, policy_version 57462 (0.0010) +[2023-10-09 06:31:39,544][60143] Updated weights for policy 0, policy_version 57472 (0.0008) +[2023-10-09 06:31:40,315][60144] Updated weights for policy 1, policy_version 58122 (0.0010) +[2023-10-09 06:31:40,678][60144] Updated weights for policy 1, policy_version 58132 (0.0010) +[2023-10-09 06:31:41,041][60144] Updated weights for policy 1, policy_version 58142 (0.0010) +[2023-10-09 06:31:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 118358016. Throughput: 0: 1690.5, 1: 1740.4. Samples: 29599784. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:41,053][59242] Avg episode reward: [(0, '29.700'), (1, '32.770')] +[2023-10-09 06:31:43,658][60143] Updated weights for policy 0, policy_version 57482 (0.0010) +[2023-10-09 06:31:44,030][60143] Updated weights for policy 0, policy_version 57492 (0.0010) +[2023-10-09 06:31:44,405][60143] Updated weights for policy 0, policy_version 57502 (0.0011) +[2023-10-09 06:31:44,844][60144] Updated weights for policy 1, policy_version 58152 (0.0010) +[2023-10-09 06:31:45,213][60144] Updated weights for policy 1, policy_version 58162 (0.0008) +[2023-10-09 06:31:45,582][60144] Updated weights for policy 1, policy_version 58172 (0.0008) +[2023-10-09 06:31:46,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 118456320. Throughput: 0: 1700.6, 1: 1712.0. Samples: 29619924. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:46,053][59242] Avg episode reward: [(0, '30.380'), (1, '31.950')] +[2023-10-09 06:31:48,278][60143] Updated weights for policy 0, policy_version 57512 (0.0010) +[2023-10-09 06:31:48,654][60143] Updated weights for policy 0, policy_version 57522 (0.0010) +[2023-10-09 06:31:49,031][60143] Updated weights for policy 0, policy_version 57532 (0.0008) +[2023-10-09 06:31:49,508][60144] Updated weights for policy 1, policy_version 58182 (0.0007) +[2023-10-09 06:31:49,867][60144] Updated weights for policy 1, policy_version 58192 (0.0008) +[2023-10-09 06:31:50,239][60144] Updated weights for policy 1, policy_version 58202 (0.0008) +[2023-10-09 06:31:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 118521856. Throughput: 0: 1709.7, 1: 1738.9. Samples: 29631340. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:51,053][59242] Avg episode reward: [(0, '31.300'), (1, '32.280')] +[2023-10-09 06:31:53,011][60143] Updated weights for policy 0, policy_version 57542 (0.0009) +[2023-10-09 06:31:53,396][60143] Updated weights for policy 0, policy_version 57552 (0.0007) +[2023-10-09 06:31:53,765][60143] Updated weights for policy 0, policy_version 57562 (0.0009) +[2023-10-09 06:31:54,288][60144] Updated weights for policy 1, policy_version 58212 (0.0009) +[2023-10-09 06:31:54,663][60144] Updated weights for policy 1, policy_version 58222 (0.0009) +[2023-10-09 06:31:55,029][60144] Updated weights for policy 1, policy_version 58232 (0.0007) +[2023-10-09 06:31:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 118587392. Throughput: 0: 1693.4, 1: 1729.2. Samples: 29651358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:31:56,053][59242] Avg episode reward: [(0, '32.430'), (1, '32.000')] +[2023-10-09 06:31:57,665][60143] Updated weights for policy 0, policy_version 57572 (0.0008) +[2023-10-09 06:31:58,032][60143] Updated weights for policy 0, policy_version 57582 (0.0008) +[2023-10-09 06:31:58,411][60143] Updated weights for policy 0, policy_version 57592 (0.0008) +[2023-10-09 06:31:58,852][60144] Updated weights for policy 1, policy_version 58242 (0.0008) +[2023-10-09 06:31:59,256][60144] Updated weights for policy 1, policy_version 58252 (0.0008) +[2023-10-09 06:31:59,620][60144] Updated weights for policy 1, policy_version 58262 (0.0007) +[2023-10-09 06:31:59,987][60144] Updated weights for policy 1, policy_version 58272 (0.0007) +[2023-10-09 06:32:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 118652928. Throughput: 0: 1717.7, 1: 1714.6. Samples: 29671832. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:01,053][59242] Avg episode reward: [(0, '33.980'), (1, '33.310')] +[2023-10-09 06:32:02,345][60143] Updated weights for policy 0, policy_version 57602 (0.0009) +[2023-10-09 06:32:02,703][60143] Updated weights for policy 0, policy_version 57612 (0.0009) +[2023-10-09 06:32:03,078][60143] Updated weights for policy 0, policy_version 57622 (0.0008) +[2023-10-09 06:32:03,448][60143] Updated weights for policy 0, policy_version 57632 (0.0008) +[2023-10-09 06:32:03,857][60144] Updated weights for policy 1, policy_version 58282 (0.0008) +[2023-10-09 06:32:04,228][60144] Updated weights for policy 1, policy_version 58292 (0.0009) +[2023-10-09 06:32:04,604][60144] Updated weights for policy 1, policy_version 58302 (0.0007) +[2023-10-09 06:32:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 118718464. Throughput: 0: 1693.0, 1: 1743.1. Samples: 29682354. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:06,053][59242] Avg episode reward: [(0, '33.140'), (1, '32.580')] +[2023-10-09 06:32:07,335][60143] Updated weights for policy 0, policy_version 57642 (0.0007) +[2023-10-09 06:32:07,710][60143] Updated weights for policy 0, policy_version 57652 (0.0007) +[2023-10-09 06:32:08,086][60143] Updated weights for policy 0, policy_version 57662 (0.0008) +[2023-10-09 06:32:08,532][60144] Updated weights for policy 1, policy_version 58312 (0.0011) +[2023-10-09 06:32:08,892][60144] Updated weights for policy 1, policy_version 58322 (0.0011) +[2023-10-09 06:32:09,255][60144] Updated weights for policy 1, policy_version 58332 (0.0009) +[2023-10-09 06:32:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 118784000. Throughput: 0: 1709.7, 1: 1708.2. Samples: 29702514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:11,052][59242] Avg episode reward: [(0, '30.590'), (1, '34.460')] +[2023-10-09 06:32:11,956][60143] Updated weights for policy 0, policy_version 57672 (0.0007) +[2023-10-09 06:32:12,318][60143] Updated weights for policy 0, policy_version 57682 (0.0008) +[2023-10-09 06:32:12,691][60143] Updated weights for policy 0, policy_version 57692 (0.0008) +[2023-10-09 06:32:13,201][60144] Updated weights for policy 1, policy_version 58342 (0.0008) +[2023-10-09 06:32:13,574][60144] Updated weights for policy 1, policy_version 58352 (0.0008) +[2023-10-09 06:32:13,936][60144] Updated weights for policy 1, policy_version 58362 (0.0008) +[2023-10-09 06:32:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 118849536. Throughput: 0: 1723.8, 1: 1723.4. Samples: 29723966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:16,053][59242] Avg episode reward: [(0, '30.000'), (1, '33.230')] +[2023-10-09 06:32:16,844][60143] Updated weights for policy 0, policy_version 57702 (0.0010) +[2023-10-09 06:32:17,215][60143] Updated weights for policy 0, policy_version 57712 (0.0011) +[2023-10-09 06:32:17,595][60143] Updated weights for policy 0, policy_version 57722 (0.0008) +[2023-10-09 06:32:17,797][60144] Updated weights for policy 1, policy_version 58372 (0.0008) +[2023-10-09 06:32:18,151][60144] Updated weights for policy 1, policy_version 58382 (0.0007) +[2023-10-09 06:32:18,517][60144] Updated weights for policy 1, policy_version 58392 (0.0009) +[2023-10-09 06:32:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 118915072. Throughput: 0: 1692.5, 1: 1732.9. Samples: 29733674. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:21,053][59242] Avg episode reward: [(0, '30.140'), (1, '32.370')] +[2023-10-09 06:32:21,780][60143] Updated weights for policy 0, policy_version 57732 (0.0009) +[2023-10-09 06:32:22,153][60143] Updated weights for policy 0, policy_version 57742 (0.0009) +[2023-10-09 06:32:22,497][60144] Updated weights for policy 1, policy_version 58402 (0.0008) +[2023-10-09 06:32:22,524][60143] Updated weights for policy 0, policy_version 57752 (0.0007) +[2023-10-09 06:32:22,865][60144] Updated weights for policy 1, policy_version 58412 (0.0008) +[2023-10-09 06:32:23,235][60144] Updated weights for policy 1, policy_version 58422 (0.0009) +[2023-10-09 06:32:23,595][60144] Updated weights for policy 1, policy_version 58432 (0.0008) +[2023-10-09 06:32:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 118980608. Throughput: 0: 1723.7, 1: 1720.4. Samples: 29754770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:26,053][59242] Avg episode reward: [(0, '29.690'), (1, '32.680')] +[2023-10-09 06:32:26,365][60143] Updated weights for policy 0, policy_version 57762 (0.0007) +[2023-10-09 06:32:26,743][60143] Updated weights for policy 0, policy_version 57772 (0.0007) +[2023-10-09 06:32:27,113][60143] Updated weights for policy 0, policy_version 57782 (0.0008) +[2023-10-09 06:32:27,484][60143] Updated weights for policy 0, policy_version 57792 (0.0008) +[2023-10-09 06:32:27,568][60144] Updated weights for policy 1, policy_version 58442 (0.0007) +[2023-10-09 06:32:27,936][60144] Updated weights for policy 1, policy_version 58452 (0.0010) +[2023-10-09 06:32:28,309][60144] Updated weights for policy 1, policy_version 58462 (0.0008) +[2023-10-09 06:32:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 119046144. Throughput: 0: 1730.6, 1: 1747.4. Samples: 29776434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:32:31,053][59242] Avg episode reward: [(0, '28.390'), (1, '32.260')] +[2023-10-09 06:32:31,435][60143] Updated weights for policy 0, policy_version 57802 (0.0007) +[2023-10-09 06:32:31,804][60143] Updated weights for policy 0, policy_version 57812 (0.0011) +[2023-10-09 06:32:32,165][60143] Updated weights for policy 0, policy_version 57822 (0.0010) +[2023-10-09 06:32:32,369][60144] Updated weights for policy 1, policy_version 58472 (0.0008) +[2023-10-09 06:32:32,734][60144] Updated weights for policy 1, policy_version 58482 (0.0011) +[2023-10-09 06:32:33,097][60144] Updated weights for policy 1, policy_version 58492 (0.0011) +[2023-10-09 06:32:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 119111680. Throughput: 0: 1711.9, 1: 1717.5. Samples: 29785662. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:32:36,052][59242] Avg episode reward: [(0, '29.000'), (1, '31.640')] +[2023-10-09 06:32:36,295][60143] Updated weights for policy 0, policy_version 57832 (0.0007) +[2023-10-09 06:32:36,674][60143] Updated weights for policy 0, policy_version 57842 (0.0007) +[2023-10-09 06:32:37,043][60143] Updated weights for policy 0, policy_version 57852 (0.0008) +[2023-10-09 06:32:37,058][60144] Updated weights for policy 1, policy_version 58502 (0.0009) +[2023-10-09 06:32:37,432][60144] Updated weights for policy 1, policy_version 58512 (0.0008) +[2023-10-09 06:32:37,792][60144] Updated weights for policy 1, policy_version 58522 (0.0007) +[2023-10-09 06:32:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 119177216. Throughput: 0: 1721.9, 1: 1727.6. Samples: 29806588. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:32:41,053][59242] Avg episode reward: [(0, '28.980'), (1, '33.000')] +[2023-10-09 06:32:41,106][60143] Updated weights for policy 0, policy_version 57862 (0.0008) +[2023-10-09 06:32:41,486][60143] Updated weights for policy 0, policy_version 57872 (0.0010) +[2023-10-09 06:32:41,593][60144] Updated weights for policy 1, policy_version 58532 (0.0008) +[2023-10-09 06:32:41,855][60143] Updated weights for policy 0, policy_version 57882 (0.0007) +[2023-10-09 06:32:41,961][60144] Updated weights for policy 1, policy_version 58542 (0.0010) +[2023-10-09 06:32:42,334][60144] Updated weights for policy 1, policy_version 58552 (0.0010) +[2023-10-09 06:32:45,769][60143] Updated weights for policy 0, policy_version 57892 (0.0007) +[2023-10-09 06:32:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 119242752. Throughput: 0: 1718.6, 1: 1748.9. Samples: 29827870. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:32:46,052][59242] Avg episode reward: [(0, '30.200'), (1, '33.250')] +[2023-10-09 06:32:46,140][60143] Updated weights for policy 0, policy_version 57902 (0.0010) +[2023-10-09 06:32:46,397][60144] Updated weights for policy 1, policy_version 58562 (0.0007) +[2023-10-09 06:32:46,510][60143] Updated weights for policy 0, policy_version 57912 (0.0009) +[2023-10-09 06:32:46,767][60144] Updated weights for policy 1, policy_version 58572 (0.0008) +[2023-10-09 06:32:47,134][60144] Updated weights for policy 1, policy_version 58582 (0.0007) +[2023-10-09 06:32:47,502][60144] Updated weights for policy 1, policy_version 58592 (0.0008) +[2023-10-09 06:32:50,280][60143] Updated weights for policy 0, policy_version 57922 (0.0008) +[2023-10-09 06:32:50,645][60143] Updated weights for policy 0, policy_version 57932 (0.0009) +[2023-10-09 06:32:51,016][60143] Updated weights for policy 0, policy_version 57942 (0.0008) +[2023-10-09 06:32:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 119308288. Throughput: 0: 1724.1, 1: 1720.6. Samples: 29837368. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:32:51,053][59242] Avg episode reward: [(0, '31.150'), (1, '33.210')] +[2023-10-09 06:32:51,360][60144] Updated weights for policy 1, policy_version 58602 (0.0008) +[2023-10-09 06:32:51,386][60143] Updated weights for policy 0, policy_version 57952 (0.0008) +[2023-10-09 06:32:51,724][60144] Updated weights for policy 1, policy_version 58612 (0.0008) +[2023-10-09 06:32:52,090][60144] Updated weights for policy 1, policy_version 58622 (0.0010) +[2023-10-09 06:32:55,405][60143] Updated weights for policy 0, policy_version 57962 (0.0009) +[2023-10-09 06:32:55,772][60143] Updated weights for policy 0, policy_version 57972 (0.0007) +[2023-10-09 06:32:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 119373824. 
Throughput: 0: 1718.6, 1: 1747.2. Samples: 29858474. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:32:56,052][59242] Avg episode reward: [(0, '31.560'), (1, '34.120')] +[2023-10-09 06:32:56,072][60144] Updated weights for policy 1, policy_version 58632 (0.0008) +[2023-10-09 06:32:56,144][60143] Updated weights for policy 0, policy_version 57982 (0.0007) +[2023-10-09 06:32:56,437][60144] Updated weights for policy 1, policy_version 58642 (0.0010) +[2023-10-09 06:32:56,798][60144] Updated weights for policy 1, policy_version 58652 (0.0010) +[2023-10-09 06:33:00,020][60143] Updated weights for policy 0, policy_version 57992 (0.0009) +[2023-10-09 06:33:00,394][60143] Updated weights for policy 0, policy_version 58002 (0.0007) +[2023-10-09 06:33:00,729][60144] Updated weights for policy 1, policy_version 58662 (0.0008) +[2023-10-09 06:33:00,752][60143] Updated weights for policy 0, policy_version 58012 (0.0008) +[2023-10-09 06:33:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 119472128. Throughput: 0: 1707.5, 1: 1733.5. Samples: 29878810. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:33:01,053][59242] Avg episode reward: [(0, '31.420'), (1, '33.440')] +[2023-10-09 06:33:01,092][60144] Updated weights for policy 1, policy_version 58672 (0.0009) +[2023-10-09 06:33:01,452][60144] Updated weights for policy 1, policy_version 58682 (0.0008) +[2023-10-09 06:33:04,733][60143] Updated weights for policy 0, policy_version 58022 (0.0008) +[2023-10-09 06:33:05,098][60143] Updated weights for policy 0, policy_version 58032 (0.0010) +[2023-10-09 06:33:05,277][60144] Updated weights for policy 1, policy_version 58692 (0.0008) +[2023-10-09 06:33:05,473][60143] Updated weights for policy 0, policy_version 58042 (0.0010) +[2023-10-09 06:33:05,642][60144] Updated weights for policy 1, policy_version 58702 (0.0009) +[2023-10-09 06:33:06,007][60144] Updated weights for policy 1, policy_version 58712 (0.0009) +[2023-10-09 06:33:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 119537664. Throughput: 0: 1727.9, 1: 1726.3. Samples: 29889112. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:33:06,052][59242] Avg episode reward: [(0, '30.520'), (1, '33.260')] +[2023-10-09 06:33:09,540][60143] Updated weights for policy 0, policy_version 58052 (0.0009) +[2023-10-09 06:33:09,899][60143] Updated weights for policy 0, policy_version 58062 (0.0009) +[2023-10-09 06:33:10,115][60144] Updated weights for policy 1, policy_version 58722 (0.0010) +[2023-10-09 06:33:10,275][60143] Updated weights for policy 0, policy_version 58072 (0.0008) +[2023-10-09 06:33:10,482][60144] Updated weights for policy 1, policy_version 58732 (0.0008) +[2023-10-09 06:33:10,850][60144] Updated weights for policy 1, policy_version 58742 (0.0007) +[2023-10-09 06:33:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 119603200. Throughput: 0: 1721.9, 1: 1731.7. Samples: 29910182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:11,053][59242] Avg episode reward: [(0, '29.450'), (1, '31.760')] +[2023-10-09 06:33:11,228][60144] Updated weights for policy 1, policy_version 58752 (0.0008) +[2023-10-09 06:33:14,268][60143] Updated weights for policy 0, policy_version 58082 (0.0008) +[2023-10-09 06:33:14,645][60143] Updated weights for policy 0, policy_version 58092 (0.0009) +[2023-10-09 06:33:15,023][60144] Updated weights for policy 1, policy_version 58762 (0.0008) +[2023-10-09 06:33:15,031][60143] Updated weights for policy 0, policy_version 58102 (0.0007) +[2023-10-09 06:33:15,391][60144] Updated weights for policy 1, policy_version 58772 (0.0008) +[2023-10-09 06:33:15,397][60143] Updated weights for policy 0, policy_version 58112 (0.0008) +[2023-10-09 06:33:15,752][60144] Updated weights for policy 1, policy_version 58782 (0.0007) +[2023-10-09 06:33:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 119701504. Throughput: 0: 1688.8, 1: 1715.5. Samples: 29929626. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:16,053][59242] Avg episode reward: [(0, '29.220'), (1, '30.410')] +[2023-10-09 06:33:19,434][60143] Updated weights for policy 0, policy_version 58122 (0.0007) +[2023-10-09 06:33:19,588][60144] Updated weights for policy 1, policy_version 58792 (0.0008) +[2023-10-09 06:33:19,804][60143] Updated weights for policy 0, policy_version 58132 (0.0007) +[2023-10-09 06:33:19,955][60144] Updated weights for policy 1, policy_version 58802 (0.0007) +[2023-10-09 06:33:20,175][60143] Updated weights for policy 0, policy_version 58142 (0.0009) +[2023-10-09 06:33:20,326][60144] Updated weights for policy 1, policy_version 58812 (0.0007) +[2023-10-09 06:33:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 119767040. Throughput: 0: 1713.2, 1: 1743.7. Samples: 29941226. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:21,053][59242] Avg episode reward: [(0, '28.710'), (1, '29.820')] +[2023-10-09 06:33:24,096][60143] Updated weights for policy 0, policy_version 58152 (0.0009) +[2023-10-09 06:33:24,160][60144] Updated weights for policy 1, policy_version 58822 (0.0008) +[2023-10-09 06:33:24,467][60143] Updated weights for policy 0, policy_version 58162 (0.0008) +[2023-10-09 06:33:24,521][60144] Updated weights for policy 1, policy_version 58832 (0.0007) +[2023-10-09 06:33:24,832][60143] Updated weights for policy 0, policy_version 58172 (0.0007) +[2023-10-09 06:33:24,892][60144] Updated weights for policy 1, policy_version 58842 (0.0007) +[2023-10-09 06:33:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 119832576. Throughput: 0: 1705.1, 1: 1732.5. Samples: 29961280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:26,053][59242] Avg episode reward: [(0, '30.280'), (1, '29.460')] +[2023-10-09 06:33:28,852][60144] Updated weights for policy 1, policy_version 58852 (0.0008) +[2023-10-09 06:33:28,977][60143] Updated weights for policy 0, policy_version 58182 (0.0009) +[2023-10-09 06:33:29,219][60144] Updated weights for policy 1, policy_version 58862 (0.0009) +[2023-10-09 06:33:29,336][60143] Updated weights for policy 0, policy_version 58192 (0.0009) +[2023-10-09 06:33:29,591][60144] Updated weights for policy 1, policy_version 58872 (0.0007) +[2023-10-09 06:33:29,710][60143] Updated weights for policy 0, policy_version 58202 (0.0009) +[2023-10-09 06:33:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 119898112. Throughput: 0: 1689.7, 1: 1719.8. Samples: 29981298. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:31,053][59242] Avg episode reward: [(0, '29.740'), (1, '27.870')] +[2023-10-09 06:33:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000058880_60293120.pth... +[2023-10-09 06:33:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000058208_59604992.pth... +[2023-10-09 06:33:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000057248_58621952.pth +[2023-10-09 06:33:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000056608_57966592.pth +[2023-10-09 06:33:33,680][60143] Updated weights for policy 0, policy_version 58212 (0.0009) +[2023-10-09 06:33:33,717][60144] Updated weights for policy 1, policy_version 58882 (0.0009) +[2023-10-09 06:33:34,052][60143] Updated weights for policy 0, policy_version 58222 (0.0008) +[2023-10-09 06:33:34,124][60144] Updated weights for policy 1, policy_version 58892 (0.0009) +[2023-10-09 06:33:34,433][60143] Updated weights for policy 0, policy_version 58232 (0.0008) +[2023-10-09 06:33:34,498][60144] Updated weights for policy 1, policy_version 58902 (0.0008) +[2023-10-09 06:33:34,857][60144] Updated weights for policy 1, policy_version 58912 (0.0008) +[2023-10-09 06:33:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 119963648. Throughput: 0: 1714.3, 1: 1746.4. Samples: 29993096. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:36,052][59242] Avg episode reward: [(0, '29.900'), (1, '29.100')] +[2023-10-09 06:33:38,413][60143] Updated weights for policy 0, policy_version 58242 (0.0008) +[2023-10-09 06:33:38,727][60144] Updated weights for policy 1, policy_version 58922 (0.0008) +[2023-10-09 06:33:38,781][60143] Updated weights for policy 0, policy_version 58252 (0.0008) +[2023-10-09 06:33:39,097][60144] Updated weights for policy 1, policy_version 58932 (0.0007) +[2023-10-09 06:33:39,156][60143] Updated weights for policy 0, policy_version 58262 (0.0009) +[2023-10-09 06:33:39,458][60144] Updated weights for policy 1, policy_version 58942 (0.0007) +[2023-10-09 06:33:39,523][60143] Updated weights for policy 0, policy_version 58272 (0.0009) +[2023-10-09 06:33:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 120029184. Throughput: 0: 1685.5, 1: 1722.2. Samples: 30011820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:41,053][59242] Avg episode reward: [(0, '30.450'), (1, '29.690')] +[2023-10-09 06:33:43,321][60144] Updated weights for policy 1, policy_version 58952 (0.0007) +[2023-10-09 06:33:43,601][60143] Updated weights for policy 0, policy_version 58282 (0.0009) +[2023-10-09 06:33:43,695][60144] Updated weights for policy 1, policy_version 58962 (0.0007) +[2023-10-09 06:33:43,951][60143] Updated weights for policy 0, policy_version 58292 (0.0008) +[2023-10-09 06:33:44,058][60144] Updated weights for policy 1, policy_version 58972 (0.0008) +[2023-10-09 06:33:44,314][60143] Updated weights for policy 0, policy_version 58302 (0.0009) +[2023-10-09 06:33:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 120094720. Throughput: 0: 1692.8, 1: 1736.1. Samples: 30033112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:33:46,053][59242] Avg episode reward: [(0, '31.750'), (1, '29.650')] +[2023-10-09 06:33:47,787][60144] Updated weights for policy 1, policy_version 58982 (0.0009) +[2023-10-09 06:33:48,146][60144] Updated weights for policy 1, policy_version 58992 (0.0008) +[2023-10-09 06:33:48,395][60143] Updated weights for policy 0, policy_version 58312 (0.0010) +[2023-10-09 06:33:48,513][60144] Updated weights for policy 1, policy_version 59002 (0.0007) +[2023-10-09 06:33:48,764][60143] Updated weights for policy 0, policy_version 58322 (0.0009) +[2023-10-09 06:33:49,133][60143] Updated weights for policy 0, policy_version 58332 (0.0009) +[2023-10-09 06:33:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 120160256. Throughput: 0: 1693.6, 1: 1739.0. Samples: 30043580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:33:51,053][59242] Avg episode reward: [(0, '33.430'), (1, '29.070')] +[2023-10-09 06:33:52,382][60144] Updated weights for policy 1, policy_version 59012 (0.0008) +[2023-10-09 06:33:52,741][60144] Updated weights for policy 1, policy_version 59022 (0.0007) +[2023-10-09 06:33:53,028][60143] Updated weights for policy 0, policy_version 58342 (0.0008) +[2023-10-09 06:33:53,103][60144] Updated weights for policy 1, policy_version 59032 (0.0008) +[2023-10-09 06:33:53,395][60143] Updated weights for policy 0, policy_version 58352 (0.0008) +[2023-10-09 06:33:53,766][60143] Updated weights for policy 0, policy_version 58362 (0.0010) +[2023-10-09 06:33:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 120225792. Throughput: 0: 1675.4, 1: 1733.8. Samples: 30063594. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:33:56,053][59242] Avg episode reward: [(0, '32.490'), (1, '30.180')] +[2023-10-09 06:33:57,040][60144] Updated weights for policy 1, policy_version 59042 (0.0008) +[2023-10-09 06:33:57,410][60144] Updated weights for policy 1, policy_version 59052 (0.0009) +[2023-10-09 06:33:57,715][60143] Updated weights for policy 0, policy_version 58372 (0.0008) +[2023-10-09 06:33:57,768][60144] Updated weights for policy 1, policy_version 59062 (0.0008) +[2023-10-09 06:33:58,086][60143] Updated weights for policy 0, policy_version 58382 (0.0008) +[2023-10-09 06:33:58,133][60144] Updated weights for policy 1, policy_version 59072 (0.0007) +[2023-10-09 06:33:58,458][60143] Updated weights for policy 0, policy_version 58392 (0.0010) +[2023-10-09 06:34:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 120291328. Throughput: 0: 1704.9, 1: 1746.0. Samples: 30084916. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:01,052][59242] Avg episode reward: [(0, '32.590'), (1, '30.720')] +[2023-10-09 06:34:02,068][60144] Updated weights for policy 1, policy_version 59082 (0.0007) +[2023-10-09 06:34:02,337][60143] Updated weights for policy 0, policy_version 58402 (0.0010) +[2023-10-09 06:34:02,435][60144] Updated weights for policy 1, policy_version 59092 (0.0008) +[2023-10-09 06:34:02,708][60143] Updated weights for policy 0, policy_version 58412 (0.0007) +[2023-10-09 06:34:02,797][60144] Updated weights for policy 1, policy_version 59102 (0.0009) +[2023-10-09 06:34:03,077][60143] Updated weights for policy 0, policy_version 58422 (0.0007) +[2023-10-09 06:34:03,447][60143] Updated weights for policy 0, policy_version 58432 (0.0008) +[2023-10-09 06:34:06,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13773.6). Total num frames: 120356864. Throughput: 0: 1678.8, 1: 1722.3. Samples: 30094276. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:06,054][59242] Avg episode reward: [(0, '33.630'), (1, '29.620')] +[2023-10-09 06:34:06,812][60144] Updated weights for policy 1, policy_version 59112 (0.0008) +[2023-10-09 06:34:07,177][60144] Updated weights for policy 1, policy_version 59122 (0.0009) +[2023-10-09 06:34:07,435][60143] Updated weights for policy 0, policy_version 58442 (0.0007) +[2023-10-09 06:34:07,541][60144] Updated weights for policy 1, policy_version 59132 (0.0008) +[2023-10-09 06:34:07,809][60143] Updated weights for policy 0, policy_version 58452 (0.0010) +[2023-10-09 06:34:08,186][60143] Updated weights for policy 0, policy_version 58462 (0.0010) +[2023-10-09 06:34:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 120422400. Throughput: 0: 1692.0, 1: 1728.5. Samples: 30115204. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:11,053][59242] Avg episode reward: [(0, '34.150'), (1, '31.080')] +[2023-10-09 06:34:11,574][60144] Updated weights for policy 1, policy_version 59142 (0.0007) +[2023-10-09 06:34:11,934][60144] Updated weights for policy 1, policy_version 59152 (0.0008) +[2023-10-09 06:34:12,302][60144] Updated weights for policy 1, policy_version 59162 (0.0008) +[2023-10-09 06:34:12,306][60143] Updated weights for policy 0, policy_version 58472 (0.0009) +[2023-10-09 06:34:12,680][60143] Updated weights for policy 0, policy_version 58482 (0.0008) +[2023-10-09 06:34:13,040][60143] Updated weights for policy 0, policy_version 58492 (0.0007) +[2023-10-09 06:34:16,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120487936. Throughput: 0: 1706.7, 1: 1734.7. Samples: 30136158. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:16,052][59242] Avg episode reward: [(0, '31.170'), (1, '29.930')] +[2023-10-09 06:34:16,268][60144] Updated weights for policy 1, policy_version 59172 (0.0008) +[2023-10-09 06:34:16,634][60144] Updated weights for policy 1, policy_version 59182 (0.0007) +[2023-10-09 06:34:17,003][60144] Updated weights for policy 1, policy_version 59192 (0.0009) +[2023-10-09 06:34:17,141][60143] Updated weights for policy 0, policy_version 58502 (0.0008) +[2023-10-09 06:34:17,521][60143] Updated weights for policy 0, policy_version 58512 (0.0007) +[2023-10-09 06:34:17,893][60143] Updated weights for policy 0, policy_version 58522 (0.0010) +[2023-10-09 06:34:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120553472. Throughput: 0: 1676.4, 1: 1707.7. Samples: 30145382. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:21,053][59242] Avg episode reward: [(0, '30.700'), (1, '30.460')] +[2023-10-09 06:34:21,089][60144] Updated weights for policy 1, policy_version 59202 (0.0008) +[2023-10-09 06:34:21,506][60144] Updated weights for policy 1, policy_version 59212 (0.0007) +[2023-10-09 06:34:21,871][60144] Updated weights for policy 1, policy_version 59222 (0.0008) +[2023-10-09 06:34:21,976][60143] Updated weights for policy 0, policy_version 58532 (0.0009) +[2023-10-09 06:34:22,244][60144] Updated weights for policy 1, policy_version 59232 (0.0009) +[2023-10-09 06:34:22,338][60143] Updated weights for policy 0, policy_version 58542 (0.0007) +[2023-10-09 06:34:22,717][60143] Updated weights for policy 0, policy_version 58552 (0.0008) +[2023-10-09 06:34:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120619008. Throughput: 0: 1702.8, 1: 1732.5. Samples: 30166412. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-09 06:34:26,053][59242] Avg episode reward: [(0, '30.630'), (1, '30.440')] +[2023-10-09 06:34:26,259][60144] Updated weights for policy 1, policy_version 59242 (0.0009) +[2023-10-09 06:34:26,627][60144] Updated weights for policy 1, policy_version 59252 (0.0007) +[2023-10-09 06:34:26,734][60143] Updated weights for policy 0, policy_version 58562 (0.0009) +[2023-10-09 06:34:26,994][60144] Updated weights for policy 1, policy_version 59262 (0.0007) +[2023-10-09 06:34:27,111][60143] Updated weights for policy 0, policy_version 58572 (0.0008) +[2023-10-09 06:34:27,477][60143] Updated weights for policy 0, policy_version 58582 (0.0009) +[2023-10-09 06:34:27,857][60143] Updated weights for policy 0, policy_version 58592 (0.0009) +[2023-10-09 06:34:31,033][60144] Updated weights for policy 1, policy_version 59272 (0.0009) +[2023-10-09 06:34:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120684544. Throughput: 0: 1708.5, 1: 1727.3. Samples: 30187722. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:31,053][59242] Avg episode reward: [(0, '31.530'), (1, '32.520')] +[2023-10-09 06:34:31,401][60144] Updated weights for policy 1, policy_version 59282 (0.0009) +[2023-10-09 06:34:31,764][60144] Updated weights for policy 1, policy_version 59292 (0.0009) +[2023-10-09 06:34:31,805][60143] Updated weights for policy 0, policy_version 58602 (0.0009) +[2023-10-09 06:34:32,167][60143] Updated weights for policy 0, policy_version 58612 (0.0007) +[2023-10-09 06:34:32,532][60143] Updated weights for policy 0, policy_version 58622 (0.0008) +[2023-10-09 06:34:35,612][60144] Updated weights for policy 1, policy_version 59302 (0.0009) +[2023-10-09 06:34:35,977][60144] Updated weights for policy 1, policy_version 59312 (0.0008) +[2023-10-09 06:34:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120750080. 
Throughput: 0: 1694.4, 1: 1720.2. Samples: 30197236. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:36,053][59242] Avg episode reward: [(0, '31.740'), (1, '31.650')] +[2023-10-09 06:34:36,339][60144] Updated weights for policy 1, policy_version 59322 (0.0008) +[2023-10-09 06:34:36,375][60143] Updated weights for policy 0, policy_version 58632 (0.0009) +[2023-10-09 06:34:36,754][60143] Updated weights for policy 0, policy_version 58642 (0.0008) +[2023-10-09 06:34:37,125][60143] Updated weights for policy 0, policy_version 58652 (0.0010) +[2023-10-09 06:34:40,409][60144] Updated weights for policy 1, policy_version 59332 (0.0007) +[2023-10-09 06:34:40,777][60144] Updated weights for policy 1, policy_version 59342 (0.0008) +[2023-10-09 06:34:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120815616. Throughput: 0: 1717.6, 1: 1724.0. Samples: 30218468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:41,053][59242] Avg episode reward: [(0, '32.950'), (1, '31.530')] +[2023-10-09 06:34:41,133][60144] Updated weights for policy 1, policy_version 59352 (0.0009) +[2023-10-09 06:34:41,272][60143] Updated weights for policy 0, policy_version 58662 (0.0008) +[2023-10-09 06:34:41,645][60143] Updated weights for policy 0, policy_version 58672 (0.0009) +[2023-10-09 06:34:42,011][60143] Updated weights for policy 0, policy_version 58682 (0.0009) +[2023-10-09 06:34:44,984][60144] Updated weights for policy 1, policy_version 59362 (0.0007) +[2023-10-09 06:34:45,350][60144] Updated weights for policy 1, policy_version 59372 (0.0007) +[2023-10-09 06:34:45,718][60144] Updated weights for policy 1, policy_version 59382 (0.0007) +[2023-10-09 06:34:45,939][60143] Updated weights for policy 0, policy_version 58692 (0.0007) +[2023-10-09 06:34:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 120881152. Throughput: 0: 1713.5, 1: 1711.1. Samples: 30239024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:46,053][59242] Avg episode reward: [(0, '33.760'), (1, '31.040')] +[2023-10-09 06:34:46,077][60144] Updated weights for policy 1, policy_version 59392 (0.0008) +[2023-10-09 06:34:46,314][60143] Updated weights for policy 0, policy_version 58702 (0.0010) +[2023-10-09 06:34:46,690][60143] Updated weights for policy 0, policy_version 58712 (0.0008) +[2023-10-09 06:34:50,004][60144] Updated weights for policy 1, policy_version 59402 (0.0010) +[2023-10-09 06:34:50,369][60144] Updated weights for policy 1, policy_version 59412 (0.0007) +[2023-10-09 06:34:50,675][60143] Updated weights for policy 0, policy_version 58722 (0.0009) +[2023-10-09 06:34:50,740][60144] Updated weights for policy 1, policy_version 59422 (0.0007) +[2023-10-09 06:34:51,050][60143] Updated weights for policy 0, policy_version 58732 (0.0008) +[2023-10-09 06:34:51,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 120979456. Throughput: 0: 1711.1, 1: 1729.5. Samples: 30249100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:51,053][59242] Avg episode reward: [(0, '33.140'), (1, '31.070')] +[2023-10-09 06:34:51,423][60143] Updated weights for policy 0, policy_version 58742 (0.0009) +[2023-10-09 06:34:51,787][60143] Updated weights for policy 0, policy_version 58752 (0.0008) +[2023-10-09 06:34:54,462][60144] Updated weights for policy 1, policy_version 59432 (0.0008) +[2023-10-09 06:34:54,830][60144] Updated weights for policy 1, policy_version 59442 (0.0008) +[2023-10-09 06:34:55,202][60144] Updated weights for policy 1, policy_version 59452 (0.0008) +[2023-10-09 06:34:55,501][60143] Updated weights for policy 0, policy_version 58762 (0.0009) +[2023-10-09 06:34:55,870][60143] Updated weights for policy 0, policy_version 58772 (0.0010) +[2023-10-09 06:34:56,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 121044992. 
Throughput: 0: 1718.1, 1: 1726.1. Samples: 30270192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:34:56,052][59242] Avg episode reward: [(0, '32.960'), (1, '30.970')] +[2023-10-09 06:34:56,225][60143] Updated weights for policy 0, policy_version 58782 (0.0008) +[2023-10-09 06:34:59,066][60144] Updated weights for policy 1, policy_version 59462 (0.0008) +[2023-10-09 06:34:59,428][60144] Updated weights for policy 1, policy_version 59472 (0.0008) +[2023-10-09 06:34:59,792][60144] Updated weights for policy 1, policy_version 59482 (0.0007) +[2023-10-09 06:35:00,124][60143] Updated weights for policy 0, policy_version 58792 (0.0009) +[2023-10-09 06:35:00,490][60143] Updated weights for policy 0, policy_version 58802 (0.0008) +[2023-10-09 06:35:00,863][60143] Updated weights for policy 0, policy_version 58812 (0.0008) +[2023-10-09 06:35:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 121143296. Throughput: 0: 1707.3, 1: 1711.3. Samples: 30289996. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:35:01,053][59242] Avg episode reward: [(0, '32.190'), (1, '31.330')] +[2023-10-09 06:35:03,864][60144] Updated weights for policy 1, policy_version 59492 (0.0007) +[2023-10-09 06:35:04,238][60144] Updated weights for policy 1, policy_version 59502 (0.0009) +[2023-10-09 06:35:04,600][60144] Updated weights for policy 1, policy_version 59512 (0.0007) +[2023-10-09 06:35:05,005][60143] Updated weights for policy 0, policy_version 58822 (0.0007) +[2023-10-09 06:35:05,389][60143] Updated weights for policy 0, policy_version 58832 (0.0010) +[2023-10-09 06:35:05,751][60143] Updated weights for policy 0, policy_version 58842 (0.0010) +[2023-10-09 06:35:06,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121208832. Throughput: 0: 1722.2, 1: 1739.3. Samples: 30301150. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:06,053][59242] Avg episode reward: [(0, '30.150'), (1, '32.030')] +[2023-10-09 06:35:08,675][60144] Updated weights for policy 1, policy_version 59522 (0.0008) +[2023-10-09 06:35:09,080][60144] Updated weights for policy 1, policy_version 59532 (0.0009) +[2023-10-09 06:35:09,455][60144] Updated weights for policy 1, policy_version 59542 (0.0008) +[2023-10-09 06:35:09,804][60143] Updated weights for policy 0, policy_version 58852 (0.0008) +[2023-10-09 06:35:09,820][60144] Updated weights for policy 1, policy_version 59552 (0.0008) +[2023-10-09 06:35:10,173][60143] Updated weights for policy 0, policy_version 58862 (0.0008) +[2023-10-09 06:35:10,542][60143] Updated weights for policy 0, policy_version 58872 (0.0008) +[2023-10-09 06:35:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121274368. Throughput: 0: 1722.8, 1: 1713.3. Samples: 30321036. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:11,053][59242] Avg episode reward: [(0, '29.130'), (1, '32.920')] +[2023-10-09 06:35:13,765][60144] Updated weights for policy 1, policy_version 59562 (0.0007) +[2023-10-09 06:35:14,128][60144] Updated weights for policy 1, policy_version 59572 (0.0011) +[2023-10-09 06:35:14,499][60144] Updated weights for policy 1, policy_version 59582 (0.0010) +[2023-10-09 06:35:14,606][60143] Updated weights for policy 0, policy_version 58882 (0.0008) +[2023-10-09 06:35:14,986][60143] Updated weights for policy 0, policy_version 58892 (0.0009) +[2023-10-09 06:35:15,364][60143] Updated weights for policy 0, policy_version 58902 (0.0008) +[2023-10-09 06:35:15,735][60143] Updated weights for policy 0, policy_version 58912 (0.0009) +[2023-10-09 06:35:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 121339904. Throughput: 0: 1702.3, 1: 1699.1. Samples: 30340784. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:16,053][59242] Avg episode reward: [(0, '28.090'), (1, '31.630')] +[2023-10-09 06:35:18,479][60144] Updated weights for policy 1, policy_version 59592 (0.0010) +[2023-10-09 06:35:18,847][60144] Updated weights for policy 1, policy_version 59602 (0.0010) +[2023-10-09 06:35:19,218][60144] Updated weights for policy 1, policy_version 59612 (0.0007) +[2023-10-09 06:35:19,792][60143] Updated weights for policy 0, policy_version 58922 (0.0007) +[2023-10-09 06:35:20,163][60143] Updated weights for policy 0, policy_version 58932 (0.0007) +[2023-10-09 06:35:20,530][60143] Updated weights for policy 0, policy_version 58942 (0.0010) +[2023-10-09 06:35:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121405440. Throughput: 0: 1716.1, 1: 1723.2. Samples: 30352004. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:21,053][59242] Avg episode reward: [(0, '28.840'), (1, '32.510')] +[2023-10-09 06:35:23,056][60144] Updated weights for policy 1, policy_version 59622 (0.0008) +[2023-10-09 06:35:23,424][60144] Updated weights for policy 1, policy_version 59632 (0.0009) +[2023-10-09 06:35:23,791][60144] Updated weights for policy 1, policy_version 59642 (0.0009) +[2023-10-09 06:35:24,605][60143] Updated weights for policy 0, policy_version 58952 (0.0007) +[2023-10-09 06:35:24,970][60143] Updated weights for policy 0, policy_version 58962 (0.0010) +[2023-10-09 06:35:25,336][60143] Updated weights for policy 0, policy_version 58972 (0.0009) +[2023-10-09 06:35:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121470976. Throughput: 0: 1708.4, 1: 1707.2. Samples: 30372170. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:26,053][59242] Avg episode reward: [(0, '30.710'), (1, '31.380')] +[2023-10-09 06:35:27,766][60144] Updated weights for policy 1, policy_version 59652 (0.0008) +[2023-10-09 06:35:28,131][60144] Updated weights for policy 1, policy_version 59662 (0.0009) +[2023-10-09 06:35:28,498][60144] Updated weights for policy 1, policy_version 59672 (0.0010) +[2023-10-09 06:35:29,212][60143] Updated weights for policy 0, policy_version 58982 (0.0009) +[2023-10-09 06:35:29,581][60143] Updated weights for policy 0, policy_version 58992 (0.0008) +[2023-10-09 06:35:29,954][60143] Updated weights for policy 0, policy_version 59002 (0.0009) +[2023-10-09 06:35:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 121536512. Throughput: 0: 1688.3, 1: 1724.8. Samples: 30392612. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:31,053][59242] Avg episode reward: [(0, '30.300'), (1, '30.730')] +[2023-10-09 06:35:31,067][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000059680_61112320.pth... +[2023-10-09 06:35:31,067][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000059008_60424192.pth... 
+[2023-10-09 06:35:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000058080_59473920.pth +[2023-10-09 06:35:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000057408_58785792.pth +[2023-10-09 06:35:32,400][60144] Updated weights for policy 1, policy_version 59682 (0.0009) +[2023-10-09 06:35:32,773][60144] Updated weights for policy 1, policy_version 59692 (0.0009) +[2023-10-09 06:35:33,139][60144] Updated weights for policy 1, policy_version 59702 (0.0008) +[2023-10-09 06:35:33,505][60144] Updated weights for policy 1, policy_version 59712 (0.0009) +[2023-10-09 06:35:33,820][60143] Updated weights for policy 0, policy_version 59012 (0.0010) +[2023-10-09 06:35:34,180][60143] Updated weights for policy 0, policy_version 59022 (0.0008) +[2023-10-09 06:35:34,556][60143] Updated weights for policy 0, policy_version 59032 (0.0008) +[2023-10-09 06:35:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121602048. Throughput: 0: 1718.9, 1: 1703.4. Samples: 30403104. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:36,052][59242] Avg episode reward: [(0, '29.910'), (1, '29.340')] +[2023-10-09 06:35:37,506][60144] Updated weights for policy 1, policy_version 59722 (0.0008) +[2023-10-09 06:35:37,876][60144] Updated weights for policy 1, policy_version 59732 (0.0008) +[2023-10-09 06:35:38,241][60144] Updated weights for policy 1, policy_version 59742 (0.0007) +[2023-10-09 06:35:38,529][60143] Updated weights for policy 0, policy_version 59042 (0.0010) +[2023-10-09 06:35:38,901][60143] Updated weights for policy 0, policy_version 59052 (0.0008) +[2023-10-09 06:35:39,272][60143] Updated weights for policy 0, policy_version 59062 (0.0007) +[2023-10-09 06:35:39,641][60143] Updated weights for policy 0, policy_version 59072 (0.0007) +[2023-10-09 06:35:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13773.7). 
Total num frames: 121667584. Throughput: 0: 1687.0, 1: 1710.9. Samples: 30423098. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-09 06:35:41,053][59242] Avg episode reward: [(0, '31.690'), (1, '29.160')] +[2023-10-09 06:35:42,121][60144] Updated weights for policy 1, policy_version 59752 (0.0010) +[2023-10-09 06:35:42,492][60144] Updated weights for policy 1, policy_version 59762 (0.0007) +[2023-10-09 06:35:42,850][60144] Updated weights for policy 1, policy_version 59772 (0.0008) +[2023-10-09 06:35:43,502][60143] Updated weights for policy 0, policy_version 59082 (0.0010) +[2023-10-09 06:35:43,873][60143] Updated weights for policy 0, policy_version 59092 (0.0009) +[2023-10-09 06:35:44,242][60143] Updated weights for policy 0, policy_version 59102 (0.0008) +[2023-10-09 06:35:46,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 121733120. Throughput: 0: 1695.8, 1: 1732.0. Samples: 30444246. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:35:46,053][59242] Avg episode reward: [(0, '31.090'), (1, '28.640')] +[2023-10-09 06:35:46,854][60144] Updated weights for policy 1, policy_version 59782 (0.0010) +[2023-10-09 06:35:47,217][60144] Updated weights for policy 1, policy_version 59792 (0.0008) +[2023-10-09 06:35:47,589][60144] Updated weights for policy 1, policy_version 59802 (0.0009) +[2023-10-09 06:35:48,225][60143] Updated weights for policy 0, policy_version 59112 (0.0008) +[2023-10-09 06:35:48,593][60143] Updated weights for policy 0, policy_version 59122 (0.0009) +[2023-10-09 06:35:48,962][60143] Updated weights for policy 0, policy_version 59132 (0.0009) +[2023-10-09 06:35:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 121798656. Throughput: 0: 1699.4, 1: 1706.0. Samples: 30454396. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:35:51,053][59242] Avg episode reward: [(0, '31.700'), (1, '29.070')] +[2023-10-09 06:35:51,546][60144] Updated weights for policy 1, policy_version 59812 (0.0010) +[2023-10-09 06:35:51,905][60144] Updated weights for policy 1, policy_version 59822 (0.0008) +[2023-10-09 06:35:52,279][60144] Updated weights for policy 1, policy_version 59832 (0.0007) +[2023-10-09 06:35:53,187][60143] Updated weights for policy 0, policy_version 59142 (0.0008) +[2023-10-09 06:35:53,554][60143] Updated weights for policy 0, policy_version 59152 (0.0009) +[2023-10-09 06:35:53,917][60143] Updated weights for policy 0, policy_version 59162 (0.0008) +[2023-10-09 06:35:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 121864192. Throughput: 0: 1681.5, 1: 1734.8. Samples: 30474768. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:35:56,053][59242] Avg episode reward: [(0, '30.820'), (1, '28.040')] +[2023-10-09 06:35:56,365][60144] Updated weights for policy 1, policy_version 59842 (0.0007) +[2023-10-09 06:35:56,783][60144] Updated weights for policy 1, policy_version 59852 (0.0007) +[2023-10-09 06:35:57,145][60144] Updated weights for policy 1, policy_version 59862 (0.0007) +[2023-10-09 06:35:57,519][60144] Updated weights for policy 1, policy_version 59872 (0.0008) +[2023-10-09 06:35:57,798][60143] Updated weights for policy 0, policy_version 59172 (0.0009) +[2023-10-09 06:35:58,168][60143] Updated weights for policy 0, policy_version 59182 (0.0008) +[2023-10-09 06:35:58,537][60143] Updated weights for policy 0, policy_version 59192 (0.0007) +[2023-10-09 06:36:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13773.7). Total num frames: 121929728. Throughput: 0: 1702.8, 1: 1742.1. Samples: 30495804. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:36:01,053][59242] Avg episode reward: [(0, '29.910'), (1, '29.800')] +[2023-10-09 06:36:01,374][60144] Updated weights for policy 1, policy_version 59882 (0.0009) +[2023-10-09 06:36:01,745][60144] Updated weights for policy 1, policy_version 59892 (0.0009) +[2023-10-09 06:36:02,106][60144] Updated weights for policy 1, policy_version 59902 (0.0008) +[2023-10-09 06:36:02,659][60143] Updated weights for policy 0, policy_version 59202 (0.0008) +[2023-10-09 06:36:03,025][60143] Updated weights for policy 0, policy_version 59212 (0.0011) +[2023-10-09 06:36:03,401][60143] Updated weights for policy 0, policy_version 59222 (0.0010) +[2023-10-09 06:36:03,770][60143] Updated weights for policy 0, policy_version 59232 (0.0009) +[2023-10-09 06:36:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 121995264. Throughput: 0: 1689.7, 1: 1718.8. Samples: 30505384. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:36:06,052][59242] Avg episode reward: [(0, '29.550'), (1, '30.090')] +[2023-10-09 06:36:06,128][60144] Updated weights for policy 1, policy_version 59912 (0.0011) +[2023-10-09 06:36:06,502][60144] Updated weights for policy 1, policy_version 59922 (0.0010) +[2023-10-09 06:36:06,864][60144] Updated weights for policy 1, policy_version 59932 (0.0010) +[2023-10-09 06:36:07,766][60143] Updated weights for policy 0, policy_version 59242 (0.0010) +[2023-10-09 06:36:08,144][60143] Updated weights for policy 0, policy_version 59252 (0.0008) +[2023-10-09 06:36:08,507][60143] Updated weights for policy 0, policy_version 59262 (0.0007) +[2023-10-09 06:36:10,796][60144] Updated weights for policy 1, policy_version 59942 (0.0010) +[2023-10-09 06:36:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 122060800. Throughput: 0: 1686.5, 1: 1732.9. Samples: 30526046. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:36:11,053][59242] Avg episode reward: [(0, '29.310'), (1, '30.200')] +[2023-10-09 06:36:11,166][60144] Updated weights for policy 1, policy_version 59952 (0.0009) +[2023-10-09 06:36:11,526][60144] Updated weights for policy 1, policy_version 59962 (0.0010) +[2023-10-09 06:36:12,397][60143] Updated weights for policy 0, policy_version 59272 (0.0010) +[2023-10-09 06:36:12,770][60143] Updated weights for policy 0, policy_version 59282 (0.0008) +[2023-10-09 06:36:13,146][60143] Updated weights for policy 0, policy_version 59292 (0.0008) +[2023-10-09 06:36:15,379][60144] Updated weights for policy 1, policy_version 59972 (0.0009) +[2023-10-09 06:36:15,745][60144] Updated weights for policy 1, policy_version 59982 (0.0008) +[2023-10-09 06:36:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 122126336. Throughput: 0: 1709.5, 1: 1719.5. Samples: 30546916. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:36:16,053][59242] Avg episode reward: [(0, '29.510'), (1, '32.140')] +[2023-10-09 06:36:16,109][60144] Updated weights for policy 1, policy_version 59992 (0.0007) +[2023-10-09 06:36:17,116][60143] Updated weights for policy 0, policy_version 59302 (0.0008) +[2023-10-09 06:36:17,478][60143] Updated weights for policy 0, policy_version 59312 (0.0007) +[2023-10-09 06:36:17,853][60143] Updated weights for policy 0, policy_version 59322 (0.0008) +[2023-10-09 06:36:20,100][60144] Updated weights for policy 1, policy_version 60002 (0.0009) +[2023-10-09 06:36:20,468][60144] Updated weights for policy 1, policy_version 60012 (0.0008) +[2023-10-09 06:36:20,826][60144] Updated weights for policy 1, policy_version 60022 (0.0010) +[2023-10-09 06:36:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 122191872. Throughput: 0: 1682.0, 1: 1727.8. Samples: 30556544. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 06:36:21,052][59242] Avg episode reward: [(0, '29.030'), (1, '31.100')] +[2023-10-09 06:36:21,195][60144] Updated weights for policy 1, policy_version 60032 (0.0009) +[2023-10-09 06:36:21,993][60143] Updated weights for policy 0, policy_version 59332 (0.0009) +[2023-10-09 06:36:22,373][60143] Updated weights for policy 0, policy_version 59342 (0.0009) +[2023-10-09 06:36:22,744][60143] Updated weights for policy 0, policy_version 59352 (0.0011) +[2023-10-09 06:36:25,142][60144] Updated weights for policy 1, policy_version 60042 (0.0011) +[2023-10-09 06:36:25,512][60144] Updated weights for policy 1, policy_version 60052 (0.0010) +[2023-10-09 06:36:25,886][60144] Updated weights for policy 1, policy_version 60062 (0.0008) +[2023-10-09 06:36:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 122290176. Throughput: 0: 1708.3, 1: 1728.9. Samples: 30577770. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:26,053][59242] Avg episode reward: [(0, '29.880'), (1, '32.260')] +[2023-10-09 06:36:26,657][60143] Updated weights for policy 0, policy_version 59362 (0.0008) +[2023-10-09 06:36:27,034][60143] Updated weights for policy 0, policy_version 59372 (0.0008) +[2023-10-09 06:36:27,396][60143] Updated weights for policy 0, policy_version 59382 (0.0009) +[2023-10-09 06:36:27,761][60143] Updated weights for policy 0, policy_version 59392 (0.0010) +[2023-10-09 06:36:29,818][60144] Updated weights for policy 1, policy_version 60072 (0.0009) +[2023-10-09 06:36:30,185][60144] Updated weights for policy 1, policy_version 60082 (0.0008) +[2023-10-09 06:36:30,560][60144] Updated weights for policy 1, policy_version 60092 (0.0007) +[2023-10-09 06:36:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 122355712. Throughput: 0: 1714.9, 1: 1701.7. Samples: 30597992. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:31,053][59242] Avg episode reward: [(0, '30.500'), (1, '31.390')] +[2023-10-09 06:36:32,070][60143] Updated weights for policy 0, policy_version 59402 (0.0009) +[2023-10-09 06:36:32,439][60143] Updated weights for policy 0, policy_version 59412 (0.0009) +[2023-10-09 06:36:32,807][60143] Updated weights for policy 0, policy_version 59422 (0.0009) +[2023-10-09 06:36:34,425][60144] Updated weights for policy 1, policy_version 60102 (0.0008) +[2023-10-09 06:36:34,790][60144] Updated weights for policy 1, policy_version 60112 (0.0009) +[2023-10-09 06:36:35,156][60144] Updated weights for policy 1, policy_version 60122 (0.0009) +[2023-10-09 06:36:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 122421248. Throughput: 0: 1694.4, 1: 1729.9. Samples: 30608488. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:36,053][59242] Avg episode reward: [(0, '30.810'), (1, '31.030')] +[2023-10-09 06:36:36,695][60143] Updated weights for policy 0, policy_version 59432 (0.0009) +[2023-10-09 06:36:37,068][60143] Updated weights for policy 0, policy_version 59442 (0.0008) +[2023-10-09 06:36:37,438][60143] Updated weights for policy 0, policy_version 59452 (0.0007) +[2023-10-09 06:36:39,234][60144] Updated weights for policy 1, policy_version 60132 (0.0010) +[2023-10-09 06:36:39,606][60144] Updated weights for policy 1, policy_version 60142 (0.0008) +[2023-10-09 06:36:39,971][60144] Updated weights for policy 1, policy_version 60152 (0.0009) +[2023-10-09 06:36:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 122486784. Throughput: 0: 1716.6, 1: 1717.8. Samples: 30629314. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:41,052][59242] Avg episode reward: [(0, '31.700'), (1, '31.240')] +[2023-10-09 06:36:41,506][60143] Updated weights for policy 0, policy_version 59462 (0.0007) +[2023-10-09 06:36:41,889][60143] Updated weights for policy 0, policy_version 59472 (0.0009) +[2023-10-09 06:36:42,263][60143] Updated weights for policy 0, policy_version 59482 (0.0007) +[2023-10-09 06:36:43,858][60144] Updated weights for policy 1, policy_version 60162 (0.0008) +[2023-10-09 06:36:44,277][60144] Updated weights for policy 1, policy_version 60172 (0.0009) +[2023-10-09 06:36:44,652][60144] Updated weights for policy 1, policy_version 60182 (0.0009) +[2023-10-09 06:36:45,012][60144] Updated weights for policy 1, policy_version 60192 (0.0008) +[2023-10-09 06:36:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 122552320. Throughput: 0: 1712.4, 1: 1706.8. Samples: 30649664. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:46,053][59242] Avg episode reward: [(0, '31.880'), (1, '31.930')] +[2023-10-09 06:36:46,234][60143] Updated weights for policy 0, policy_version 59492 (0.0008) +[2023-10-09 06:36:46,604][60143] Updated weights for policy 0, policy_version 59502 (0.0007) +[2023-10-09 06:36:46,984][60143] Updated weights for policy 0, policy_version 59512 (0.0010) +[2023-10-09 06:36:48,896][60144] Updated weights for policy 1, policy_version 60202 (0.0009) +[2023-10-09 06:36:49,259][60144] Updated weights for policy 1, policy_version 60212 (0.0009) +[2023-10-09 06:36:49,633][60144] Updated weights for policy 1, policy_version 60222 (0.0011) +[2023-10-09 06:36:50,880][60143] Updated weights for policy 0, policy_version 59522 (0.0008) +[2023-10-09 06:36:51,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 122617856. Throughput: 0: 1706.1, 1: 1736.5. Samples: 30660302. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:51,053][59242] Avg episode reward: [(0, '30.590'), (1, '31.320')] +[2023-10-09 06:36:51,258][60143] Updated weights for policy 0, policy_version 59532 (0.0010) +[2023-10-09 06:36:51,628][60143] Updated weights for policy 0, policy_version 59542 (0.0007) +[2023-10-09 06:36:52,002][60143] Updated weights for policy 0, policy_version 59552 (0.0007) +[2023-10-09 06:36:53,647][60144] Updated weights for policy 1, policy_version 60232 (0.0010) +[2023-10-09 06:36:54,013][60144] Updated weights for policy 1, policy_version 60242 (0.0009) +[2023-10-09 06:36:54,390][60144] Updated weights for policy 1, policy_version 60252 (0.0008) +[2023-10-09 06:36:55,837][60143] Updated weights for policy 0, policy_version 59562 (0.0009) +[2023-10-09 06:36:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 122683392. Throughput: 0: 1718.0, 1: 1711.9. Samples: 30680390. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:36:56,053][59242] Avg episode reward: [(0, '30.660'), (1, '31.760')] +[2023-10-09 06:36:56,207][60143] Updated weights for policy 0, policy_version 59572 (0.0008) +[2023-10-09 06:36:56,576][60143] Updated weights for policy 0, policy_version 59582 (0.0007) +[2023-10-09 06:36:58,191][60144] Updated weights for policy 1, policy_version 60262 (0.0008) +[2023-10-09 06:36:58,546][60144] Updated weights for policy 1, policy_version 60272 (0.0007) +[2023-10-09 06:36:58,910][60144] Updated weights for policy 1, policy_version 60282 (0.0009) +[2023-10-09 06:37:00,512][60143] Updated weights for policy 0, policy_version 59592 (0.0008) +[2023-10-09 06:37:00,883][60143] Updated weights for policy 0, policy_version 59602 (0.0009) +[2023-10-09 06:37:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 122748928. Throughput: 0: 1711.5, 1: 1726.0. Samples: 30701604. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-09 06:37:01,053][59242] Avg episode reward: [(0, '30.730'), (1, '31.320')] +[2023-10-09 06:37:01,258][60143] Updated weights for policy 0, policy_version 59612 (0.0007) +[2023-10-09 06:37:02,746][60144] Updated weights for policy 1, policy_version 60292 (0.0007) +[2023-10-09 06:37:03,100][60144] Updated weights for policy 1, policy_version 60302 (0.0009) +[2023-10-09 06:37:03,465][60144] Updated weights for policy 1, policy_version 60312 (0.0009) +[2023-10-09 06:37:05,287][60143] Updated weights for policy 0, policy_version 59622 (0.0007) +[2023-10-09 06:37:05,662][60143] Updated weights for policy 0, policy_version 59632 (0.0009) +[2023-10-09 06:37:06,034][60143] Updated weights for policy 0, policy_version 59642 (0.0007) +[2023-10-09 06:37:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 122814464. Throughput: 0: 1717.8, 1: 1725.3. Samples: 30711486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:06,053][59242] Avg episode reward: [(0, '30.220'), (1, '32.340')] +[2023-10-09 06:37:07,397][60144] Updated weights for policy 1, policy_version 60322 (0.0009) +[2023-10-09 06:37:07,765][60144] Updated weights for policy 1, policy_version 60332 (0.0007) +[2023-10-09 06:37:08,133][60144] Updated weights for policy 1, policy_version 60342 (0.0007) +[2023-10-09 06:37:08,499][60144] Updated weights for policy 1, policy_version 60352 (0.0010) +[2023-10-09 06:37:10,055][60143] Updated weights for policy 0, policy_version 59652 (0.0009) +[2023-10-09 06:37:10,426][60143] Updated weights for policy 0, policy_version 59662 (0.0010) +[2023-10-09 06:37:10,797][60143] Updated weights for policy 0, policy_version 59672 (0.0010) +[2023-10-09 06:37:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 122880000. Throughput: 0: 1718.5, 1: 1723.3. Samples: 30732650. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:11,052][59242] Avg episode reward: [(0, '31.170'), (1, '33.500')] +[2023-10-09 06:37:12,361][60144] Updated weights for policy 1, policy_version 60362 (0.0009) +[2023-10-09 06:37:12,736][60144] Updated weights for policy 1, policy_version 60372 (0.0009) +[2023-10-09 06:37:13,102][60144] Updated weights for policy 1, policy_version 60382 (0.0007) +[2023-10-09 06:37:14,741][60143] Updated weights for policy 0, policy_version 59682 (0.0010) +[2023-10-09 06:37:15,117][60143] Updated weights for policy 0, policy_version 59692 (0.0007) +[2023-10-09 06:37:15,490][60143] Updated weights for policy 0, policy_version 59702 (0.0007) +[2023-10-09 06:37:15,858][60143] Updated weights for policy 0, policy_version 59712 (0.0010) +[2023-10-09 06:37:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 122978304. Throughput: 0: 1699.7, 1: 1746.0. Samples: 30753052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:16,053][59242] Avg episode reward: [(0, '31.940'), (1, '34.630')] +[2023-10-09 06:37:17,213][60144] Updated weights for policy 1, policy_version 60392 (0.0008) +[2023-10-09 06:37:17,579][60144] Updated weights for policy 1, policy_version 60402 (0.0008) +[2023-10-09 06:37:17,949][60144] Updated weights for policy 1, policy_version 60412 (0.0009) +[2023-10-09 06:37:19,624][60143] Updated weights for policy 0, policy_version 59722 (0.0010) +[2023-10-09 06:37:19,992][60143] Updated weights for policy 0, policy_version 59732 (0.0008) +[2023-10-09 06:37:20,361][60143] Updated weights for policy 0, policy_version 59742 (0.0007) +[2023-10-09 06:37:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 123043840. Throughput: 0: 1729.6, 1: 1714.8. Samples: 30763486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:21,052][59242] Avg episode reward: [(0, '31.830'), (1, '34.690')] +[2023-10-09 06:37:21,975][60144] Updated weights for policy 1, policy_version 60422 (0.0008) +[2023-10-09 06:37:22,348][60144] Updated weights for policy 1, policy_version 60432 (0.0010) +[2023-10-09 06:37:22,715][60144] Updated weights for policy 1, policy_version 60442 (0.0009) +[2023-10-09 06:37:24,305][60143] Updated weights for policy 0, policy_version 59752 (0.0009) +[2023-10-09 06:37:24,679][60143] Updated weights for policy 0, policy_version 59762 (0.0008) +[2023-10-09 06:37:25,052][60143] Updated weights for policy 0, policy_version 59772 (0.0007) +[2023-10-09 06:37:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 123109376. Throughput: 0: 1717.0, 1: 1730.2. Samples: 30784436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:26,053][59242] Avg episode reward: [(0, '33.170'), (1, '35.510')] +[2023-10-09 06:37:26,536][60144] Updated weights for policy 1, policy_version 60452 (0.0007) +[2023-10-09 06:37:26,903][60144] Updated weights for policy 1, policy_version 60462 (0.0007) +[2023-10-09 06:37:27,262][60144] Updated weights for policy 1, policy_version 60472 (0.0009) +[2023-10-09 06:37:29,044][60143] Updated weights for policy 0, policy_version 59782 (0.0008) +[2023-10-09 06:37:29,423][60143] Updated weights for policy 0, policy_version 59792 (0.0007) +[2023-10-09 06:37:29,790][60143] Updated weights for policy 0, policy_version 59802 (0.0007) +[2023-10-09 06:37:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 123174912. Throughput: 0: 1703.3, 1: 1747.9. Samples: 30804970. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:31,053][59242] Avg episode reward: [(0, '31.720'), (1, '34.920')] +[2023-10-09 06:37:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000059808_61243392.pth... +[2023-10-09 06:37:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000060480_61931520.pth... +[2023-10-09 06:37:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000058208_59604992.pth +[2023-10-09 06:37:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000058880_60293120.pth +[2023-10-09 06:37:31,320][60144] Updated weights for policy 1, policy_version 60482 (0.0009) +[2023-10-09 06:37:31,719][60144] Updated weights for policy 1, policy_version 60492 (0.0008) +[2023-10-09 06:37:32,087][60144] Updated weights for policy 1, policy_version 60502 (0.0007) +[2023-10-09 06:37:32,461][60144] Updated weights for policy 1, policy_version 60512 (0.0009) +[2023-10-09 06:37:33,758][60143] Updated weights for policy 0, policy_version 59812 (0.0008) +[2023-10-09 06:37:34,128][60143] Updated weights for policy 0, policy_version 59822 (0.0010) +[2023-10-09 06:37:34,504][60143] Updated weights for policy 0, policy_version 59832 (0.0008) +[2023-10-09 06:37:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 123240448. Throughput: 0: 1732.7, 1: 1713.9. Samples: 30815398. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:37:36,052][59242] Avg episode reward: [(0, '30.670'), (1, '34.130')] +[2023-10-09 06:37:36,214][60144] Updated weights for policy 1, policy_version 60522 (0.0009) +[2023-10-09 06:37:36,588][60144] Updated weights for policy 1, policy_version 60532 (0.0009) +[2023-10-09 06:37:36,953][60144] Updated weights for policy 1, policy_version 60542 (0.0008) +[2023-10-09 06:37:38,630][60143] Updated weights for policy 0, policy_version 59842 (0.0007) +[2023-10-09 06:37:39,005][60143] Updated weights for policy 0, policy_version 59852 (0.0009) +[2023-10-09 06:37:39,375][60143] Updated weights for policy 0, policy_version 59862 (0.0009) +[2023-10-09 06:37:39,738][60143] Updated weights for policy 0, policy_version 59872 (0.0010) +[2023-10-09 06:37:40,942][60144] Updated weights for policy 1, policy_version 60552 (0.0007) +[2023-10-09 06:37:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 123305984. Throughput: 0: 1701.2, 1: 1747.6. Samples: 30835584. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:37:41,053][59242] Avg episode reward: [(0, '31.640'), (1, '34.480')] +[2023-10-09 06:37:41,320][60144] Updated weights for policy 1, policy_version 60562 (0.0007) +[2023-10-09 06:37:41,687][60144] Updated weights for policy 1, policy_version 60572 (0.0008) +[2023-10-09 06:37:43,779][60143] Updated weights for policy 0, policy_version 59882 (0.0008) +[2023-10-09 06:37:44,152][60143] Updated weights for policy 0, policy_version 59892 (0.0008) +[2023-10-09 06:37:44,523][60143] Updated weights for policy 0, policy_version 59902 (0.0007) +[2023-10-09 06:37:45,550][60144] Updated weights for policy 1, policy_version 60582 (0.0009) +[2023-10-09 06:37:45,926][60144] Updated weights for policy 1, policy_version 60592 (0.0009) +[2023-10-09 06:37:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 123371520. 
Throughput: 0: 1694.5, 1: 1738.1. Samples: 30856066. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:37:46,053][59242] Avg episode reward: [(0, '32.600'), (1, '33.730')] +[2023-10-09 06:37:46,289][60144] Updated weights for policy 1, policy_version 60602 (0.0009) +[2023-10-09 06:37:48,627][60143] Updated weights for policy 0, policy_version 59912 (0.0008) +[2023-10-09 06:37:48,994][60143] Updated weights for policy 0, policy_version 59922 (0.0009) +[2023-10-09 06:37:49,364][60143] Updated weights for policy 0, policy_version 59932 (0.0007) +[2023-10-09 06:37:50,116][60144] Updated weights for policy 1, policy_version 60612 (0.0008) +[2023-10-09 06:37:50,485][60144] Updated weights for policy 1, policy_version 60622 (0.0009) +[2023-10-09 06:37:50,849][60144] Updated weights for policy 1, policy_version 60632 (0.0008) +[2023-10-09 06:37:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 123437056. Throughput: 0: 1710.1, 1: 1740.6. Samples: 30866770. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:37:51,052][59242] Avg episode reward: [(0, '32.520'), (1, '32.260')] +[2023-10-09 06:37:53,295][60143] Updated weights for policy 0, policy_version 59942 (0.0009) +[2023-10-09 06:37:53,667][60143] Updated weights for policy 0, policy_version 59952 (0.0008) +[2023-10-09 06:37:54,028][60143] Updated weights for policy 0, policy_version 59962 (0.0010) +[2023-10-09 06:37:54,788][60144] Updated weights for policy 1, policy_version 60642 (0.0009) +[2023-10-09 06:37:55,150][60144] Updated weights for policy 1, policy_version 60652 (0.0009) +[2023-10-09 06:37:55,505][60144] Updated weights for policy 1, policy_version 60662 (0.0009) +[2023-10-09 06:37:55,880][60144] Updated weights for policy 1, policy_version 60672 (0.0008) +[2023-10-09 06:37:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 123535360. Throughput: 0: 1689.2, 1: 1744.1. Samples: 30887148. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:37:56,053][59242] Avg episode reward: [(0, '31.500'), (1, '31.380')] +[2023-10-09 06:37:58,072][60143] Updated weights for policy 0, policy_version 59972 (0.0011) +[2023-10-09 06:37:58,447][60143] Updated weights for policy 0, policy_version 59982 (0.0011) +[2023-10-09 06:37:58,828][60143] Updated weights for policy 0, policy_version 59992 (0.0010) +[2023-10-09 06:37:59,856][60144] Updated weights for policy 1, policy_version 60682 (0.0007) +[2023-10-09 06:38:00,234][60144] Updated weights for policy 1, policy_version 60692 (0.0007) +[2023-10-09 06:38:00,605][60144] Updated weights for policy 1, policy_version 60702 (0.0009) +[2023-10-09 06:38:01,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 123600896. Throughput: 0: 1706.7, 1: 1721.0. Samples: 30907300. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:38:01,053][59242] Avg episode reward: [(0, '32.140'), (1, '31.560')] +[2023-10-09 06:38:02,677][60143] Updated weights for policy 0, policy_version 60002 (0.0007) +[2023-10-09 06:38:03,048][60143] Updated weights for policy 0, policy_version 60012 (0.0008) +[2023-10-09 06:38:03,408][60143] Updated weights for policy 0, policy_version 60022 (0.0008) +[2023-10-09 06:38:03,779][60143] Updated weights for policy 0, policy_version 60032 (0.0008) +[2023-10-09 06:38:04,254][60144] Updated weights for policy 1, policy_version 60712 (0.0008) +[2023-10-09 06:38:04,617][60144] Updated weights for policy 1, policy_version 60722 (0.0010) +[2023-10-09 06:38:04,980][60144] Updated weights for policy 1, policy_version 60732 (0.0010) +[2023-10-09 06:38:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 123666432. Throughput: 0: 1691.8, 1: 1749.7. Samples: 30918354. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:38:06,053][59242] Avg episode reward: [(0, '32.380'), (1, '33.250')] +[2023-10-09 06:38:07,653][60143] Updated weights for policy 0, policy_version 60042 (0.0008) +[2023-10-09 06:38:08,028][60143] Updated weights for policy 0, policy_version 60052 (0.0009) +[2023-10-09 06:38:08,396][60143] Updated weights for policy 0, policy_version 60062 (0.0008) +[2023-10-09 06:38:09,149][60144] Updated weights for policy 1, policy_version 60742 (0.0008) +[2023-10-09 06:38:09,509][60144] Updated weights for policy 1, policy_version 60752 (0.0007) +[2023-10-09 06:38:09,875][60144] Updated weights for policy 1, policy_version 60762 (0.0007) +[2023-10-09 06:38:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 123731968. Throughput: 0: 1693.9, 1: 1727.4. Samples: 30938396. Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:38:11,053][59242] Avg episode reward: [(0, '30.990'), (1, '32.130')] +[2023-10-09 06:38:12,614][60143] Updated weights for policy 0, policy_version 60072 (0.0009) +[2023-10-09 06:38:12,977][60143] Updated weights for policy 0, policy_version 60082 (0.0007) +[2023-10-09 06:38:13,341][60143] Updated weights for policy 0, policy_version 60092 (0.0009) +[2023-10-09 06:38:13,848][60144] Updated weights for policy 1, policy_version 60772 (0.0008) +[2023-10-09 06:38:14,232][60144] Updated weights for policy 1, policy_version 60782 (0.0007) +[2023-10-09 06:38:14,601][60144] Updated weights for policy 1, policy_version 60792 (0.0009) +[2023-10-09 06:38:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 123797504. Throughput: 0: 1708.2, 1: 1707.1. Samples: 30958656. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 58.0) +[2023-10-09 06:38:16,053][59242] Avg episode reward: [(0, '32.590'), (1, '31.420')] +[2023-10-09 06:38:17,345][60143] Updated weights for policy 0, policy_version 60102 (0.0009) +[2023-10-09 06:38:17,734][60143] Updated weights for policy 0, policy_version 60112 (0.0008) +[2023-10-09 06:38:18,106][60143] Updated weights for policy 0, policy_version 60122 (0.0009) +[2023-10-09 06:38:18,630][60144] Updated weights for policy 1, policy_version 60802 (0.0009) +[2023-10-09 06:38:19,045][60144] Updated weights for policy 1, policy_version 60812 (0.0009) +[2023-10-09 06:38:19,416][60144] Updated weights for policy 1, policy_version 60822 (0.0007) +[2023-10-09 06:38:19,782][60144] Updated weights for policy 1, policy_version 60832 (0.0007) +[2023-10-09 06:38:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 123863040. Throughput: 0: 1677.2, 1: 1741.4. Samples: 30969238. Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:21,053][59242] Avg episode reward: [(0, '32.720'), (1, '31.520')] +[2023-10-09 06:38:22,086][60143] Updated weights for policy 0, policy_version 60132 (0.0007) +[2023-10-09 06:38:22,450][60143] Updated weights for policy 0, policy_version 60142 (0.0007) +[2023-10-09 06:38:22,817][60143] Updated weights for policy 0, policy_version 60152 (0.0008) +[2023-10-09 06:38:23,543][60144] Updated weights for policy 1, policy_version 60842 (0.0009) +[2023-10-09 06:38:23,899][60144] Updated weights for policy 1, policy_version 60852 (0.0008) +[2023-10-09 06:38:24,261][60144] Updated weights for policy 1, policy_version 60862 (0.0007) +[2023-10-09 06:38:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 123928576. Throughput: 0: 1705.7, 1: 1707.9. Samples: 30989198. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:26,053][59242] Avg episode reward: [(0, '32.430'), (1, '32.750')] +[2023-10-09 06:38:26,867][60143] Updated weights for policy 0, policy_version 60162 (0.0009) +[2023-10-09 06:38:27,233][60143] Updated weights for policy 0, policy_version 60172 (0.0009) +[2023-10-09 06:38:27,597][60143] Updated weights for policy 0, policy_version 60182 (0.0007) +[2023-10-09 06:38:27,955][60143] Updated weights for policy 0, policy_version 60192 (0.0007) +[2023-10-09 06:38:28,284][60144] Updated weights for policy 1, policy_version 60872 (0.0007) +[2023-10-09 06:38:28,652][60144] Updated weights for policy 1, policy_version 60882 (0.0009) +[2023-10-09 06:38:29,011][60144] Updated weights for policy 1, policy_version 60892 (0.0010) +[2023-10-09 06:38:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 123994112. Throughput: 0: 1716.2, 1: 1714.7. Samples: 31010460. Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:31,053][59242] Avg episode reward: [(0, '32.130'), (1, '31.330')] +[2023-10-09 06:38:32,008][60143] Updated weights for policy 0, policy_version 60202 (0.0010) +[2023-10-09 06:38:32,381][60143] Updated weights for policy 0, policy_version 60212 (0.0008) +[2023-10-09 06:38:32,744][60143] Updated weights for policy 0, policy_version 60222 (0.0007) +[2023-10-09 06:38:33,061][60144] Updated weights for policy 1, policy_version 60902 (0.0008) +[2023-10-09 06:38:33,427][60144] Updated weights for policy 1, policy_version 60912 (0.0008) +[2023-10-09 06:38:33,786][60144] Updated weights for policy 1, policy_version 60922 (0.0009) +[2023-10-09 06:38:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 124059648. Throughput: 0: 1692.9, 1: 1721.4. Samples: 31020412. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:36,053][59242] Avg episode reward: [(0, '32.320'), (1, '29.620')] +[2023-10-09 06:38:36,688][60143] Updated weights for policy 0, policy_version 60232 (0.0008) +[2023-10-09 06:38:37,061][60143] Updated weights for policy 0, policy_version 60242 (0.0007) +[2023-10-09 06:38:37,426][60143] Updated weights for policy 0, policy_version 60252 (0.0011) +[2023-10-09 06:38:37,782][60144] Updated weights for policy 1, policy_version 60932 (0.0008) +[2023-10-09 06:38:38,149][60144] Updated weights for policy 1, policy_version 60942 (0.0007) +[2023-10-09 06:38:38,523][60144] Updated weights for policy 1, policy_version 60952 (0.0007) +[2023-10-09 06:38:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 124125184. Throughput: 0: 1709.8, 1: 1708.5. Samples: 31040972. Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:41,052][59242] Avg episode reward: [(0, '32.630'), (1, '29.860')] +[2023-10-09 06:38:41,716][60143] Updated weights for policy 0, policy_version 60262 (0.0009) +[2023-10-09 06:38:42,079][60143] Updated weights for policy 0, policy_version 60272 (0.0008) +[2023-10-09 06:38:42,454][60143] Updated weights for policy 0, policy_version 60282 (0.0007) +[2023-10-09 06:38:42,600][60144] Updated weights for policy 1, policy_version 60962 (0.0007) +[2023-10-09 06:38:42,963][60144] Updated weights for policy 1, policy_version 60972 (0.0007) +[2023-10-09 06:38:43,318][60144] Updated weights for policy 1, policy_version 60982 (0.0008) +[2023-10-09 06:38:43,692][60144] Updated weights for policy 1, policy_version 60992 (0.0008) +[2023-10-09 06:38:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 124190720. Throughput: 0: 1703.2, 1: 1734.2. Samples: 31061982. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:46,052][59242] Avg episode reward: [(0, '31.630'), (1, '29.210')] +[2023-10-09 06:38:46,334][60143] Updated weights for policy 0, policy_version 60292 (0.0009) +[2023-10-09 06:38:46,708][60143] Updated weights for policy 0, policy_version 60302 (0.0010) +[2023-10-09 06:38:47,084][60143] Updated weights for policy 0, policy_version 60312 (0.0008) +[2023-10-09 06:38:47,565][60144] Updated weights for policy 1, policy_version 61002 (0.0010) +[2023-10-09 06:38:47,934][60144] Updated weights for policy 1, policy_version 61012 (0.0008) +[2023-10-09 06:38:48,301][60144] Updated weights for policy 1, policy_version 61022 (0.0009) +[2023-10-09 06:38:51,048][60143] Updated weights for policy 0, policy_version 60322 (0.0010) +[2023-10-09 06:38:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 124256256. Throughput: 0: 1692.6, 1: 1707.0. Samples: 31071336. Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:51,053][59242] Avg episode reward: [(0, '32.070'), (1, '29.460')] +[2023-10-09 06:38:51,420][60143] Updated weights for policy 0, policy_version 60332 (0.0009) +[2023-10-09 06:38:51,797][60143] Updated weights for policy 0, policy_version 60342 (0.0007) +[2023-10-09 06:38:52,161][60143] Updated weights for policy 0, policy_version 60352 (0.0008) +[2023-10-09 06:38:52,201][60144] Updated weights for policy 1, policy_version 61032 (0.0009) +[2023-10-09 06:38:52,558][60144] Updated weights for policy 1, policy_version 61042 (0.0010) +[2023-10-09 06:38:52,928][60144] Updated weights for policy 1, policy_version 61052 (0.0007) +[2023-10-09 06:38:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 124321792. Throughput: 0: 1695.4, 1: 1724.8. Samples: 31092304. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) +[2023-10-09 06:38:56,052][59242] Avg episode reward: [(0, '30.320'), (1, '29.130')] +[2023-10-09 06:38:56,101][60143] Updated weights for policy 0, policy_version 60362 (0.0011) +[2023-10-09 06:38:56,477][60143] Updated weights for policy 0, policy_version 60372 (0.0010) +[2023-10-09 06:38:56,790][60144] Updated weights for policy 1, policy_version 61062 (0.0008) +[2023-10-09 06:38:56,854][60143] Updated weights for policy 0, policy_version 60382 (0.0009) +[2023-10-09 06:38:57,161][60144] Updated weights for policy 1, policy_version 61072 (0.0010) +[2023-10-09 06:38:57,528][60144] Updated weights for policy 1, policy_version 61082 (0.0007) +[2023-10-09 06:39:00,724][60143] Updated weights for policy 0, policy_version 60392 (0.0008) +[2023-10-09 06:39:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 124387328. Throughput: 0: 1701.1, 1: 1744.0. Samples: 31113686. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:01,053][59242] Avg episode reward: [(0, '31.930'), (1, '29.700')] +[2023-10-09 06:39:01,096][60143] Updated weights for policy 0, policy_version 60402 (0.0010) +[2023-10-09 06:39:01,467][60143] Updated weights for policy 0, policy_version 60412 (0.0009) +[2023-10-09 06:39:01,485][60144] Updated weights for policy 1, policy_version 61092 (0.0007) +[2023-10-09 06:39:01,864][60144] Updated weights for policy 1, policy_version 61102 (0.0009) +[2023-10-09 06:39:02,223][60144] Updated weights for policy 1, policy_version 61112 (0.0008) +[2023-10-09 06:39:05,389][60143] Updated weights for policy 0, policy_version 60422 (0.0010) +[2023-10-09 06:39:05,755][60143] Updated weights for policy 0, policy_version 60432 (0.0009) +[2023-10-09 06:39:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 124452864. Throughput: 0: 1709.6, 1: 1711.7. Samples: 31123198. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:06,053][59242] Avg episode reward: [(0, '31.650'), (1, '29.130')] +[2023-10-09 06:39:06,135][60143] Updated weights for policy 0, policy_version 60442 (0.0009) +[2023-10-09 06:39:06,289][60144] Updated weights for policy 1, policy_version 61122 (0.0007) +[2023-10-09 06:39:06,681][60144] Updated weights for policy 1, policy_version 61132 (0.0007) +[2023-10-09 06:39:07,052][60144] Updated weights for policy 1, policy_version 61142 (0.0011) +[2023-10-09 06:39:07,416][60144] Updated weights for policy 1, policy_version 61152 (0.0008) +[2023-10-09 06:39:10,239][60143] Updated weights for policy 0, policy_version 60452 (0.0009) +[2023-10-09 06:39:10,616][60143] Updated weights for policy 0, policy_version 60462 (0.0009) +[2023-10-09 06:39:10,977][60143] Updated weights for policy 0, policy_version 60472 (0.0009) +[2023-10-09 06:39:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 124518400. Throughput: 0: 1715.5, 1: 1736.6. Samples: 31144542. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:11,052][59242] Avg episode reward: [(0, '30.300'), (1, '30.460')] +[2023-10-09 06:39:11,453][60144] Updated weights for policy 1, policy_version 61162 (0.0009) +[2023-10-09 06:39:11,818][60144] Updated weights for policy 1, policy_version 61172 (0.0009) +[2023-10-09 06:39:12,185][60144] Updated weights for policy 1, policy_version 61182 (0.0009) +[2023-10-09 06:39:14,809][60143] Updated weights for policy 0, policy_version 60482 (0.0008) +[2023-10-09 06:39:15,181][60143] Updated weights for policy 0, policy_version 60492 (0.0008) +[2023-10-09 06:39:15,537][60143] Updated weights for policy 0, policy_version 60502 (0.0009) +[2023-10-09 06:39:15,905][60143] Updated weights for policy 0, policy_version 60512 (0.0007) +[2023-10-09 06:39:15,917][60144] Updated weights for policy 1, policy_version 61192 (0.0009) +[2023-10-09 06:39:16,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 124616704. Throughput: 0: 1700.8, 1: 1737.8. Samples: 31165198. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:16,053][59242] Avg episode reward: [(0, '29.730'), (1, '30.960')] +[2023-10-09 06:39:16,282][60144] Updated weights for policy 1, policy_version 61202 (0.0009) +[2023-10-09 06:39:16,652][60144] Updated weights for policy 1, policy_version 61212 (0.0007) +[2023-10-09 06:39:19,976][60143] Updated weights for policy 0, policy_version 60522 (0.0008) +[2023-10-09 06:39:20,355][60143] Updated weights for policy 0, policy_version 60532 (0.0008) +[2023-10-09 06:39:20,385][60144] Updated weights for policy 1, policy_version 61222 (0.0008) +[2023-10-09 06:39:20,726][60143] Updated weights for policy 0, policy_version 60542 (0.0008) +[2023-10-09 06:39:20,748][60144] Updated weights for policy 1, policy_version 61232 (0.0009) +[2023-10-09 06:39:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 124682240. 
Throughput: 0: 1719.3, 1: 1723.3. Samples: 31175330. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:21,053][59242] Avg episode reward: [(0, '30.190'), (1, '29.470')] +[2023-10-09 06:39:21,127][60144] Updated weights for policy 1, policy_version 61242 (0.0008) +[2023-10-09 06:39:24,761][60143] Updated weights for policy 0, policy_version 60552 (0.0007) +[2023-10-09 06:39:25,013][60144] Updated weights for policy 1, policy_version 61252 (0.0009) +[2023-10-09 06:39:25,133][60143] Updated weights for policy 0, policy_version 60562 (0.0007) +[2023-10-09 06:39:25,372][60144] Updated weights for policy 1, policy_version 61262 (0.0007) +[2023-10-09 06:39:25,498][60143] Updated weights for policy 0, policy_version 60572 (0.0009) +[2023-10-09 06:39:25,746][60144] Updated weights for policy 1, policy_version 61272 (0.0009) +[2023-10-09 06:39:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 124780544. Throughput: 0: 1725.3, 1: 1736.4. Samples: 31196748. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:26,052][59242] Avg episode reward: [(0, '30.180'), (1, '28.150')] +[2023-10-09 06:39:29,530][60143] Updated weights for policy 0, policy_version 60582 (0.0008) +[2023-10-09 06:39:29,591][60144] Updated weights for policy 1, policy_version 61282 (0.0007) +[2023-10-09 06:39:29,908][60143] Updated weights for policy 0, policy_version 60592 (0.0008) +[2023-10-09 06:39:29,959][60144] Updated weights for policy 1, policy_version 61292 (0.0009) +[2023-10-09 06:39:30,277][60143] Updated weights for policy 0, policy_version 60602 (0.0008) +[2023-10-09 06:39:30,331][60144] Updated weights for policy 1, policy_version 61302 (0.0009) +[2023-10-09 06:39:30,694][60144] Updated weights for policy 1, policy_version 61312 (0.0008) +[2023-10-09 06:39:31,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 124846080. Throughput: 0: 1701.7, 1: 1717.3. Samples: 31215840. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:31,053][59242] Avg episode reward: [(0, '30.570'), (1, '29.320')] +[2023-10-09 06:39:31,064][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000060608_62062592.pth... +[2023-10-09 06:39:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000061312_62783488.pth... +[2023-10-09 06:39:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000059680_61112320.pth +[2023-10-09 06:39:31,101][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000059008_60424192.pth +[2023-10-09 06:39:34,208][60143] Updated weights for policy 0, policy_version 60612 (0.0009) +[2023-10-09 06:39:34,581][60143] Updated weights for policy 0, policy_version 60622 (0.0009) +[2023-10-09 06:39:34,837][60144] Updated weights for policy 1, policy_version 61322 (0.0008) +[2023-10-09 06:39:34,939][60143] Updated weights for policy 0, policy_version 60632 (0.0007) +[2023-10-09 06:39:35,212][60144] Updated weights for policy 1, policy_version 61332 (0.0007) +[2023-10-09 06:39:35,576][60144] Updated weights for policy 1, policy_version 61342 (0.0008) +[2023-10-09 06:39:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 124911616. Throughput: 0: 1727.5, 1: 1738.6. Samples: 31227310. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-09 06:39:36,053][59242] Avg episode reward: [(0, '30.610'), (1, '29.490')] +[2023-10-09 06:39:38,880][60143] Updated weights for policy 0, policy_version 60642 (0.0008) +[2023-10-09 06:39:39,249][60143] Updated weights for policy 0, policy_version 60652 (0.0007) +[2023-10-09 06:39:39,392][60144] Updated weights for policy 1, policy_version 61352 (0.0008) +[2023-10-09 06:39:39,617][60143] Updated weights for policy 0, policy_version 60662 (0.0008) +[2023-10-09 06:39:39,755][60144] Updated weights for policy 1, policy_version 61362 (0.0008) +[2023-10-09 06:39:39,987][60143] Updated weights for policy 0, policy_version 60672 (0.0008) +[2023-10-09 06:39:40,127][60144] Updated weights for policy 1, policy_version 61372 (0.0007) +[2023-10-09 06:39:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13884.8). Total num frames: 124977152. Throughput: 0: 1715.4, 1: 1729.6. Samples: 31247326. Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:39:41,053][59242] Avg episode reward: [(0, '29.810'), (1, '29.590')] +[2023-10-09 06:39:43,918][60143] Updated weights for policy 0, policy_version 60682 (0.0011) +[2023-10-09 06:39:44,085][60144] Updated weights for policy 1, policy_version 61382 (0.0007) +[2023-10-09 06:39:44,288][60143] Updated weights for policy 0, policy_version 60692 (0.0010) +[2023-10-09 06:39:44,453][60144] Updated weights for policy 1, policy_version 61392 (0.0007) +[2023-10-09 06:39:44,650][60143] Updated weights for policy 0, policy_version 60702 (0.0008) +[2023-10-09 06:39:44,816][60144] Updated weights for policy 1, policy_version 61402 (0.0009) +[2023-10-09 06:39:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 125042688. Throughput: 0: 1699.2, 1: 1712.4. Samples: 31267210. 
Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:39:46,053][59242] Avg episode reward: [(0, '29.510'), (1, '28.730')] +[2023-10-09 06:39:48,706][60143] Updated weights for policy 0, policy_version 60712 (0.0010) +[2023-10-09 06:39:48,874][60144] Updated weights for policy 1, policy_version 61412 (0.0008) +[2023-10-09 06:39:49,078][60143] Updated weights for policy 0, policy_version 60722 (0.0009) +[2023-10-09 06:39:49,237][60144] Updated weights for policy 1, policy_version 61422 (0.0008) +[2023-10-09 06:39:49,442][60143] Updated weights for policy 0, policy_version 60732 (0.0009) +[2023-10-09 06:39:49,611][60144] Updated weights for policy 1, policy_version 61432 (0.0007) +[2023-10-09 06:39:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 125108224. Throughput: 0: 1718.9, 1: 1745.3. Samples: 31279090. Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:39:51,052][59242] Avg episode reward: [(0, '30.290'), (1, '29.040')] +[2023-10-09 06:39:53,409][60143] Updated weights for policy 0, policy_version 60742 (0.0009) +[2023-10-09 06:39:53,605][60144] Updated weights for policy 1, policy_version 61442 (0.0007) +[2023-10-09 06:39:53,786][60143] Updated weights for policy 0, policy_version 60752 (0.0009) +[2023-10-09 06:39:54,013][60144] Updated weights for policy 1, policy_version 61452 (0.0008) +[2023-10-09 06:39:54,158][60143] Updated weights for policy 0, policy_version 60762 (0.0007) +[2023-10-09 06:39:54,380][60144] Updated weights for policy 1, policy_version 61462 (0.0009) +[2023-10-09 06:39:54,734][60144] Updated weights for policy 1, policy_version 61472 (0.0008) +[2023-10-09 06:39:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 125173760. Throughput: 0: 1686.0, 1: 1727.3. Samples: 31298142. 
Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:39:56,052][59242] Avg episode reward: [(0, '31.370'), (1, '28.240')] +[2023-10-09 06:39:58,069][60143] Updated weights for policy 0, policy_version 60772 (0.0009) +[2023-10-09 06:39:58,431][60143] Updated weights for policy 0, policy_version 60782 (0.0008) +[2023-10-09 06:39:58,677][60144] Updated weights for policy 1, policy_version 61482 (0.0007) +[2023-10-09 06:39:58,801][60143] Updated weights for policy 0, policy_version 60792 (0.0007) +[2023-10-09 06:39:59,035][60144] Updated weights for policy 1, policy_version 61492 (0.0007) +[2023-10-09 06:39:59,411][60144] Updated weights for policy 1, policy_version 61502 (0.0008) +[2023-10-09 06:40:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 125239296. Throughput: 0: 1701.8, 1: 1714.0. Samples: 31318910. Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:40:01,053][59242] Avg episode reward: [(0, '31.850'), (1, '28.050')] +[2023-10-09 06:40:02,872][60143] Updated weights for policy 0, policy_version 60802 (0.0008) +[2023-10-09 06:40:03,244][60143] Updated weights for policy 0, policy_version 60812 (0.0008) +[2023-10-09 06:40:03,309][60144] Updated weights for policy 1, policy_version 61512 (0.0008) +[2023-10-09 06:40:03,619][60143] Updated weights for policy 0, policy_version 60822 (0.0008) +[2023-10-09 06:40:03,669][60144] Updated weights for policy 1, policy_version 61522 (0.0007) +[2023-10-09 06:40:03,980][60143] Updated weights for policy 0, policy_version 60832 (0.0009) +[2023-10-09 06:40:04,022][60144] Updated weights for policy 1, policy_version 61532 (0.0007) +[2023-10-09 06:40:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 125304832. Throughput: 0: 1699.4, 1: 1728.4. Samples: 31329580. 
Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:40:06,052][59242] Avg episode reward: [(0, '30.840'), (1, '28.320')] +[2023-10-09 06:40:07,737][60144] Updated weights for policy 1, policy_version 61542 (0.0007) +[2023-10-09 06:40:07,952][60143] Updated weights for policy 0, policy_version 60842 (0.0009) +[2023-10-09 06:40:08,108][60144] Updated weights for policy 1, policy_version 61552 (0.0008) +[2023-10-09 06:40:08,329][60143] Updated weights for policy 0, policy_version 60852 (0.0009) +[2023-10-09 06:40:08,464][60144] Updated weights for policy 1, policy_version 61562 (0.0009) +[2023-10-09 06:40:08,700][60143] Updated weights for policy 0, policy_version 60862 (0.0009) +[2023-10-09 06:40:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 125370368. Throughput: 0: 1685.6, 1: 1713.4. Samples: 31349704. Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:40:11,052][59242] Avg episode reward: [(0, '31.210'), (1, '28.340')] +[2023-10-09 06:40:12,601][60144] Updated weights for policy 1, policy_version 61572 (0.0008) +[2023-10-09 06:40:12,633][60143] Updated weights for policy 0, policy_version 60872 (0.0008) +[2023-10-09 06:40:12,971][60144] Updated weights for policy 1, policy_version 61582 (0.0008) +[2023-10-09 06:40:13,013][60143] Updated weights for policy 0, policy_version 60882 (0.0009) +[2023-10-09 06:40:13,339][60144] Updated weights for policy 1, policy_version 61592 (0.0008) +[2023-10-09 06:40:13,391][60143] Updated weights for policy 0, policy_version 60892 (0.0007) +[2023-10-09 06:40:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 125435904. Throughput: 0: 1713.4, 1: 1725.8. Samples: 31370606. 
Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:40:16,053][59242] Avg episode reward: [(0, '30.760'), (1, '27.910')] +[2023-10-09 06:40:17,354][60144] Updated weights for policy 1, policy_version 61602 (0.0008) +[2023-10-09 06:40:17,406][60143] Updated weights for policy 0, policy_version 60902 (0.0007) +[2023-10-09 06:40:17,720][60144] Updated weights for policy 1, policy_version 61612 (0.0009) +[2023-10-09 06:40:17,770][60143] Updated weights for policy 0, policy_version 60912 (0.0007) +[2023-10-09 06:40:18,072][60144] Updated weights for policy 1, policy_version 61622 (0.0008) +[2023-10-09 06:40:18,149][60143] Updated weights for policy 0, policy_version 60922 (0.0007) +[2023-10-09 06:40:18,443][60144] Updated weights for policy 1, policy_version 61632 (0.0009) +[2023-10-09 06:40:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 125501440. Throughput: 0: 1686.4, 1: 1703.2. Samples: 31379844. Policy #0 lag: (min: 29.0, avg: 29.9, max: 49.0) +[2023-10-09 06:40:21,053][59242] Avg episode reward: [(0, '32.410'), (1, '29.340')] +[2023-10-09 06:40:21,995][60143] Updated weights for policy 0, policy_version 60932 (0.0007) +[2023-10-09 06:40:22,361][60143] Updated weights for policy 0, policy_version 60942 (0.0009) +[2023-10-09 06:40:22,516][60144] Updated weights for policy 1, policy_version 61642 (0.0007) +[2023-10-09 06:40:22,734][60143] Updated weights for policy 0, policy_version 60952 (0.0009) +[2023-10-09 06:40:22,883][60144] Updated weights for policy 1, policy_version 61652 (0.0008) +[2023-10-09 06:40:23,260][60144] Updated weights for policy 1, policy_version 61662 (0.0007) +[2023-10-09 06:40:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125566976. Throughput: 0: 1704.4, 1: 1706.0. Samples: 31400796. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:26,053][59242] Avg episode reward: [(0, '32.550'), (1, '29.070')] +[2023-10-09 06:40:26,834][60143] Updated weights for policy 0, policy_version 60962 (0.0008) +[2023-10-09 06:40:27,196][60143] Updated weights for policy 0, policy_version 60972 (0.0007) +[2023-10-09 06:40:27,305][60144] Updated weights for policy 1, policy_version 61672 (0.0008) +[2023-10-09 06:40:27,567][60143] Updated weights for policy 0, policy_version 60982 (0.0009) +[2023-10-09 06:40:27,666][60144] Updated weights for policy 1, policy_version 61682 (0.0007) +[2023-10-09 06:40:27,931][60143] Updated weights for policy 0, policy_version 60992 (0.0008) +[2023-10-09 06:40:28,031][60144] Updated weights for policy 1, policy_version 61692 (0.0007) +[2023-10-09 06:40:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125632512. Throughput: 0: 1713.5, 1: 1729.7. Samples: 31422154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:31,053][59242] Avg episode reward: [(0, '32.390'), (1, '29.460')] +[2023-10-09 06:40:31,889][60144] Updated weights for policy 1, policy_version 61702 (0.0008) +[2023-10-09 06:40:32,151][60143] Updated weights for policy 0, policy_version 61002 (0.0007) +[2023-10-09 06:40:32,251][60144] Updated weights for policy 1, policy_version 61712 (0.0009) +[2023-10-09 06:40:32,514][60143] Updated weights for policy 0, policy_version 61012 (0.0007) +[2023-10-09 06:40:32,629][60144] Updated weights for policy 1, policy_version 61722 (0.0008) +[2023-10-09 06:40:32,880][60143] Updated weights for policy 0, policy_version 61022 (0.0009) +[2023-10-09 06:40:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125698048. Throughput: 0: 1685.2, 1: 1697.3. Samples: 31431304. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:36,053][59242] Avg episode reward: [(0, '32.470'), (1, '30.100')] +[2023-10-09 06:40:36,558][60144] Updated weights for policy 1, policy_version 61732 (0.0009) +[2023-10-09 06:40:36,930][60144] Updated weights for policy 1, policy_version 61742 (0.0007) +[2023-10-09 06:40:36,936][60143] Updated weights for policy 0, policy_version 61032 (0.0008) +[2023-10-09 06:40:37,287][60144] Updated weights for policy 1, policy_version 61752 (0.0007) +[2023-10-09 06:40:37,303][60143] Updated weights for policy 0, policy_version 61042 (0.0007) +[2023-10-09 06:40:37,672][60143] Updated weights for policy 0, policy_version 61052 (0.0007) +[2023-10-09 06:40:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125763584. Throughput: 0: 1714.2, 1: 1722.2. Samples: 31452780. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:41,053][59242] Avg episode reward: [(0, '32.920'), (1, '29.670')] +[2023-10-09 06:40:41,276][60144] Updated weights for policy 1, policy_version 61762 (0.0009) +[2023-10-09 06:40:41,691][60144] Updated weights for policy 1, policy_version 61772 (0.0009) +[2023-10-09 06:40:41,772][60143] Updated weights for policy 0, policy_version 61062 (0.0009) +[2023-10-09 06:40:42,049][60144] Updated weights for policy 1, policy_version 61782 (0.0009) +[2023-10-09 06:40:42,148][60143] Updated weights for policy 0, policy_version 61072 (0.0009) +[2023-10-09 06:40:42,411][60144] Updated weights for policy 1, policy_version 61792 (0.0008) +[2023-10-09 06:40:42,520][60143] Updated weights for policy 0, policy_version 61082 (0.0008) +[2023-10-09 06:40:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125829120. Throughput: 0: 1704.8, 1: 1726.6. Samples: 31473326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:46,053][59242] Avg episode reward: [(0, '32.720'), (1, '30.720')] +[2023-10-09 06:40:46,263][60144] Updated weights for policy 1, policy_version 61802 (0.0010) +[2023-10-09 06:40:46,557][60143] Updated weights for policy 0, policy_version 61092 (0.0008) +[2023-10-09 06:40:46,636][60144] Updated weights for policy 1, policy_version 61812 (0.0007) +[2023-10-09 06:40:46,926][60143] Updated weights for policy 0, policy_version 61102 (0.0008) +[2023-10-09 06:40:46,997][60144] Updated weights for policy 1, policy_version 61822 (0.0008) +[2023-10-09 06:40:47,293][60143] Updated weights for policy 0, policy_version 61112 (0.0007) +[2023-10-09 06:40:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125894656. Throughput: 0: 1688.7, 1: 1710.9. Samples: 31482562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:51,053][59242] Avg episode reward: [(0, '33.910'), (1, '30.150')] +[2023-10-09 06:40:51,168][60144] Updated weights for policy 1, policy_version 61832 (0.0010) +[2023-10-09 06:40:51,337][60143] Updated weights for policy 0, policy_version 61122 (0.0010) +[2023-10-09 06:40:51,534][60144] Updated weights for policy 1, policy_version 61842 (0.0007) +[2023-10-09 06:40:51,701][60143] Updated weights for policy 0, policy_version 61132 (0.0007) +[2023-10-09 06:40:51,897][60144] Updated weights for policy 1, policy_version 61852 (0.0010) +[2023-10-09 06:40:52,075][60143] Updated weights for policy 0, policy_version 61142 (0.0009) +[2023-10-09 06:40:52,443][60143] Updated weights for policy 0, policy_version 61152 (0.0010) +[2023-10-09 06:40:55,753][60144] Updated weights for policy 1, policy_version 61862 (0.0008) +[2023-10-09 06:40:56,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 125960192. Throughput: 0: 1699.3, 1: 1720.5. Samples: 31503594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:40:56,052][59242] Avg episode reward: [(0, '34.540'), (1, '30.700')] +[2023-10-09 06:40:56,126][60144] Updated weights for policy 1, policy_version 61872 (0.0010) +[2023-10-09 06:40:56,408][60143] Updated weights for policy 0, policy_version 61162 (0.0007) +[2023-10-09 06:40:56,498][60144] Updated weights for policy 1, policy_version 61882 (0.0008) +[2023-10-09 06:40:56,777][60143] Updated weights for policy 0, policy_version 61172 (0.0007) +[2023-10-09 06:40:57,154][60143] Updated weights for policy 0, policy_version 61182 (0.0010) +[2023-10-09 06:41:00,558][60144] Updated weights for policy 1, policy_version 61892 (0.0007) +[2023-10-09 06:41:00,918][60144] Updated weights for policy 1, policy_version 61902 (0.0008) +[2023-10-09 06:41:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 126025728. Throughput: 0: 1701.0, 1: 1722.4. Samples: 31524656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:41:01,053][59242] Avg episode reward: [(0, '34.930'), (1, '31.170')] +[2023-10-09 06:41:01,118][60143] Updated weights for policy 0, policy_version 61192 (0.0009) +[2023-10-09 06:41:01,291][60144] Updated weights for policy 1, policy_version 61912 (0.0008) +[2023-10-09 06:41:01,484][60143] Updated weights for policy 0, policy_version 61202 (0.0009) +[2023-10-09 06:41:01,857][60143] Updated weights for policy 0, policy_version 61212 (0.0007) +[2023-10-09 06:41:05,078][60144] Updated weights for policy 1, policy_version 61922 (0.0007) +[2023-10-09 06:41:05,438][60144] Updated weights for policy 1, policy_version 61932 (0.0009) +[2023-10-09 06:41:05,808][60144] Updated weights for policy 1, policy_version 61942 (0.0010) +[2023-10-09 06:41:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 126091264. Throughput: 0: 1702.1, 1: 1728.1. Samples: 31534204. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:06,053][59242] Avg episode reward: [(0, '34.880'), (1, '31.280')] +[2023-10-09 06:41:06,063][60143] Updated weights for policy 0, policy_version 61222 (0.0007) +[2023-10-09 06:41:06,180][60144] Updated weights for policy 1, policy_version 61952 (0.0007) +[2023-10-09 06:41:06,430][60143] Updated weights for policy 0, policy_version 61232 (0.0007) +[2023-10-09 06:41:06,799][60143] Updated weights for policy 0, policy_version 61242 (0.0007) +[2023-10-09 06:41:10,162][60144] Updated weights for policy 1, policy_version 61962 (0.0007) +[2023-10-09 06:41:10,526][60144] Updated weights for policy 1, policy_version 61972 (0.0007) +[2023-10-09 06:41:10,708][60143] Updated weights for policy 0, policy_version 61252 (0.0008) +[2023-10-09 06:41:10,895][60144] Updated weights for policy 1, policy_version 61982 (0.0007) +[2023-10-09 06:41:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 126189568. Throughput: 0: 1695.4, 1: 1739.6. Samples: 31555370. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:11,052][59242] Avg episode reward: [(0, '33.070'), (1, '31.620')] +[2023-10-09 06:41:11,065][60143] Updated weights for policy 0, policy_version 61262 (0.0009) +[2023-10-09 06:41:11,445][60143] Updated weights for policy 0, policy_version 61272 (0.0011) +[2023-10-09 06:41:14,739][60144] Updated weights for policy 1, policy_version 61992 (0.0008) +[2023-10-09 06:41:15,104][60144] Updated weights for policy 1, policy_version 62002 (0.0010) +[2023-10-09 06:41:15,432][60143] Updated weights for policy 0, policy_version 61282 (0.0011) +[2023-10-09 06:41:15,467][60144] Updated weights for policy 1, policy_version 62012 (0.0008) +[2023-10-09 06:41:15,802][60143] Updated weights for policy 0, policy_version 61292 (0.0008) +[2023-10-09 06:41:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 126255104. 
Throughput: 0: 1697.7, 1: 1704.0. Samples: 31575234. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:16,052][59242] Avg episode reward: [(0, '33.960'), (1, '32.150')] +[2023-10-09 06:41:16,165][60143] Updated weights for policy 0, policy_version 61302 (0.0010) +[2023-10-09 06:41:16,546][60143] Updated weights for policy 0, policy_version 61312 (0.0009) +[2023-10-09 06:41:19,398][60144] Updated weights for policy 1, policy_version 62022 (0.0009) +[2023-10-09 06:41:19,765][60144] Updated weights for policy 1, policy_version 62032 (0.0010) +[2023-10-09 06:41:20,132][60144] Updated weights for policy 1, policy_version 62042 (0.0009) +[2023-10-09 06:41:20,442][60143] Updated weights for policy 0, policy_version 61322 (0.0009) +[2023-10-09 06:41:20,818][60143] Updated weights for policy 0, policy_version 61332 (0.0007) +[2023-10-09 06:41:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 126320640. Throughput: 0: 1705.5, 1: 1733.5. Samples: 31586056. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:21,053][59242] Avg episode reward: [(0, '33.260'), (1, '31.890')] +[2023-10-09 06:41:21,190][60143] Updated weights for policy 0, policy_version 61342 (0.0007) +[2023-10-09 06:41:24,126][60144] Updated weights for policy 1, policy_version 62052 (0.0007) +[2023-10-09 06:41:24,502][60144] Updated weights for policy 1, policy_version 62062 (0.0007) +[2023-10-09 06:41:24,871][60144] Updated weights for policy 1, policy_version 62072 (0.0007) +[2023-10-09 06:41:25,217][60143] Updated weights for policy 0, policy_version 61352 (0.0009) +[2023-10-09 06:41:25,578][60143] Updated weights for policy 0, policy_version 61362 (0.0007) +[2023-10-09 06:41:25,948][60143] Updated weights for policy 0, policy_version 61372 (0.0007) +[2023-10-09 06:41:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 126386176. Throughput: 0: 1704.9, 1: 1717.2. Samples: 31606776. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:26,053][59242] Avg episode reward: [(0, '31.640'), (1, '32.330')] +[2023-10-09 06:41:28,933][60144] Updated weights for policy 1, policy_version 62082 (0.0007) +[2023-10-09 06:41:29,354][60144] Updated weights for policy 1, policy_version 62092 (0.0007) +[2023-10-09 06:41:29,713][60144] Updated weights for policy 1, policy_version 62102 (0.0007) +[2023-10-09 06:41:29,802][60143] Updated weights for policy 0, policy_version 61382 (0.0009) +[2023-10-09 06:41:30,083][60144] Updated weights for policy 1, policy_version 62112 (0.0007) +[2023-10-09 06:41:30,183][60143] Updated weights for policy 0, policy_version 61392 (0.0008) +[2023-10-09 06:41:30,559][60143] Updated weights for policy 0, policy_version 61402 (0.0008) +[2023-10-09 06:41:31,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126484480. Throughput: 0: 1693.9, 1: 1701.4. Samples: 31626112. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:31,053][59242] Avg episode reward: [(0, '30.610'), (1, '32.530')] +[2023-10-09 06:41:31,059][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000062112_63602688.pth... +[2023-10-09 06:41:31,059][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000061408_62881792.pth... 
+[2023-10-09 06:41:31,090][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000060480_61931520.pth +[2023-10-09 06:41:31,090][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000059808_61243392.pth +[2023-10-09 06:41:33,949][60144] Updated weights for policy 1, policy_version 62122 (0.0008) +[2023-10-09 06:41:34,316][60144] Updated weights for policy 1, policy_version 62132 (0.0008) +[2023-10-09 06:41:34,348][60143] Updated weights for policy 0, policy_version 61412 (0.0010) +[2023-10-09 06:41:34,682][60144] Updated weights for policy 1, policy_version 62142 (0.0009) +[2023-10-09 06:41:34,725][60143] Updated weights for policy 0, policy_version 61422 (0.0008) +[2023-10-09 06:41:35,096][60143] Updated weights for policy 0, policy_version 61432 (0.0009) +[2023-10-09 06:41:36,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126550016. Throughput: 0: 1719.5, 1: 1729.1. Samples: 31637748. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:36,053][59242] Avg episode reward: [(0, '31.410'), (1, '31.500')] +[2023-10-09 06:41:38,803][60144] Updated weights for policy 1, policy_version 62152 (0.0007) +[2023-10-09 06:41:39,113][60143] Updated weights for policy 0, policy_version 61442 (0.0008) +[2023-10-09 06:41:39,181][60144] Updated weights for policy 1, policy_version 62162 (0.0009) +[2023-10-09 06:41:39,488][60143] Updated weights for policy 0, policy_version 61452 (0.0009) +[2023-10-09 06:41:39,549][60144] Updated weights for policy 1, policy_version 62172 (0.0010) +[2023-10-09 06:41:39,861][60143] Updated weights for policy 0, policy_version 61462 (0.0009) +[2023-10-09 06:41:40,229][60143] Updated weights for policy 0, policy_version 61472 (0.0009) +[2023-10-09 06:41:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126615552. Throughput: 0: 1710.6, 1: 1704.4. Samples: 31657272. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 06:41:41,052][59242] Avg episode reward: [(0, '30.460'), (1, '32.450')] +[2023-10-09 06:41:43,666][60144] Updated weights for policy 1, policy_version 62182 (0.0009) +[2023-10-09 06:41:44,025][60144] Updated weights for policy 1, policy_version 62192 (0.0007) +[2023-10-09 06:41:44,242][60143] Updated weights for policy 0, policy_version 61482 (0.0010) +[2023-10-09 06:41:44,399][60144] Updated weights for policy 1, policy_version 62202 (0.0008) +[2023-10-09 06:41:44,613][60143] Updated weights for policy 0, policy_version 61492 (0.0007) +[2023-10-09 06:41:44,973][60143] Updated weights for policy 0, policy_version 61502 (0.0007) +[2023-10-09 06:41:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126681088. Throughput: 0: 1692.0, 1: 1698.9. Samples: 31677246. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:41:46,053][59242] Avg episode reward: [(0, '29.210'), (1, '32.060')] +[2023-10-09 06:41:48,301][60144] Updated weights for policy 1, policy_version 62212 (0.0009) +[2023-10-09 06:41:48,673][60144] Updated weights for policy 1, policy_version 62222 (0.0011) +[2023-10-09 06:41:48,996][60143] Updated weights for policy 0, policy_version 61512 (0.0009) +[2023-10-09 06:41:49,037][60144] Updated weights for policy 1, policy_version 62232 (0.0009) +[2023-10-09 06:41:49,367][60143] Updated weights for policy 0, policy_version 61522 (0.0009) +[2023-10-09 06:41:49,740][60143] Updated weights for policy 0, policy_version 61532 (0.0008) +[2023-10-09 06:41:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126746624. Throughput: 0: 1719.9, 1: 1715.6. Samples: 31688800. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:41:51,052][59242] Avg episode reward: [(0, '29.790'), (1, '30.830')] +[2023-10-09 06:41:52,817][60144] Updated weights for policy 1, policy_version 62242 (0.0008) +[2023-10-09 06:41:53,193][60144] Updated weights for policy 1, policy_version 62252 (0.0008) +[2023-10-09 06:41:53,566][60144] Updated weights for policy 1, policy_version 62262 (0.0008) +[2023-10-09 06:41:53,602][60143] Updated weights for policy 0, policy_version 61542 (0.0008) +[2023-10-09 06:41:53,929][60144] Updated weights for policy 1, policy_version 62272 (0.0009) +[2023-10-09 06:41:53,962][60143] Updated weights for policy 0, policy_version 61552 (0.0007) +[2023-10-09 06:41:54,329][60143] Updated weights for policy 0, policy_version 61562 (0.0007) +[2023-10-09 06:41:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126812160. Throughput: 0: 1700.7, 1: 1694.6. Samples: 31708158. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:41:56,053][59242] Avg episode reward: [(0, '31.110'), (1, '31.130')] +[2023-10-09 06:41:58,030][60144] Updated weights for policy 1, policy_version 62282 (0.0008) +[2023-10-09 06:41:58,318][60143] Updated weights for policy 0, policy_version 61572 (0.0008) +[2023-10-09 06:41:58,390][60144] Updated weights for policy 1, policy_version 62292 (0.0007) +[2023-10-09 06:41:58,686][60143] Updated weights for policy 0, policy_version 61582 (0.0009) +[2023-10-09 06:41:58,753][60144] Updated weights for policy 1, policy_version 62302 (0.0007) +[2023-10-09 06:41:59,055][60143] Updated weights for policy 0, policy_version 61592 (0.0009) +[2023-10-09 06:42:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126877696. Throughput: 0: 1700.1, 1: 1725.0. Samples: 31729362. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:01,053][59242] Avg episode reward: [(0, '30.670'), (1, '30.390')] +[2023-10-09 06:42:02,847][60144] Updated weights for policy 1, policy_version 62312 (0.0008) +[2023-10-09 06:42:03,112][60143] Updated weights for policy 0, policy_version 61602 (0.0008) +[2023-10-09 06:42:03,205][60144] Updated weights for policy 1, policy_version 62322 (0.0008) +[2023-10-09 06:42:03,472][60143] Updated weights for policy 0, policy_version 61612 (0.0008) +[2023-10-09 06:42:03,578][60144] Updated weights for policy 1, policy_version 62332 (0.0008) +[2023-10-09 06:42:03,840][60143] Updated weights for policy 0, policy_version 61622 (0.0007) +[2023-10-09 06:42:04,208][60143] Updated weights for policy 0, policy_version 61632 (0.0007) +[2023-10-09 06:42:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 126943232. Throughput: 0: 1711.6, 1: 1698.3. Samples: 31739500. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:06,052][59242] Avg episode reward: [(0, '31.210'), (1, '29.980')] +[2023-10-09 06:42:07,494][60144] Updated weights for policy 1, policy_version 62342 (0.0007) +[2023-10-09 06:42:07,859][60144] Updated weights for policy 1, policy_version 62352 (0.0007) +[2023-10-09 06:42:08,220][60144] Updated weights for policy 1, policy_version 62362 (0.0007) +[2023-10-09 06:42:08,231][60143] Updated weights for policy 0, policy_version 61642 (0.0008) +[2023-10-09 06:42:08,600][60143] Updated weights for policy 0, policy_version 61652 (0.0009) +[2023-10-09 06:42:08,977][60143] Updated weights for policy 0, policy_version 61662 (0.0008) +[2023-10-09 06:42:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127008768. Throughput: 0: 1693.3, 1: 1705.5. Samples: 31759718. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:11,052][59242] Avg episode reward: [(0, '30.950'), (1, '30.050')] +[2023-10-09 06:42:12,131][60144] Updated weights for policy 1, policy_version 62372 (0.0008) +[2023-10-09 06:42:12,505][60144] Updated weights for policy 1, policy_version 62382 (0.0009) +[2023-10-09 06:42:12,868][60144] Updated weights for policy 1, policy_version 62392 (0.0007) +[2023-10-09 06:42:13,159][60143] Updated weights for policy 0, policy_version 61672 (0.0010) +[2023-10-09 06:42:13,530][60143] Updated weights for policy 0, policy_version 61682 (0.0008) +[2023-10-09 06:42:13,898][60143] Updated weights for policy 0, policy_version 61692 (0.0011) +[2023-10-09 06:42:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127074304. Throughput: 0: 1713.6, 1: 1728.6. Samples: 31781012. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:16,053][59242] Avg episode reward: [(0, '30.120'), (1, '30.400')] +[2023-10-09 06:42:16,856][60144] Updated weights for policy 1, policy_version 62402 (0.0008) +[2023-10-09 06:42:17,263][60144] Updated weights for policy 1, policy_version 62412 (0.0009) +[2023-10-09 06:42:17,639][60144] Updated weights for policy 1, policy_version 62422 (0.0009) +[2023-10-09 06:42:17,828][60143] Updated weights for policy 0, policy_version 61702 (0.0011) +[2023-10-09 06:42:18,000][60144] Updated weights for policy 1, policy_version 62432 (0.0007) +[2023-10-09 06:42:18,206][60143] Updated weights for policy 0, policy_version 61712 (0.0009) +[2023-10-09 06:42:18,579][60143] Updated weights for policy 0, policy_version 61722 (0.0009) +[2023-10-09 06:42:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127139840. Throughput: 0: 1697.2, 1: 1698.6. Samples: 31790560. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:21,053][59242] Avg episode reward: [(0, '31.000'), (1, '31.330')] +[2023-10-09 06:42:21,894][60144] Updated weights for policy 1, policy_version 62442 (0.0007) +[2023-10-09 06:42:22,262][60144] Updated weights for policy 1, policy_version 62452 (0.0009) +[2023-10-09 06:42:22,574][60143] Updated weights for policy 0, policy_version 61732 (0.0008) +[2023-10-09 06:42:22,632][60144] Updated weights for policy 1, policy_version 62462 (0.0007) +[2023-10-09 06:42:22,946][60143] Updated weights for policy 0, policy_version 61742 (0.0009) +[2023-10-09 06:42:23,322][60143] Updated weights for policy 0, policy_version 61752 (0.0011) +[2023-10-09 06:42:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127205376. Throughput: 0: 1693.0, 1: 1727.7. Samples: 31811204. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-09 06:42:26,053][59242] Avg episode reward: [(0, '30.650'), (1, '30.140')] +[2023-10-09 06:42:26,484][60144] Updated weights for policy 1, policy_version 62472 (0.0008) +[2023-10-09 06:42:26,856][60144] Updated weights for policy 1, policy_version 62482 (0.0007) +[2023-10-09 06:42:27,226][60144] Updated weights for policy 1, policy_version 62492 (0.0007) +[2023-10-09 06:42:27,472][60143] Updated weights for policy 0, policy_version 61762 (0.0010) +[2023-10-09 06:42:27,843][60143] Updated weights for policy 0, policy_version 61772 (0.0008) +[2023-10-09 06:42:28,230][60143] Updated weights for policy 0, policy_version 61782 (0.0010) +[2023-10-09 06:42:28,602][60143] Updated weights for policy 0, policy_version 61792 (0.0009) +[2023-10-09 06:42:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 127270912. Throughput: 0: 1707.4, 1: 1742.6. Samples: 31832498. 
Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:31,052][59242] Avg episode reward: [(0, '30.210'), (1, '31.710')] +[2023-10-09 06:42:31,101][60144] Updated weights for policy 1, policy_version 62502 (0.0007) +[2023-10-09 06:42:31,467][60144] Updated weights for policy 1, policy_version 62512 (0.0008) +[2023-10-09 06:42:31,832][60144] Updated weights for policy 1, policy_version 62522 (0.0007) +[2023-10-09 06:42:32,591][60143] Updated weights for policy 0, policy_version 61802 (0.0008) +[2023-10-09 06:42:32,965][60143] Updated weights for policy 0, policy_version 61812 (0.0009) +[2023-10-09 06:42:33,334][60143] Updated weights for policy 0, policy_version 61822 (0.0009) +[2023-10-09 06:42:35,801][60144] Updated weights for policy 1, policy_version 62532 (0.0007) +[2023-10-09 06:42:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 127336448. Throughput: 0: 1680.1, 1: 1721.2. Samples: 31841860. Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:36,053][59242] Avg episode reward: [(0, '29.220'), (1, '31.570')] +[2023-10-09 06:42:36,164][60144] Updated weights for policy 1, policy_version 62542 (0.0007) +[2023-10-09 06:42:36,529][60144] Updated weights for policy 1, policy_version 62552 (0.0008) +[2023-10-09 06:42:37,180][60143] Updated weights for policy 0, policy_version 61832 (0.0011) +[2023-10-09 06:42:37,557][60143] Updated weights for policy 0, policy_version 61842 (0.0007) +[2023-10-09 06:42:37,933][60143] Updated weights for policy 0, policy_version 61852 (0.0009) +[2023-10-09 06:42:40,356][60144] Updated weights for policy 1, policy_version 62562 (0.0008) +[2023-10-09 06:42:40,732][60144] Updated weights for policy 1, policy_version 62572 (0.0009) +[2023-10-09 06:42:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 127401984. Throughput: 0: 1701.9, 1: 1739.2. Samples: 31863008. 
Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:41,053][59242] Avg episode reward: [(0, '29.180'), (1, '30.550')] +[2023-10-09 06:42:41,100][60144] Updated weights for policy 1, policy_version 62582 (0.0009) +[2023-10-09 06:42:41,463][60144] Updated weights for policy 1, policy_version 62592 (0.0011) +[2023-10-09 06:42:41,881][60143] Updated weights for policy 0, policy_version 61862 (0.0009) +[2023-10-09 06:42:42,257][60143] Updated weights for policy 0, policy_version 61872 (0.0008) +[2023-10-09 06:42:42,628][60143] Updated weights for policy 0, policy_version 61882 (0.0008) +[2023-10-09 06:42:45,416][60144] Updated weights for policy 1, policy_version 62602 (0.0010) +[2023-10-09 06:42:45,788][60144] Updated weights for policy 1, policy_version 62612 (0.0010) +[2023-10-09 06:42:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 127467520. Throughput: 0: 1708.4, 1: 1726.6. Samples: 31883938. Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:46,053][59242] Avg episode reward: [(0, '29.980'), (1, '30.620')] +[2023-10-09 06:42:46,152][60144] Updated weights for policy 1, policy_version 62622 (0.0008) +[2023-10-09 06:42:46,583][60143] Updated weights for policy 0, policy_version 61892 (0.0010) +[2023-10-09 06:42:46,955][60143] Updated weights for policy 0, policy_version 61902 (0.0010) +[2023-10-09 06:42:47,333][60143] Updated weights for policy 0, policy_version 61912 (0.0009) +[2023-10-09 06:42:50,246][60144] Updated weights for policy 1, policy_version 62632 (0.0010) +[2023-10-09 06:42:50,607][60144] Updated weights for policy 1, policy_version 62642 (0.0008) +[2023-10-09 06:42:50,975][60144] Updated weights for policy 1, policy_version 62652 (0.0009) +[2023-10-09 06:42:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 127533056. Throughput: 0: 1687.6, 1: 1738.9. Samples: 31893692. 
Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:51,052][59242] Avg episode reward: [(0, '30.310'), (1, '30.310')] +[2023-10-09 06:42:51,315][60143] Updated weights for policy 0, policy_version 61922 (0.0009) +[2023-10-09 06:42:51,686][60143] Updated weights for policy 0, policy_version 61932 (0.0008) +[2023-10-09 06:42:52,056][60143] Updated weights for policy 0, policy_version 61942 (0.0007) +[2023-10-09 06:42:52,422][60143] Updated weights for policy 0, policy_version 61952 (0.0007) +[2023-10-09 06:42:54,889][60144] Updated weights for policy 1, policy_version 62662 (0.0008) +[2023-10-09 06:42:55,251][60144] Updated weights for policy 1, policy_version 62672 (0.0008) +[2023-10-09 06:42:55,609][60144] Updated weights for policy 1, policy_version 62682 (0.0009) +[2023-10-09 06:42:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127631360. Throughput: 0: 1705.5, 1: 1738.7. Samples: 31914706. Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:42:56,053][59242] Avg episode reward: [(0, '30.990'), (1, '31.130')] +[2023-10-09 06:42:56,385][60143] Updated weights for policy 0, policy_version 61962 (0.0010) +[2023-10-09 06:42:56,760][60143] Updated weights for policy 0, policy_version 61972 (0.0009) +[2023-10-09 06:42:57,122][60143] Updated weights for policy 0, policy_version 61982 (0.0010) +[2023-10-09 06:42:59,670][60144] Updated weights for policy 1, policy_version 62692 (0.0010) +[2023-10-09 06:43:00,035][60144] Updated weights for policy 1, policy_version 62702 (0.0010) +[2023-10-09 06:43:00,406][60144] Updated weights for policy 1, policy_version 62712 (0.0010) +[2023-10-09 06:43:01,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127696896. Throughput: 0: 1707.5, 1: 1711.2. Samples: 31934854. 
Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:43:01,053][59242] Avg episode reward: [(0, '31.130'), (1, '31.650')] +[2023-10-09 06:43:01,277][60143] Updated weights for policy 0, policy_version 61992 (0.0010) +[2023-10-09 06:43:01,655][60143] Updated weights for policy 0, policy_version 62002 (0.0009) +[2023-10-09 06:43:02,038][60143] Updated weights for policy 0, policy_version 62012 (0.0011) +[2023-10-09 06:43:04,512][60144] Updated weights for policy 1, policy_version 62722 (0.0010) +[2023-10-09 06:43:04,941][60144] Updated weights for policy 1, policy_version 62732 (0.0009) +[2023-10-09 06:43:05,298][60144] Updated weights for policy 1, policy_version 62742 (0.0008) +[2023-10-09 06:43:05,666][60144] Updated weights for policy 1, policy_version 62752 (0.0007) +[2023-10-09 06:43:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127762432. Throughput: 0: 1694.0, 1: 1738.7. Samples: 31945034. Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:43:06,053][59242] Avg episode reward: [(0, '32.210'), (1, '30.970')] +[2023-10-09 06:43:06,175][60143] Updated weights for policy 0, policy_version 62022 (0.0011) +[2023-10-09 06:43:06,548][60143] Updated weights for policy 0, policy_version 62032 (0.0010) +[2023-10-09 06:43:06,926][60143] Updated weights for policy 0, policy_version 62042 (0.0008) +[2023-10-09 06:43:09,463][60144] Updated weights for policy 1, policy_version 62762 (0.0007) +[2023-10-09 06:43:09,832][60144] Updated weights for policy 1, policy_version 62772 (0.0007) +[2023-10-09 06:43:10,199][60144] Updated weights for policy 1, policy_version 62782 (0.0009) +[2023-10-09 06:43:10,784][60143] Updated weights for policy 0, policy_version 62052 (0.0008) +[2023-10-09 06:43:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127827968. Throughput: 0: 1709.0, 1: 1723.9. Samples: 31965684. 
Policy #0 lag: (min: 26.0, avg: 46.4, max: 48.0) +[2023-10-09 06:43:11,053][59242] Avg episode reward: [(0, '32.030'), (1, '29.520')] +[2023-10-09 06:43:11,146][60143] Updated weights for policy 0, policy_version 62062 (0.0010) +[2023-10-09 06:43:11,524][60143] Updated weights for policy 0, policy_version 62072 (0.0008) +[2023-10-09 06:43:13,948][60144] Updated weights for policy 1, policy_version 62792 (0.0008) +[2023-10-09 06:43:14,310][60144] Updated weights for policy 1, policy_version 62802 (0.0007) +[2023-10-09 06:43:14,675][60144] Updated weights for policy 1, policy_version 62812 (0.0008) +[2023-10-09 06:43:15,413][60143] Updated weights for policy 0, policy_version 62082 (0.0007) +[2023-10-09 06:43:15,788][60143] Updated weights for policy 0, policy_version 62092 (0.0007) +[2023-10-09 06:43:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 127893504. Throughput: 0: 1711.1, 1: 1703.9. Samples: 31986170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:16,053][59242] Avg episode reward: [(0, '32.970'), (1, '30.570')] +[2023-10-09 06:43:16,147][60143] Updated weights for policy 0, policy_version 62102 (0.0008) +[2023-10-09 06:43:16,517][60143] Updated weights for policy 0, policy_version 62112 (0.0010) +[2023-10-09 06:43:18,564][60144] Updated weights for policy 1, policy_version 62822 (0.0008) +[2023-10-09 06:43:18,937][60144] Updated weights for policy 1, policy_version 62832 (0.0007) +[2023-10-09 06:43:19,310][60144] Updated weights for policy 1, policy_version 62842 (0.0009) +[2023-10-09 06:43:20,564][60143] Updated weights for policy 0, policy_version 62122 (0.0007) +[2023-10-09 06:43:20,929][60143] Updated weights for policy 0, policy_version 62132 (0.0009) +[2023-10-09 06:43:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 127959040. Throughput: 0: 1712.8, 1: 1729.5. Samples: 31996764. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:21,053][59242] Avg episode reward: [(0, '31.300'), (1, '31.650')] +[2023-10-09 06:43:21,298][60143] Updated weights for policy 0, policy_version 62142 (0.0010) +[2023-10-09 06:43:23,311][60144] Updated weights for policy 1, policy_version 62852 (0.0009) +[2023-10-09 06:43:23,679][60144] Updated weights for policy 1, policy_version 62862 (0.0008) +[2023-10-09 06:43:24,047][60144] Updated weights for policy 1, policy_version 62872 (0.0008) +[2023-10-09 06:43:25,253][60143] Updated weights for policy 0, policy_version 62152 (0.0009) +[2023-10-09 06:43:25,617][60143] Updated weights for policy 0, policy_version 62162 (0.0011) +[2023-10-09 06:43:25,985][60143] Updated weights for policy 0, policy_version 62172 (0.0009) +[2023-10-09 06:43:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 128024576. Throughput: 0: 1717.0, 1: 1704.1. Samples: 32016958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:26,053][59242] Avg episode reward: [(0, '30.900'), (1, '33.720')] +[2023-10-09 06:43:27,927][60144] Updated weights for policy 1, policy_version 62882 (0.0008) +[2023-10-09 06:43:28,302][60144] Updated weights for policy 1, policy_version 62892 (0.0009) +[2023-10-09 06:43:28,667][60144] Updated weights for policy 1, policy_version 62902 (0.0009) +[2023-10-09 06:43:29,031][60144] Updated weights for policy 1, policy_version 62912 (0.0010) +[2023-10-09 06:43:30,020][60143] Updated weights for policy 0, policy_version 62182 (0.0010) +[2023-10-09 06:43:30,383][60143] Updated weights for policy 0, policy_version 62192 (0.0011) +[2023-10-09 06:43:30,762][60143] Updated weights for policy 0, policy_version 62202 (0.0010) +[2023-10-09 06:43:31,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 128122880. Throughput: 0: 1696.1, 1: 1717.1. Samples: 32037534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:31,053][59242] Avg episode reward: [(0, '30.550'), (1, '33.710')] +[2023-10-09 06:43:31,066][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000062912_64421888.pth... +[2023-10-09 06:43:31,066][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000062208_63700992.pth... +[2023-10-09 06:43:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000061312_62783488.pth +[2023-10-09 06:43:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000060608_62062592.pth +[2023-10-09 06:43:32,977][60144] Updated weights for policy 1, policy_version 62922 (0.0008) +[2023-10-09 06:43:33,347][60144] Updated weights for policy 1, policy_version 62932 (0.0008) +[2023-10-09 06:43:33,726][60144] Updated weights for policy 1, policy_version 62942 (0.0008) +[2023-10-09 06:43:34,792][60143] Updated weights for policy 0, policy_version 62212 (0.0007) +[2023-10-09 06:43:35,167][60143] Updated weights for policy 0, policy_version 62222 (0.0009) +[2023-10-09 06:43:35,537][60143] Updated weights for policy 0, policy_version 62232 (0.0009) +[2023-10-09 06:43:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 128188416. Throughput: 0: 1717.5, 1: 1708.8. Samples: 32047876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:36,053][59242] Avg episode reward: [(0, '30.280'), (1, '33.510')] +[2023-10-09 06:43:37,708][60144] Updated weights for policy 1, policy_version 62952 (0.0008) +[2023-10-09 06:43:38,072][60144] Updated weights for policy 1, policy_version 62962 (0.0008) +[2023-10-09 06:43:38,438][60144] Updated weights for policy 1, policy_version 62972 (0.0009) +[2023-10-09 06:43:39,630][60143] Updated weights for policy 0, policy_version 62242 (0.0008) +[2023-10-09 06:43:40,003][60143] Updated weights for policy 0, policy_version 62252 (0.0009) +[2023-10-09 06:43:40,378][60143] Updated weights for policy 0, policy_version 62262 (0.0009) +[2023-10-09 06:43:40,743][60143] Updated weights for policy 0, policy_version 62272 (0.0010) +[2023-10-09 06:43:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 128253952. Throughput: 0: 1715.1, 1: 1707.3. Samples: 32068718. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:41,053][59242] Avg episode reward: [(0, '30.840'), (1, '33.090')] +[2023-10-09 06:43:42,556][60144] Updated weights for policy 1, policy_version 62982 (0.0009) +[2023-10-09 06:43:42,922][60144] Updated weights for policy 1, policy_version 62992 (0.0007) +[2023-10-09 06:43:43,292][60144] Updated weights for policy 1, policy_version 63002 (0.0008) +[2023-10-09 06:43:44,808][60143] Updated weights for policy 0, policy_version 62282 (0.0009) +[2023-10-09 06:43:45,175][60143] Updated weights for policy 0, policy_version 62292 (0.0009) +[2023-10-09 06:43:45,545][60143] Updated weights for policy 0, policy_version 62302 (0.0009) +[2023-10-09 06:43:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 128319488. Throughput: 0: 1682.8, 1: 1735.1. Samples: 32088658. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:46,053][59242] Avg episode reward: [(0, '30.580'), (1, '32.740')] +[2023-10-09 06:43:47,022][60144] Updated weights for policy 1, policy_version 63012 (0.0009) +[2023-10-09 06:43:47,389][60144] Updated weights for policy 1, policy_version 63022 (0.0007) +[2023-10-09 06:43:47,746][60144] Updated weights for policy 1, policy_version 63032 (0.0008) +[2023-10-09 06:43:49,450][60143] Updated weights for policy 0, policy_version 62312 (0.0008) +[2023-10-09 06:43:49,817][60143] Updated weights for policy 0, policy_version 62322 (0.0007) +[2023-10-09 06:43:50,181][60143] Updated weights for policy 0, policy_version 62332 (0.0009) +[2023-10-09 06:43:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 128385024. Throughput: 0: 1717.0, 1: 1708.8. Samples: 32099192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-09 06:43:51,053][59242] Avg episode reward: [(0, '30.210'), (1, '31.320')] +[2023-10-09 06:43:51,666][60144] Updated weights for policy 1, policy_version 63042 (0.0011) +[2023-10-09 06:43:52,056][60144] Updated weights for policy 1, policy_version 63052 (0.0008) +[2023-10-09 06:43:52,425][60144] Updated weights for policy 1, policy_version 63062 (0.0009) +[2023-10-09 06:43:52,791][60144] Updated weights for policy 1, policy_version 63072 (0.0008) +[2023-10-09 06:43:54,312][60143] Updated weights for policy 0, policy_version 62342 (0.0007) +[2023-10-09 06:43:54,703][60143] Updated weights for policy 0, policy_version 62352 (0.0008) +[2023-10-09 06:43:55,077][60143] Updated weights for policy 0, policy_version 62362 (0.0007) +[2023-10-09 06:43:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 128450560. Throughput: 0: 1704.1, 1: 1723.4. Samples: 32119922. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:43:56,053][59242] Avg episode reward: [(0, '30.610'), (1, '31.790')] +[2023-10-09 06:43:56,575][60144] Updated weights for policy 1, policy_version 63082 (0.0008) +[2023-10-09 06:43:56,939][60144] Updated weights for policy 1, policy_version 63092 (0.0007) +[2023-10-09 06:43:57,301][60144] Updated weights for policy 1, policy_version 63102 (0.0007) +[2023-10-09 06:43:58,962][60143] Updated weights for policy 0, policy_version 62372 (0.0007) +[2023-10-09 06:43:59,330][60143] Updated weights for policy 0, policy_version 62382 (0.0007) +[2023-10-09 06:43:59,705][60143] Updated weights for policy 0, policy_version 62392 (0.0008) +[2023-10-09 06:44:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 128516096. Throughput: 0: 1688.7, 1: 1739.7. Samples: 32140446. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:01,052][59242] Avg episode reward: [(0, '31.800'), (1, '30.270')] +[2023-10-09 06:44:01,383][60144] Updated weights for policy 1, policy_version 63112 (0.0008) +[2023-10-09 06:44:01,748][60144] Updated weights for policy 1, policy_version 63122 (0.0009) +[2023-10-09 06:44:02,105][60144] Updated weights for policy 1, policy_version 63132 (0.0008) +[2023-10-09 06:44:03,460][60143] Updated weights for policy 0, policy_version 62402 (0.0008) +[2023-10-09 06:44:03,835][60143] Updated weights for policy 0, policy_version 62412 (0.0009) +[2023-10-09 06:44:04,204][60143] Updated weights for policy 0, policy_version 62422 (0.0008) +[2023-10-09 06:44:04,577][60143] Updated weights for policy 0, policy_version 62432 (0.0007) +[2023-10-09 06:44:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 128581632. Throughput: 0: 1714.0, 1: 1715.1. Samples: 32151070. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:06,053][59242] Avg episode reward: [(0, '32.990'), (1, '30.440')] +[2023-10-09 06:44:06,082][60144] Updated weights for policy 1, policy_version 63142 (0.0009) +[2023-10-09 06:44:06,454][60144] Updated weights for policy 1, policy_version 63152 (0.0009) +[2023-10-09 06:44:06,828][60144] Updated weights for policy 1, policy_version 63162 (0.0009) +[2023-10-09 06:44:08,473][60143] Updated weights for policy 0, policy_version 62442 (0.0008) +[2023-10-09 06:44:08,835][60143] Updated weights for policy 0, policy_version 62452 (0.0010) +[2023-10-09 06:44:09,212][60143] Updated weights for policy 0, policy_version 62462 (0.0007) +[2023-10-09 06:44:10,889][60144] Updated weights for policy 1, policy_version 63172 (0.0009) +[2023-10-09 06:44:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 128647168. Throughput: 0: 1688.4, 1: 1743.2. Samples: 32171382. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:11,053][59242] Avg episode reward: [(0, '33.910'), (1, '32.440')] +[2023-10-09 06:44:11,252][60144] Updated weights for policy 1, policy_version 63182 (0.0008) +[2023-10-09 06:44:11,626][60144] Updated weights for policy 1, policy_version 63192 (0.0008) +[2023-10-09 06:44:13,086][60143] Updated weights for policy 0, policy_version 62472 (0.0007) +[2023-10-09 06:44:13,460][60143] Updated weights for policy 0, policy_version 62482 (0.0009) +[2023-10-09 06:44:13,832][60143] Updated weights for policy 0, policy_version 62492 (0.0009) +[2023-10-09 06:44:15,417][60144] Updated weights for policy 1, policy_version 63202 (0.0011) +[2023-10-09 06:44:15,785][60144] Updated weights for policy 1, policy_version 63212 (0.0007) +[2023-10-09 06:44:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 128712704. Throughput: 0: 1705.2, 1: 1740.8. Samples: 32192604. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:16,053][59242] Avg episode reward: [(0, '32.580'), (1, '32.340')] +[2023-10-09 06:44:16,143][60144] Updated weights for policy 1, policy_version 63222 (0.0007) +[2023-10-09 06:44:16,512][60144] Updated weights for policy 1, policy_version 63232 (0.0007) +[2023-10-09 06:44:17,854][60143] Updated weights for policy 0, policy_version 62502 (0.0011) +[2023-10-09 06:44:18,223][60143] Updated weights for policy 0, policy_version 62512 (0.0011) +[2023-10-09 06:44:18,601][60143] Updated weights for policy 0, policy_version 62522 (0.0011) +[2023-10-09 06:44:20,494][60144] Updated weights for policy 1, policy_version 63242 (0.0010) +[2023-10-09 06:44:20,865][60144] Updated weights for policy 1, policy_version 63252 (0.0008) +[2023-10-09 06:44:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128778240. Throughput: 0: 1699.0, 1: 1739.3. Samples: 32202602. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:21,053][59242] Avg episode reward: [(0, '33.280'), (1, '30.780')] +[2023-10-09 06:44:21,232][60144] Updated weights for policy 1, policy_version 63262 (0.0009) +[2023-10-09 06:44:22,637][60143] Updated weights for policy 0, policy_version 62532 (0.0009) +[2023-10-09 06:44:23,011][60143] Updated weights for policy 0, policy_version 62542 (0.0009) +[2023-10-09 06:44:23,398][60143] Updated weights for policy 0, policy_version 62552 (0.0009) +[2023-10-09 06:44:25,162][60144] Updated weights for policy 1, policy_version 63272 (0.0009) +[2023-10-09 06:44:25,532][60144] Updated weights for policy 1, policy_version 63282 (0.0009) +[2023-10-09 06:44:25,895][60144] Updated weights for policy 1, policy_version 63292 (0.0007) +[2023-10-09 06:44:26,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 128876544. Throughput: 0: 1687.2, 1: 1747.2. Samples: 32223262. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:26,053][59242] Avg episode reward: [(0, '33.090'), (1, '30.470')] +[2023-10-09 06:44:27,509][60143] Updated weights for policy 0, policy_version 62562 (0.0010) +[2023-10-09 06:44:27,886][60143] Updated weights for policy 0, policy_version 62572 (0.0009) +[2023-10-09 06:44:28,258][60143] Updated weights for policy 0, policy_version 62582 (0.0008) +[2023-10-09 06:44:28,629][60143] Updated weights for policy 0, policy_version 62592 (0.0008) +[2023-10-09 06:44:29,828][60144] Updated weights for policy 1, policy_version 63302 (0.0009) +[2023-10-09 06:44:30,201][60144] Updated weights for policy 1, policy_version 63312 (0.0010) +[2023-10-09 06:44:30,555][60144] Updated weights for policy 1, policy_version 63322 (0.0008) +[2023-10-09 06:44:31,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 128942080. Throughput: 0: 1719.0, 1: 1724.7. Samples: 32243622. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:31,053][59242] Avg episode reward: [(0, '32.420'), (1, '30.490')] +[2023-10-09 06:44:32,500][60143] Updated weights for policy 0, policy_version 62602 (0.0007) +[2023-10-09 06:44:32,866][60143] Updated weights for policy 0, policy_version 62612 (0.0008) +[2023-10-09 06:44:33,234][60143] Updated weights for policy 0, policy_version 62622 (0.0009) +[2023-10-09 06:44:34,384][60144] Updated weights for policy 1, policy_version 63332 (0.0009) +[2023-10-09 06:44:34,755][60144] Updated weights for policy 1, policy_version 63342 (0.0009) +[2023-10-09 06:44:35,118][60144] Updated weights for policy 1, policy_version 63352 (0.0008) +[2023-10-09 06:44:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129007616. Throughput: 0: 1690.1, 1: 1753.4. Samples: 32254150. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-09 06:44:36,053][59242] Avg episode reward: [(0, '33.500'), (1, '30.010')] +[2023-10-09 06:44:37,202][60143] Updated weights for policy 0, policy_version 62632 (0.0007) +[2023-10-09 06:44:37,585][60143] Updated weights for policy 0, policy_version 62642 (0.0008) +[2023-10-09 06:44:37,963][60143] Updated weights for policy 0, policy_version 62652 (0.0008) +[2023-10-09 06:44:39,044][60144] Updated weights for policy 1, policy_version 63362 (0.0008) +[2023-10-09 06:44:39,445][60144] Updated weights for policy 1, policy_version 63372 (0.0007) +[2023-10-09 06:44:39,814][60144] Updated weights for policy 1, policy_version 63382 (0.0007) +[2023-10-09 06:44:40,190][60144] Updated weights for policy 1, policy_version 63392 (0.0009) +[2023-10-09 06:44:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 129073152. Throughput: 0: 1707.9, 1: 1736.7. Samples: 32274930. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:44:41,053][59242] Avg episode reward: [(0, '32.590'), (1, '28.360')] +[2023-10-09 06:44:41,936][60143] Updated weights for policy 0, policy_version 62662 (0.0009) +[2023-10-09 06:44:42,311][60143] Updated weights for policy 0, policy_version 62672 (0.0010) +[2023-10-09 06:44:42,675][60143] Updated weights for policy 0, policy_version 62682 (0.0011) +[2023-10-09 06:44:44,125][60144] Updated weights for policy 1, policy_version 63402 (0.0010) +[2023-10-09 06:44:44,487][60144] Updated weights for policy 1, policy_version 63412 (0.0010) +[2023-10-09 06:44:44,841][60144] Updated weights for policy 1, policy_version 63422 (0.0010) +[2023-10-09 06:44:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129138688. Throughput: 0: 1726.7, 1: 1720.4. Samples: 32295562. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:44:46,053][59242] Avg episode reward: [(0, '31.060'), (1, '28.900')] +[2023-10-09 06:44:46,581][60143] Updated weights for policy 0, policy_version 62692 (0.0010) +[2023-10-09 06:44:46,956][60143] Updated weights for policy 0, policy_version 62702 (0.0011) +[2023-10-09 06:44:47,318][60143] Updated weights for policy 0, policy_version 62712 (0.0008) +[2023-10-09 06:44:48,854][60144] Updated weights for policy 1, policy_version 63432 (0.0007) +[2023-10-09 06:44:49,210][60144] Updated weights for policy 1, policy_version 63442 (0.0007) +[2023-10-09 06:44:49,582][60144] Updated weights for policy 1, policy_version 63452 (0.0008) +[2023-10-09 06:44:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 129204224. Throughput: 0: 1699.1, 1: 1745.3. Samples: 32306066. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:44:51,053][59242] Avg episode reward: [(0, '31.920'), (1, '29.380')] +[2023-10-09 06:44:51,377][60143] Updated weights for policy 0, policy_version 62722 (0.0010) +[2023-10-09 06:44:51,740][60143] Updated weights for policy 0, policy_version 62732 (0.0009) +[2023-10-09 06:44:52,110][60143] Updated weights for policy 0, policy_version 62742 (0.0011) +[2023-10-09 06:44:52,486][60143] Updated weights for policy 0, policy_version 62752 (0.0011) +[2023-10-09 06:44:53,438][60144] Updated weights for policy 1, policy_version 63462 (0.0007) +[2023-10-09 06:44:53,793][60144] Updated weights for policy 1, policy_version 63472 (0.0008) +[2023-10-09 06:44:54,158][60144] Updated weights for policy 1, policy_version 63482 (0.0009) +[2023-10-09 06:44:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129269760. Throughput: 0: 1729.6, 1: 1713.3. Samples: 32326310. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:44:56,052][59242] Avg episode reward: [(0, '32.140'), (1, '29.440')] +[2023-10-09 06:44:56,214][60143] Updated weights for policy 0, policy_version 62762 (0.0008) +[2023-10-09 06:44:56,575][60143] Updated weights for policy 0, policy_version 62772 (0.0010) +[2023-10-09 06:44:56,940][60143] Updated weights for policy 0, policy_version 62782 (0.0009) +[2023-10-09 06:44:58,080][60144] Updated weights for policy 1, policy_version 63492 (0.0011) +[2023-10-09 06:44:58,463][60144] Updated weights for policy 1, policy_version 63502 (0.0009) +[2023-10-09 06:44:58,827][60144] Updated weights for policy 1, policy_version 63512 (0.0008) +[2023-10-09 06:45:00,914][60143] Updated weights for policy 0, policy_version 62792 (0.0009) +[2023-10-09 06:45:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129335296. Throughput: 0: 1734.8, 1: 1717.8. Samples: 32347972. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:45:01,053][59242] Avg episode reward: [(0, '33.560'), (1, '29.180')] +[2023-10-09 06:45:01,277][60143] Updated weights for policy 0, policy_version 62802 (0.0007) +[2023-10-09 06:45:01,649][60143] Updated weights for policy 0, policy_version 62812 (0.0007) +[2023-10-09 06:45:02,749][60144] Updated weights for policy 1, policy_version 63522 (0.0007) +[2023-10-09 06:45:03,109][60144] Updated weights for policy 1, policy_version 63532 (0.0008) +[2023-10-09 06:45:03,468][60144] Updated weights for policy 1, policy_version 63542 (0.0008) +[2023-10-09 06:45:03,835][60144] Updated weights for policy 1, policy_version 63552 (0.0008) +[2023-10-09 06:45:05,533][60143] Updated weights for policy 0, policy_version 62822 (0.0009) +[2023-10-09 06:45:05,915][60143] Updated weights for policy 0, policy_version 62832 (0.0011) +[2023-10-09 06:45:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129400832. 
Throughput: 0: 1724.7, 1: 1719.6. Samples: 32357598. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:45:06,053][59242] Avg episode reward: [(0, '34.880'), (1, '28.410')] +[2023-10-09 06:45:06,284][60143] Updated weights for policy 0, policy_version 62842 (0.0007) +[2023-10-09 06:45:07,904][60144] Updated weights for policy 1, policy_version 63562 (0.0008) +[2023-10-09 06:45:08,275][60144] Updated weights for policy 1, policy_version 63572 (0.0008) +[2023-10-09 06:45:08,634][60144] Updated weights for policy 1, policy_version 63582 (0.0009) +[2023-10-09 06:45:10,229][60143] Updated weights for policy 0, policy_version 62852 (0.0008) +[2023-10-09 06:45:10,609][60143] Updated weights for policy 0, policy_version 62862 (0.0009) +[2023-10-09 06:45:10,981][60143] Updated weights for policy 0, policy_version 62872 (0.0010) +[2023-10-09 06:45:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 129466368. Throughput: 0: 1742.5, 1: 1709.6. Samples: 32378606. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:45:11,053][59242] Avg episode reward: [(0, '34.370'), (1, '29.350')] +[2023-10-09 06:45:12,636][60144] Updated weights for policy 1, policy_version 63592 (0.0007) +[2023-10-09 06:45:13,000][60144] Updated weights for policy 1, policy_version 63602 (0.0007) +[2023-10-09 06:45:13,368][60144] Updated weights for policy 1, policy_version 63612 (0.0007) +[2023-10-09 06:45:14,883][60143] Updated weights for policy 0, policy_version 62882 (0.0008) +[2023-10-09 06:45:15,252][60143] Updated weights for policy 0, policy_version 62892 (0.0010) +[2023-10-09 06:45:15,614][60143] Updated weights for policy 0, policy_version 62902 (0.0010) +[2023-10-09 06:45:15,986][60143] Updated weights for policy 0, policy_version 62912 (0.0009) +[2023-10-09 06:45:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 129564672. Throughput: 0: 1727.7, 1: 1735.9. Samples: 32399482. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:45:16,053][59242] Avg episode reward: [(0, '34.770'), (1, '28.990')] +[2023-10-09 06:45:17,137][60144] Updated weights for policy 1, policy_version 63622 (0.0007) +[2023-10-09 06:45:17,511][60144] Updated weights for policy 1, policy_version 63632 (0.0007) +[2023-10-09 06:45:17,868][60144] Updated weights for policy 1, policy_version 63642 (0.0009) +[2023-10-09 06:45:19,997][60143] Updated weights for policy 0, policy_version 62922 (0.0009) +[2023-10-09 06:45:20,366][60143] Updated weights for policy 0, policy_version 62932 (0.0008) +[2023-10-09 06:45:20,741][60143] Updated weights for policy 0, policy_version 62942 (0.0009) +[2023-10-09 06:45:21,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 129630208. Throughput: 0: 1743.8, 1: 1707.6. Samples: 32409466. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:21,053][59242] Avg episode reward: [(0, '34.990'), (1, '29.990')] +[2023-10-09 06:45:21,825][60144] Updated weights for policy 1, policy_version 63652 (0.0009) +[2023-10-09 06:45:22,190][60144] Updated weights for policy 1, policy_version 63662 (0.0009) +[2023-10-09 06:45:22,561][60144] Updated weights for policy 1, policy_version 63672 (0.0007) +[2023-10-09 06:45:24,511][60143] Updated weights for policy 0, policy_version 62952 (0.0007) +[2023-10-09 06:45:24,875][60143] Updated weights for policy 0, policy_version 62962 (0.0009) +[2023-10-09 06:45:25,239][60143] Updated weights for policy 0, policy_version 62972 (0.0011) +[2023-10-09 06:45:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 129695744. Throughput: 0: 1734.4, 1: 1725.7. Samples: 32430638. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:26,053][59242] Avg episode reward: [(0, '35.190'), (1, '31.470')] +[2023-10-09 06:45:26,628][60144] Updated weights for policy 1, policy_version 63682 (0.0007) +[2023-10-09 06:45:27,035][60144] Updated weights for policy 1, policy_version 63692 (0.0007) +[2023-10-09 06:45:27,409][60144] Updated weights for policy 1, policy_version 63702 (0.0008) +[2023-10-09 06:45:27,776][60144] Updated weights for policy 1, policy_version 63712 (0.0008) +[2023-10-09 06:45:29,398][60143] Updated weights for policy 0, policy_version 62982 (0.0010) +[2023-10-09 06:45:29,785][60143] Updated weights for policy 0, policy_version 62992 (0.0008) +[2023-10-09 06:45:30,156][60143] Updated weights for policy 0, policy_version 63002 (0.0008) +[2023-10-09 06:45:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 129761280. Throughput: 0: 1705.9, 1: 1740.7. Samples: 32450656. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:31,053][59242] Avg episode reward: [(0, '34.010'), (1, '28.960')] +[2023-10-09 06:45:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000063712_65241088.pth... +[2023-10-09 06:45:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000063008_64520192.pth... 
+[2023-10-09 06:45:31,091][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000062112_63602688.pth +[2023-10-09 06:45:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000061408_62881792.pth +[2023-10-09 06:45:31,642][60144] Updated weights for policy 1, policy_version 63722 (0.0009) +[2023-10-09 06:45:32,004][60144] Updated weights for policy 1, policy_version 63732 (0.0009) +[2023-10-09 06:45:32,378][60144] Updated weights for policy 1, policy_version 63742 (0.0008) +[2023-10-09 06:45:34,270][60143] Updated weights for policy 0, policy_version 63012 (0.0008) +[2023-10-09 06:45:34,644][60143] Updated weights for policy 0, policy_version 63022 (0.0008) +[2023-10-09 06:45:35,016][60143] Updated weights for policy 0, policy_version 63032 (0.0009) +[2023-10-09 06:45:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 129826816. Throughput: 0: 1735.9, 1: 1710.4. Samples: 32461150. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:36,053][59242] Avg episode reward: [(0, '33.240'), (1, '30.070')] +[2023-10-09 06:45:36,333][60144] Updated weights for policy 1, policy_version 63752 (0.0010) +[2023-10-09 06:45:36,694][60144] Updated weights for policy 1, policy_version 63762 (0.0009) +[2023-10-09 06:45:37,067][60144] Updated weights for policy 1, policy_version 63772 (0.0009) +[2023-10-09 06:45:38,774][60143] Updated weights for policy 0, policy_version 63042 (0.0008) +[2023-10-09 06:45:39,138][60143] Updated weights for policy 0, policy_version 63052 (0.0009) +[2023-10-09 06:45:39,499][60143] Updated weights for policy 0, policy_version 63062 (0.0007) +[2023-10-09 06:45:39,872][60143] Updated weights for policy 0, policy_version 63072 (0.0007) +[2023-10-09 06:45:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 129892352. Throughput: 0: 1715.7, 1: 1738.0. Samples: 32481726. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:41,053][59242] Avg episode reward: [(0, '33.090'), (1, '29.400')] +[2023-10-09 06:45:41,151][60144] Updated weights for policy 1, policy_version 63782 (0.0008) +[2023-10-09 06:45:41,517][60144] Updated weights for policy 1, policy_version 63792 (0.0007) +[2023-10-09 06:45:41,880][60144] Updated weights for policy 1, policy_version 63802 (0.0008) +[2023-10-09 06:45:43,693][60143] Updated weights for policy 0, policy_version 63082 (0.0008) +[2023-10-09 06:45:44,061][60143] Updated weights for policy 0, policy_version 63092 (0.0008) +[2023-10-09 06:45:44,433][60143] Updated weights for policy 0, policy_version 63102 (0.0008) +[2023-10-09 06:45:45,668][60144] Updated weights for policy 1, policy_version 63812 (0.0007) +[2023-10-09 06:45:46,034][60144] Updated weights for policy 1, policy_version 63822 (0.0007) +[2023-10-09 06:45:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 129957888. Throughput: 0: 1705.3, 1: 1731.2. Samples: 32502610. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:46,053][59242] Avg episode reward: [(0, '33.540'), (1, '29.860')] +[2023-10-09 06:45:46,397][60144] Updated weights for policy 1, policy_version 63832 (0.0007) +[2023-10-09 06:45:48,286][60143] Updated weights for policy 0, policy_version 63112 (0.0008) +[2023-10-09 06:45:48,662][60143] Updated weights for policy 0, policy_version 63122 (0.0009) +[2023-10-09 06:45:49,028][60143] Updated weights for policy 0, policy_version 63132 (0.0010) +[2023-10-09 06:45:50,302][60144] Updated weights for policy 1, policy_version 63842 (0.0009) +[2023-10-09 06:45:50,678][60144] Updated weights for policy 1, policy_version 63852 (0.0007) +[2023-10-09 06:45:51,039][60144] Updated weights for policy 1, policy_version 63862 (0.0007) +[2023-10-09 06:45:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 130023424. 
Throughput: 0: 1723.1, 1: 1730.9. Samples: 32513028. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:51,053][59242] Avg episode reward: [(0, '32.610'), (1, '30.270')] +[2023-10-09 06:45:51,407][60144] Updated weights for policy 1, policy_version 63872 (0.0007) +[2023-10-09 06:45:53,063][60143] Updated weights for policy 0, policy_version 63142 (0.0007) +[2023-10-09 06:45:53,431][60143] Updated weights for policy 0, policy_version 63152 (0.0009) +[2023-10-09 06:45:53,797][60143] Updated weights for policy 0, policy_version 63162 (0.0009) +[2023-10-09 06:45:55,262][60144] Updated weights for policy 1, policy_version 63882 (0.0009) +[2023-10-09 06:45:55,635][60144] Updated weights for policy 1, policy_version 63892 (0.0010) +[2023-10-09 06:45:55,996][60144] Updated weights for policy 1, policy_version 63902 (0.0008) +[2023-10-09 06:45:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 130088960. Throughput: 0: 1701.5, 1: 1744.8. Samples: 32533686. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:45:56,052][59242] Avg episode reward: [(0, '33.030'), (1, '30.860')] +[2023-10-09 06:45:57,872][60143] Updated weights for policy 0, policy_version 63172 (0.0009) +[2023-10-09 06:45:58,238][60143] Updated weights for policy 0, policy_version 63182 (0.0009) +[2023-10-09 06:45:58,615][60143] Updated weights for policy 0, policy_version 63192 (0.0008) +[2023-10-09 06:45:59,900][60144] Updated weights for policy 1, policy_version 63912 (0.0009) +[2023-10-09 06:46:00,271][60144] Updated weights for policy 1, policy_version 63922 (0.0009) +[2023-10-09 06:46:00,640][60144] Updated weights for policy 1, policy_version 63932 (0.0007) +[2023-10-09 06:46:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 130187264. Throughput: 0: 1718.1, 1: 1715.0. Samples: 32553970. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-09 06:46:01,053][59242] Avg episode reward: [(0, '34.180'), (1, '31.820')] +[2023-10-09 06:46:02,580][60143] Updated weights for policy 0, policy_version 63202 (0.0008) +[2023-10-09 06:46:02,952][60143] Updated weights for policy 0, policy_version 63212 (0.0010) +[2023-10-09 06:46:03,325][60143] Updated weights for policy 0, policy_version 63222 (0.0008) +[2023-10-09 06:46:03,701][60143] Updated weights for policy 0, policy_version 63232 (0.0007) +[2023-10-09 06:46:04,568][60144] Updated weights for policy 1, policy_version 63942 (0.0010) +[2023-10-09 06:46:04,925][60144] Updated weights for policy 1, policy_version 63952 (0.0008) +[2023-10-09 06:46:05,294][60144] Updated weights for policy 1, policy_version 63962 (0.0007) +[2023-10-09 06:46:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 130252800. Throughput: 0: 1705.4, 1: 1742.3. Samples: 32564612. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:06,052][59242] Avg episode reward: [(0, '34.260'), (1, '31.600')] +[2023-10-09 06:46:07,708][60143] Updated weights for policy 0, policy_version 63242 (0.0011) +[2023-10-09 06:46:08,075][60143] Updated weights for policy 0, policy_version 63252 (0.0008) +[2023-10-09 06:46:08,448][60143] Updated weights for policy 0, policy_version 63262 (0.0008) +[2023-10-09 06:46:08,997][60144] Updated weights for policy 1, policy_version 63972 (0.0007) +[2023-10-09 06:46:09,351][60144] Updated weights for policy 1, policy_version 63982 (0.0007) +[2023-10-09 06:46:09,718][60144] Updated weights for policy 1, policy_version 63992 (0.0009) +[2023-10-09 06:46:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 130318336. Throughput: 0: 1704.4, 1: 1729.6. Samples: 32585172. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:11,053][59242] Avg episode reward: [(0, '35.090'), (1, '30.740')] +[2023-10-09 06:46:12,415][60143] Updated weights for policy 0, policy_version 63272 (0.0008) +[2023-10-09 06:46:12,786][60143] Updated weights for policy 0, policy_version 63282 (0.0007) +[2023-10-09 06:46:13,158][60143] Updated weights for policy 0, policy_version 63292 (0.0007) +[2023-10-09 06:46:13,793][60144] Updated weights for policy 1, policy_version 64002 (0.0008) +[2023-10-09 06:46:14,197][60144] Updated weights for policy 1, policy_version 64012 (0.0009) +[2023-10-09 06:46:14,578][60144] Updated weights for policy 1, policy_version 64022 (0.0008) +[2023-10-09 06:46:14,944][60144] Updated weights for policy 1, policy_version 64032 (0.0007) +[2023-10-09 06:46:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 130383872. Throughput: 0: 1731.8, 1: 1715.7. Samples: 32605792. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:16,053][59242] Avg episode reward: [(0, '36.230'), (1, '30.690')] +[2023-10-09 06:46:16,064][59934] Saving new best policy, reward=36.230! +[2023-10-09 06:46:17,129][60143] Updated weights for policy 0, policy_version 63302 (0.0007) +[2023-10-09 06:46:17,510][60143] Updated weights for policy 0, policy_version 63312 (0.0009) +[2023-10-09 06:46:17,888][60143] Updated weights for policy 0, policy_version 63322 (0.0007) +[2023-10-09 06:46:18,786][60144] Updated weights for policy 1, policy_version 64042 (0.0009) +[2023-10-09 06:46:19,157][60144] Updated weights for policy 1, policy_version 64052 (0.0008) +[2023-10-09 06:46:19,523][60144] Updated weights for policy 1, policy_version 64062 (0.0008) +[2023-10-09 06:46:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 130449408. Throughput: 0: 1699.1, 1: 1745.3. Samples: 32616144. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:21,052][59242] Avg episode reward: [(0, '36.850'), (1, '29.870')] +[2023-10-09 06:46:21,053][59934] Saving new best policy, reward=36.850! +[2023-10-09 06:46:21,851][60143] Updated weights for policy 0, policy_version 63332 (0.0007) +[2023-10-09 06:46:22,216][60143] Updated weights for policy 0, policy_version 63342 (0.0009) +[2023-10-09 06:46:22,580][60143] Updated weights for policy 0, policy_version 63352 (0.0010) +[2023-10-09 06:46:23,324][60144] Updated weights for policy 1, policy_version 64072 (0.0009) +[2023-10-09 06:46:23,692][60144] Updated weights for policy 1, policy_version 64082 (0.0010) +[2023-10-09 06:46:24,059][60144] Updated weights for policy 1, policy_version 64092 (0.0007) +[2023-10-09 06:46:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130514944. Throughput: 0: 1715.8, 1: 1718.8. Samples: 32636282. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:26,053][59242] Avg episode reward: [(0, '36.710'), (1, '31.340')] +[2023-10-09 06:46:26,484][60143] Updated weights for policy 0, policy_version 63362 (0.0010) +[2023-10-09 06:46:26,847][60143] Updated weights for policy 0, policy_version 63372 (0.0007) +[2023-10-09 06:46:27,220][60143] Updated weights for policy 0, policy_version 63382 (0.0008) +[2023-10-09 06:46:27,592][60143] Updated weights for policy 0, policy_version 63392 (0.0007) +[2023-10-09 06:46:27,944][60144] Updated weights for policy 1, policy_version 64102 (0.0010) +[2023-10-09 06:46:28,319][60144] Updated weights for policy 1, policy_version 64112 (0.0009) +[2023-10-09 06:46:28,686][60144] Updated weights for policy 1, policy_version 64122 (0.0010) +[2023-10-09 06:46:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130580480. Throughput: 0: 1723.5, 1: 1723.0. Samples: 32657700. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:31,053][59242] Avg episode reward: [(0, '33.670'), (1, '31.620')] +[2023-10-09 06:46:31,525][60143] Updated weights for policy 0, policy_version 63402 (0.0007) +[2023-10-09 06:46:31,895][60143] Updated weights for policy 0, policy_version 63412 (0.0009) +[2023-10-09 06:46:32,268][60143] Updated weights for policy 0, policy_version 63422 (0.0007) +[2023-10-09 06:46:32,753][60144] Updated weights for policy 1, policy_version 64132 (0.0008) +[2023-10-09 06:46:33,122][60144] Updated weights for policy 1, policy_version 64142 (0.0009) +[2023-10-09 06:46:33,483][60144] Updated weights for policy 1, policy_version 64152 (0.0007) +[2023-10-09 06:46:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 130646016. Throughput: 0: 1704.1, 1: 1724.6. Samples: 32667320. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:36,052][59242] Avg episode reward: [(0, '34.530'), (1, '30.760')] +[2023-10-09 06:46:36,180][60143] Updated weights for policy 0, policy_version 63432 (0.0007) +[2023-10-09 06:46:36,549][60143] Updated weights for policy 0, policy_version 63442 (0.0008) +[2023-10-09 06:46:36,922][60143] Updated weights for policy 0, policy_version 63452 (0.0008) +[2023-10-09 06:46:37,412][60144] Updated weights for policy 1, policy_version 64162 (0.0008) +[2023-10-09 06:46:37,778][60144] Updated weights for policy 1, policy_version 64172 (0.0007) +[2023-10-09 06:46:38,143][60144] Updated weights for policy 1, policy_version 64182 (0.0008) +[2023-10-09 06:46:38,509][60144] Updated weights for policy 1, policy_version 64192 (0.0011) +[2023-10-09 06:46:40,871][60143] Updated weights for policy 0, policy_version 63462 (0.0008) +[2023-10-09 06:46:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130711552. Throughput: 0: 1720.1, 1: 1711.5. Samples: 32688108. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:41,053][59242] Avg episode reward: [(0, '34.100'), (1, '30.340')] +[2023-10-09 06:46:41,243][60143] Updated weights for policy 0, policy_version 63472 (0.0007) +[2023-10-09 06:46:41,612][60143] Updated weights for policy 0, policy_version 63482 (0.0008) +[2023-10-09 06:46:42,573][60144] Updated weights for policy 1, policy_version 64202 (0.0009) +[2023-10-09 06:46:42,939][60144] Updated weights for policy 1, policy_version 64212 (0.0008) +[2023-10-09 06:46:43,310][60144] Updated weights for policy 1, policy_version 64222 (0.0008) +[2023-10-09 06:46:45,685][60143] Updated weights for policy 0, policy_version 63492 (0.0009) +[2023-10-09 06:46:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130777088. Throughput: 0: 1720.3, 1: 1732.2. Samples: 32709330. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-09 06:46:46,053][59242] Avg episode reward: [(0, '33.920'), (1, '29.300')] +[2023-10-09 06:46:46,059][60143] Updated weights for policy 0, policy_version 63502 (0.0009) +[2023-10-09 06:46:46,430][60143] Updated weights for policy 0, policy_version 63512 (0.0008) +[2023-10-09 06:46:47,376][60144] Updated weights for policy 1, policy_version 64232 (0.0009) +[2023-10-09 06:46:47,739][60144] Updated weights for policy 1, policy_version 64242 (0.0007) +[2023-10-09 06:46:48,103][60144] Updated weights for policy 1, policy_version 64252 (0.0009) +[2023-10-09 06:46:50,483][60143] Updated weights for policy 0, policy_version 63522 (0.0009) +[2023-10-09 06:46:50,857][60143] Updated weights for policy 0, policy_version 63532 (0.0008) +[2023-10-09 06:46:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130842624. Throughput: 0: 1716.0, 1: 1707.2. Samples: 32718658. 
Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:46:51,053][59242] Avg episode reward: [(0, '33.430'), (1, '29.350')] +[2023-10-09 06:46:51,214][60143] Updated weights for policy 0, policy_version 63542 (0.0007) +[2023-10-09 06:46:51,594][60143] Updated weights for policy 0, policy_version 63552 (0.0010) +[2023-10-09 06:46:52,017][60144] Updated weights for policy 1, policy_version 64262 (0.0007) +[2023-10-09 06:46:52,381][60144] Updated weights for policy 1, policy_version 64272 (0.0009) +[2023-10-09 06:46:52,745][60144] Updated weights for policy 1, policy_version 64282 (0.0008) +[2023-10-09 06:46:55,593][60143] Updated weights for policy 0, policy_version 63562 (0.0008) +[2023-10-09 06:46:55,955][60143] Updated weights for policy 0, policy_version 63572 (0.0007) +[2023-10-09 06:46:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 130908160. Throughput: 0: 1725.7, 1: 1716.9. Samples: 32740090. Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:46:56,053][59242] Avg episode reward: [(0, '33.540'), (1, '30.080')] +[2023-10-09 06:46:56,325][60143] Updated weights for policy 0, policy_version 63582 (0.0007) +[2023-10-09 06:46:56,726][60144] Updated weights for policy 1, policy_version 64292 (0.0009) +[2023-10-09 06:46:57,107][60144] Updated weights for policy 1, policy_version 64302 (0.0009) +[2023-10-09 06:46:57,468][60144] Updated weights for policy 1, policy_version 64312 (0.0007) +[2023-10-09 06:47:00,241][60143] Updated weights for policy 0, policy_version 63592 (0.0010) +[2023-10-09 06:47:00,605][60143] Updated weights for policy 0, policy_version 63602 (0.0010) +[2023-10-09 06:47:00,973][60143] Updated weights for policy 0, policy_version 63612 (0.0008) +[2023-10-09 06:47:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 130973696. Throughput: 0: 1716.7, 1: 1732.7. Samples: 32761014. 
Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:01,053][59242] Avg episode reward: [(0, '32.600'), (1, '33.160')] +[2023-10-09 06:47:01,518][60144] Updated weights for policy 1, policy_version 64322 (0.0008) +[2023-10-09 06:47:01,918][60144] Updated weights for policy 1, policy_version 64332 (0.0007) +[2023-10-09 06:47:02,281][60144] Updated weights for policy 1, policy_version 64342 (0.0007) +[2023-10-09 06:47:02,653][60144] Updated weights for policy 1, policy_version 64352 (0.0008) +[2023-10-09 06:47:05,061][60143] Updated weights for policy 0, policy_version 63622 (0.0008) +[2023-10-09 06:47:05,431][60143] Updated weights for policy 0, policy_version 63632 (0.0007) +[2023-10-09 06:47:05,795][60143] Updated weights for policy 0, policy_version 63642 (0.0009) +[2023-10-09 06:47:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131072000. Throughput: 0: 1734.7, 1: 1703.5. Samples: 32770866. Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:06,053][59242] Avg episode reward: [(0, '32.440'), (1, '31.710')] +[2023-10-09 06:47:06,461][60144] Updated weights for policy 1, policy_version 64362 (0.0007) +[2023-10-09 06:47:06,827][60144] Updated weights for policy 1, policy_version 64372 (0.0008) +[2023-10-09 06:47:07,195][60144] Updated weights for policy 1, policy_version 64382 (0.0007) +[2023-10-09 06:47:09,644][60143] Updated weights for policy 0, policy_version 63652 (0.0009) +[2023-10-09 06:47:10,021][60143] Updated weights for policy 0, policy_version 63662 (0.0007) +[2023-10-09 06:47:10,387][60143] Updated weights for policy 0, policy_version 63672 (0.0009) +[2023-10-09 06:47:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 131137536. Throughput: 0: 1728.3, 1: 1734.4. Samples: 32792102. 
Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:11,052][59242] Avg episode reward: [(0, '34.180'), (1, '33.370')] +[2023-10-09 06:47:11,291][60144] Updated weights for policy 1, policy_version 64392 (0.0010) +[2023-10-09 06:47:11,671][60144] Updated weights for policy 1, policy_version 64402 (0.0009) +[2023-10-09 06:47:12,038][60144] Updated weights for policy 1, policy_version 64412 (0.0009) +[2023-10-09 06:47:14,317][60143] Updated weights for policy 0, policy_version 63682 (0.0007) +[2023-10-09 06:47:14,681][60143] Updated weights for policy 0, policy_version 63692 (0.0009) +[2023-10-09 06:47:15,053][60143] Updated weights for policy 0, policy_version 63702 (0.0010) +[2023-10-09 06:47:15,420][60143] Updated weights for policy 0, policy_version 63712 (0.0008) +[2023-10-09 06:47:16,012][60144] Updated weights for policy 1, policy_version 64422 (0.0011) +[2023-10-09 06:47:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131203072. Throughput: 0: 1694.5, 1: 1730.4. Samples: 32811822. Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:16,053][59242] Avg episode reward: [(0, '33.270'), (1, '32.410')] +[2023-10-09 06:47:16,385][60144] Updated weights for policy 1, policy_version 64432 (0.0009) +[2023-10-09 06:47:16,744][60144] Updated weights for policy 1, policy_version 64442 (0.0010) +[2023-10-09 06:47:19,535][60143] Updated weights for policy 0, policy_version 63722 (0.0007) +[2023-10-09 06:47:19,902][60143] Updated weights for policy 0, policy_version 63732 (0.0009) +[2023-10-09 06:47:20,274][60143] Updated weights for policy 0, policy_version 63742 (0.0010) +[2023-10-09 06:47:20,528][60144] Updated weights for policy 1, policy_version 64452 (0.0009) +[2023-10-09 06:47:20,891][60144] Updated weights for policy 1, policy_version 64462 (0.0008) +[2023-10-09 06:47:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131268608. 
Throughput: 0: 1722.2, 1: 1724.9. Samples: 32822442. Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:21,052][59242] Avg episode reward: [(0, '33.700'), (1, '32.650')] +[2023-10-09 06:47:21,264][60144] Updated weights for policy 1, policy_version 64472 (0.0008) +[2023-10-09 06:47:24,290][60143] Updated weights for policy 0, policy_version 63752 (0.0009) +[2023-10-09 06:47:24,654][60143] Updated weights for policy 0, policy_version 63762 (0.0011) +[2023-10-09 06:47:25,028][60143] Updated weights for policy 0, policy_version 63772 (0.0011) +[2023-10-09 06:47:25,137][60144] Updated weights for policy 1, policy_version 64482 (0.0008) +[2023-10-09 06:47:25,505][60144] Updated weights for policy 1, policy_version 64492 (0.0010) +[2023-10-09 06:47:25,869][60144] Updated weights for policy 1, policy_version 64502 (0.0011) +[2023-10-09 06:47:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131334144. Throughput: 0: 1714.2, 1: 1733.5. Samples: 32843254. Policy #0 lag: (min: 0.0, avg: 19.4, max: 32.0) +[2023-10-09 06:47:26,053][59242] Avg episode reward: [(0, '32.820'), (1, '32.470')] +[2023-10-09 06:47:26,241][60144] Updated weights for policy 1, policy_version 64512 (0.0007) +[2023-10-09 06:47:28,878][60143] Updated weights for policy 0, policy_version 63782 (0.0009) +[2023-10-09 06:47:29,248][60143] Updated weights for policy 0, policy_version 63792 (0.0009) +[2023-10-09 06:47:29,618][60143] Updated weights for policy 0, policy_version 63802 (0.0008) +[2023-10-09 06:47:30,207][60144] Updated weights for policy 1, policy_version 64522 (0.0009) +[2023-10-09 06:47:30,574][60144] Updated weights for policy 1, policy_version 64532 (0.0010) +[2023-10-09 06:47:30,945][60144] Updated weights for policy 1, policy_version 64542 (0.0011) +[2023-10-09 06:47:31,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 131432448. Throughput: 0: 1694.4, 1: 1715.6. Samples: 32862782. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:31,053][59242] Avg episode reward: [(0, '32.510'), (1, '32.460')] +[2023-10-09 06:47:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000063808_65339392.pth... +[2023-10-09 06:47:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000064544_66093056.pth... +[2023-10-09 06:47:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000062912_64421888.pth +[2023-10-09 06:47:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000062208_63700992.pth +[2023-10-09 06:47:31,102][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000064544_66093056.pth +[2023-10-09 06:47:31,103][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000063808_65339392.pth +[2023-10-09 06:47:33,546][60143] Updated weights for policy 0, policy_version 63812 (0.0009) +[2023-10-09 06:47:33,921][60143] Updated weights for policy 0, policy_version 63822 (0.0010) +[2023-10-09 06:47:34,281][60143] Updated weights for policy 0, policy_version 63832 (0.0010) +[2023-10-09 06:47:34,712][60144] Updated weights for policy 1, policy_version 64552 (0.0007) +[2023-10-09 06:47:35,077][60144] Updated weights for policy 1, policy_version 64562 (0.0010) +[2023-10-09 06:47:35,440][60144] Updated weights for policy 1, policy_version 64572 (0.0010) +[2023-10-09 06:47:36,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 131497984. Throughput: 0: 1719.7, 1: 1735.6. Samples: 32874142. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:36,052][59242] Avg episode reward: [(0, '31.990'), (1, '32.760')] +[2023-10-09 06:47:38,390][60143] Updated weights for policy 0, policy_version 63842 (0.0008) +[2023-10-09 06:47:38,758][60143] Updated weights for policy 0, policy_version 63852 (0.0007) +[2023-10-09 06:47:39,114][60143] Updated weights for policy 0, policy_version 63862 (0.0010) +[2023-10-09 06:47:39,476][60143] Updated weights for policy 0, policy_version 63872 (0.0007) +[2023-10-09 06:47:39,519][60144] Updated weights for policy 1, policy_version 64582 (0.0008) +[2023-10-09 06:47:39,893][60144] Updated weights for policy 1, policy_version 64592 (0.0007) +[2023-10-09 06:47:40,251][60144] Updated weights for policy 1, policy_version 64602 (0.0008) +[2023-10-09 06:47:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 131563520. Throughput: 0: 1683.6, 1: 1728.4. Samples: 32893632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:41,053][59242] Avg episode reward: [(0, '31.010'), (1, '31.470')] +[2023-10-09 06:47:43,599][60143] Updated weights for policy 0, policy_version 63882 (0.0007) +[2023-10-09 06:47:43,967][60143] Updated weights for policy 0, policy_version 63892 (0.0008) +[2023-10-09 06:47:44,251][60144] Updated weights for policy 1, policy_version 64612 (0.0011) +[2023-10-09 06:47:44,331][60143] Updated weights for policy 0, policy_version 63902 (0.0008) +[2023-10-09 06:47:44,611][60144] Updated weights for policy 1, policy_version 64622 (0.0010) +[2023-10-09 06:47:44,984][60144] Updated weights for policy 1, policy_version 64632 (0.0010) +[2023-10-09 06:47:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 131629056. Throughput: 0: 1689.9, 1: 1706.0. Samples: 32913830. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:46,053][59242] Avg episode reward: [(0, '30.200'), (1, '32.520')] +[2023-10-09 06:47:48,458][60143] Updated weights for policy 0, policy_version 63912 (0.0008) +[2023-10-09 06:47:48,822][60143] Updated weights for policy 0, policy_version 63922 (0.0010) +[2023-10-09 06:47:49,043][60144] Updated weights for policy 1, policy_version 64642 (0.0009) +[2023-10-09 06:47:49,200][60143] Updated weights for policy 0, policy_version 63932 (0.0007) +[2023-10-09 06:47:49,449][60144] Updated weights for policy 1, policy_version 64652 (0.0008) +[2023-10-09 06:47:49,813][60144] Updated weights for policy 1, policy_version 64662 (0.0009) +[2023-10-09 06:47:50,179][60144] Updated weights for policy 1, policy_version 64672 (0.0008) +[2023-10-09 06:47:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 131694592. Throughput: 0: 1694.1, 1: 1738.0. Samples: 32925312. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:51,053][59242] Avg episode reward: [(0, '29.930'), (1, '31.560')] +[2023-10-09 06:47:53,211][60143] Updated weights for policy 0, policy_version 63942 (0.0007) +[2023-10-09 06:47:53,589][60143] Updated weights for policy 0, policy_version 63952 (0.0008) +[2023-10-09 06:47:53,960][60143] Updated weights for policy 0, policy_version 63962 (0.0009) +[2023-10-09 06:47:53,968][60144] Updated weights for policy 1, policy_version 64682 (0.0008) +[2023-10-09 06:47:54,335][60144] Updated weights for policy 1, policy_version 64692 (0.0009) +[2023-10-09 06:47:54,702][60144] Updated weights for policy 1, policy_version 64702 (0.0008) +[2023-10-09 06:47:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 131760128. Throughput: 0: 1672.7, 1: 1709.5. Samples: 32944304. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:47:56,053][59242] Avg episode reward: [(0, '30.680'), (1, '30.700')] +[2023-10-09 06:47:57,963][60143] Updated weights for policy 0, policy_version 63972 (0.0008) +[2023-10-09 06:47:58,331][60143] Updated weights for policy 0, policy_version 63982 (0.0008) +[2023-10-09 06:47:58,694][60143] Updated weights for policy 0, policy_version 63992 (0.0009) +[2023-10-09 06:47:58,698][60144] Updated weights for policy 1, policy_version 64712 (0.0008) +[2023-10-09 06:47:59,062][60144] Updated weights for policy 1, policy_version 64722 (0.0007) +[2023-10-09 06:47:59,435][60144] Updated weights for policy 1, policy_version 64732 (0.0008) +[2023-10-09 06:48:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 131825664. Throughput: 0: 1707.9, 1: 1702.0. Samples: 32965266. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:48:01,053][59242] Avg episode reward: [(0, '31.780'), (1, '30.630')] +[2023-10-09 06:48:02,594][60143] Updated weights for policy 0, policy_version 64002 (0.0008) +[2023-10-09 06:48:02,973][60143] Updated weights for policy 0, policy_version 64012 (0.0007) +[2023-10-09 06:48:03,340][60143] Updated weights for policy 0, policy_version 64022 (0.0008) +[2023-10-09 06:48:03,491][60144] Updated weights for policy 1, policy_version 64742 (0.0008) +[2023-10-09 06:48:03,710][60143] Updated weights for policy 0, policy_version 64032 (0.0008) +[2023-10-09 06:48:03,857][60144] Updated weights for policy 1, policy_version 64752 (0.0008) +[2023-10-09 06:48:04,224][60144] Updated weights for policy 1, policy_version 64762 (0.0007) +[2023-10-09 06:48:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131891200. Throughput: 0: 1688.3, 1: 1722.2. Samples: 32975912. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:48:06,053][59242] Avg episode reward: [(0, '34.960'), (1, '31.240')] +[2023-10-09 06:48:07,718][60143] Updated weights for policy 0, policy_version 64042 (0.0008) +[2023-10-09 06:48:08,093][60143] Updated weights for policy 0, policy_version 64052 (0.0008) +[2023-10-09 06:48:08,308][60144] Updated weights for policy 1, policy_version 64772 (0.0009) +[2023-10-09 06:48:08,458][60143] Updated weights for policy 0, policy_version 64062 (0.0009) +[2023-10-09 06:48:08,673][60144] Updated weights for policy 1, policy_version 64782 (0.0008) +[2023-10-09 06:48:09,033][60144] Updated weights for policy 1, policy_version 64792 (0.0008) +[2023-10-09 06:48:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 131956736. Throughput: 0: 1689.5, 1: 1697.9. Samples: 32995686. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-09 06:48:11,053][59242] Avg episode reward: [(0, '32.850'), (1, '32.510')] +[2023-10-09 06:48:12,435][60143] Updated weights for policy 0, policy_version 64072 (0.0009) +[2023-10-09 06:48:12,814][60143] Updated weights for policy 0, policy_version 64082 (0.0008) +[2023-10-09 06:48:13,067][60144] Updated weights for policy 1, policy_version 64802 (0.0007) +[2023-10-09 06:48:13,182][60143] Updated weights for policy 0, policy_version 64092 (0.0009) +[2023-10-09 06:48:13,433][60144] Updated weights for policy 1, policy_version 64812 (0.0007) +[2023-10-09 06:48:13,809][60144] Updated weights for policy 1, policy_version 64822 (0.0007) +[2023-10-09 06:48:14,188][60144] Updated weights for policy 1, policy_version 64832 (0.0011) +[2023-10-09 06:48:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 132022272. Throughput: 0: 1707.5, 1: 1715.8. Samples: 33016830. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:16,053][59242] Avg episode reward: [(0, '34.290'), (1, '33.150')] +[2023-10-09 06:48:17,271][60143] Updated weights for policy 0, policy_version 64102 (0.0009) +[2023-10-09 06:48:17,647][60143] Updated weights for policy 0, policy_version 64112 (0.0009) +[2023-10-09 06:48:18,014][60143] Updated weights for policy 0, policy_version 64122 (0.0010) +[2023-10-09 06:48:18,190][60144] Updated weights for policy 1, policy_version 64842 (0.0008) +[2023-10-09 06:48:18,556][60144] Updated weights for policy 1, policy_version 64852 (0.0007) +[2023-10-09 06:48:18,930][60144] Updated weights for policy 1, policy_version 64862 (0.0009) +[2023-10-09 06:48:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 132087808. Throughput: 0: 1682.7, 1: 1704.6. Samples: 33026574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:21,053][59242] Avg episode reward: [(0, '33.480'), (1, '33.110')] +[2023-10-09 06:48:22,122][60143] Updated weights for policy 0, policy_version 64132 (0.0007) +[2023-10-09 06:48:22,490][60143] Updated weights for policy 0, policy_version 64142 (0.0008) +[2023-10-09 06:48:22,681][60144] Updated weights for policy 1, policy_version 64872 (0.0008) +[2023-10-09 06:48:22,866][60143] Updated weights for policy 0, policy_version 64152 (0.0009) +[2023-10-09 06:48:23,050][60144] Updated weights for policy 1, policy_version 64882 (0.0009) +[2023-10-09 06:48:23,413][60144] Updated weights for policy 1, policy_version 64892 (0.0011) +[2023-10-09 06:48:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 132153344. Throughput: 0: 1713.5, 1: 1704.1. Samples: 33047420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:26,052][59242] Avg episode reward: [(0, '34.300'), (1, '33.800')] +[2023-10-09 06:48:26,801][60143] Updated weights for policy 0, policy_version 64162 (0.0009) +[2023-10-09 06:48:27,168][60143] Updated weights for policy 0, policy_version 64172 (0.0008) +[2023-10-09 06:48:27,380][60144] Updated weights for policy 1, policy_version 64902 (0.0009) +[2023-10-09 06:48:27,536][60143] Updated weights for policy 0, policy_version 64182 (0.0007) +[2023-10-09 06:48:27,751][60144] Updated weights for policy 1, policy_version 64912 (0.0007) +[2023-10-09 06:48:27,906][60143] Updated weights for policy 0, policy_version 64192 (0.0009) +[2023-10-09 06:48:28,116][60144] Updated weights for policy 1, policy_version 64922 (0.0007) +[2023-10-09 06:48:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132218880. Throughput: 0: 1715.2, 1: 1722.3. Samples: 33068518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:31,053][59242] Avg episode reward: [(0, '34.200'), (1, '32.330')] +[2023-10-09 06:48:31,877][60143] Updated weights for policy 0, policy_version 64202 (0.0009) +[2023-10-09 06:48:32,214][60144] Updated weights for policy 1, policy_version 64932 (0.0008) +[2023-10-09 06:48:32,242][60143] Updated weights for policy 0, policy_version 64212 (0.0007) +[2023-10-09 06:48:32,579][60144] Updated weights for policy 1, policy_version 64942 (0.0008) +[2023-10-09 06:48:32,608][60143] Updated weights for policy 0, policy_version 64222 (0.0007) +[2023-10-09 06:48:32,951][60144] Updated weights for policy 1, policy_version 64952 (0.0009) +[2023-10-09 06:48:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132284416. Throughput: 0: 1702.0, 1: 1691.3. Samples: 33078010. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:36,052][59242] Avg episode reward: [(0, '35.350'), (1, '32.330')] +[2023-10-09 06:48:36,523][60143] Updated weights for policy 0, policy_version 64232 (0.0009) +[2023-10-09 06:48:36,867][60144] Updated weights for policy 1, policy_version 64962 (0.0008) +[2023-10-09 06:48:36,897][60143] Updated weights for policy 0, policy_version 64242 (0.0008) +[2023-10-09 06:48:37,230][60144] Updated weights for policy 1, policy_version 64972 (0.0007) +[2023-10-09 06:48:37,272][60143] Updated weights for policy 0, policy_version 64252 (0.0008) +[2023-10-09 06:48:37,597][60144] Updated weights for policy 1, policy_version 64982 (0.0007) +[2023-10-09 06:48:37,969][60144] Updated weights for policy 1, policy_version 64992 (0.0009) +[2023-10-09 06:48:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132349952. Throughput: 0: 1727.5, 1: 1717.8. Samples: 33099342. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:41,053][59242] Avg episode reward: [(0, '34.840'), (1, '32.550')] +[2023-10-09 06:48:41,265][60143] Updated weights for policy 0, policy_version 64262 (0.0008) +[2023-10-09 06:48:41,644][60143] Updated weights for policy 0, policy_version 64272 (0.0007) +[2023-10-09 06:48:42,012][60143] Updated weights for policy 0, policy_version 64282 (0.0009) +[2023-10-09 06:48:42,023][60144] Updated weights for policy 1, policy_version 65002 (0.0008) +[2023-10-09 06:48:42,390][60144] Updated weights for policy 1, policy_version 65012 (0.0007) +[2023-10-09 06:48:42,767][60144] Updated weights for policy 1, policy_version 65022 (0.0008) +[2023-10-09 06:48:46,018][60143] Updated weights for policy 0, policy_version 64292 (0.0008) +[2023-10-09 06:48:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132415488. Throughput: 0: 1721.5, 1: 1726.9. Samples: 33120446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:46,053][59242] Avg episode reward: [(0, '33.180'), (1, '32.790')] +[2023-10-09 06:48:46,386][60143] Updated weights for policy 0, policy_version 64302 (0.0009) +[2023-10-09 06:48:46,703][60144] Updated weights for policy 1, policy_version 65032 (0.0007) +[2023-10-09 06:48:46,759][60143] Updated weights for policy 0, policy_version 64312 (0.0008) +[2023-10-09 06:48:47,075][60144] Updated weights for policy 1, policy_version 65042 (0.0007) +[2023-10-09 06:48:47,434][60144] Updated weights for policy 1, policy_version 65052 (0.0009) +[2023-10-09 06:48:50,636][60143] Updated weights for policy 0, policy_version 64322 (0.0009) +[2023-10-09 06:48:51,002][60143] Updated weights for policy 0, policy_version 64332 (0.0009) +[2023-10-09 06:48:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 132481024. Throughput: 0: 1713.4, 1: 1706.6. Samples: 33129814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:51,052][59242] Avg episode reward: [(0, '34.280'), (1, '32.420')] +[2023-10-09 06:48:51,362][60143] Updated weights for policy 0, policy_version 64342 (0.0009) +[2023-10-09 06:48:51,599][60144] Updated weights for policy 1, policy_version 65062 (0.0007) +[2023-10-09 06:48:51,735][60143] Updated weights for policy 0, policy_version 64352 (0.0007) +[2023-10-09 06:48:51,971][60144] Updated weights for policy 1, policy_version 65072 (0.0009) +[2023-10-09 06:48:52,332][60144] Updated weights for policy 1, policy_version 65082 (0.0008) +[2023-10-09 06:48:55,702][60143] Updated weights for policy 0, policy_version 64362 (0.0008) +[2023-10-09 06:48:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132546560. Throughput: 0: 1725.5, 1: 1726.5. Samples: 33151026. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:48:56,053][59242] Avg episode reward: [(0, '34.640'), (1, '32.570')] +[2023-10-09 06:48:56,070][60143] Updated weights for policy 0, policy_version 64372 (0.0007) +[2023-10-09 06:48:56,167][60144] Updated weights for policy 1, policy_version 65092 (0.0007) +[2023-10-09 06:48:56,448][60143] Updated weights for policy 0, policy_version 64382 (0.0009) +[2023-10-09 06:48:56,540][60144] Updated weights for policy 1, policy_version 65102 (0.0009) +[2023-10-09 06:48:56,912][60144] Updated weights for policy 1, policy_version 65112 (0.0011) +[2023-10-09 06:49:00,245][60143] Updated weights for policy 0, policy_version 64392 (0.0010) +[2023-10-09 06:49:00,615][60143] Updated weights for policy 0, policy_version 64402 (0.0010) +[2023-10-09 06:49:00,821][60144] Updated weights for policy 1, policy_version 65122 (0.0007) +[2023-10-09 06:49:00,985][60143] Updated weights for policy 0, policy_version 64412 (0.0007) +[2023-10-09 06:49:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 132612096. Throughput: 0: 1711.4, 1: 1728.0. Samples: 33171602. 
Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:01,053][59242] Avg episode reward: [(0, '34.100'), (1, '33.030')] +[2023-10-09 06:49:01,179][60144] Updated weights for policy 1, policy_version 65132 (0.0008) +[2023-10-09 06:49:01,540][60144] Updated weights for policy 1, policy_version 65142 (0.0010) +[2023-10-09 06:49:01,911][60144] Updated weights for policy 1, policy_version 65152 (0.0010) +[2023-10-09 06:49:04,924][60143] Updated weights for policy 0, policy_version 64422 (0.0007) +[2023-10-09 06:49:05,287][60143] Updated weights for policy 0, policy_version 64432 (0.0007) +[2023-10-09 06:49:05,655][60143] Updated weights for policy 0, policy_version 64442 (0.0009) +[2023-10-09 06:49:05,836][60144] Updated weights for policy 1, policy_version 65162 (0.0007) +[2023-10-09 06:49:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 132710400. Throughput: 0: 1725.7, 1: 1720.0. Samples: 33181628. Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:06,053][59242] Avg episode reward: [(0, '35.180'), (1, '32.370')] +[2023-10-09 06:49:06,202][60144] Updated weights for policy 1, policy_version 65172 (0.0009) +[2023-10-09 06:49:06,562][60144] Updated weights for policy 1, policy_version 65182 (0.0010) +[2023-10-09 06:49:09,652][60143] Updated weights for policy 0, policy_version 64452 (0.0010) +[2023-10-09 06:49:10,015][60143] Updated weights for policy 0, policy_version 64462 (0.0009) +[2023-10-09 06:49:10,394][60143] Updated weights for policy 0, policy_version 64472 (0.0008) +[2023-10-09 06:49:10,529][60144] Updated weights for policy 1, policy_version 65192 (0.0009) +[2023-10-09 06:49:10,895][60144] Updated weights for policy 1, policy_version 65202 (0.0008) +[2023-10-09 06:49:11,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 132775936. Throughput: 0: 1726.6, 1: 1729.0. Samples: 33202922. 
Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:11,053][59242] Avg episode reward: [(0, '36.120'), (1, '32.300')] +[2023-10-09 06:49:11,251][60144] Updated weights for policy 1, policy_version 65212 (0.0008) +[2023-10-09 06:49:14,541][60143] Updated weights for policy 0, policy_version 64482 (0.0009) +[2023-10-09 06:49:14,916][60143] Updated weights for policy 0, policy_version 64492 (0.0010) +[2023-10-09 06:49:15,140][60144] Updated weights for policy 1, policy_version 65222 (0.0008) +[2023-10-09 06:49:15,280][60143] Updated weights for policy 0, policy_version 64502 (0.0008) +[2023-10-09 06:49:15,507][60144] Updated weights for policy 1, policy_version 65232 (0.0008) +[2023-10-09 06:49:15,646][60143] Updated weights for policy 0, policy_version 64512 (0.0009) +[2023-10-09 06:49:15,867][60144] Updated weights for policy 1, policy_version 65242 (0.0008) +[2023-10-09 06:49:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 132841472. Throughput: 0: 1700.7, 1: 1716.0. Samples: 33222270. Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:16,053][59242] Avg episode reward: [(0, '35.990'), (1, '31.530')] +[2023-10-09 06:49:19,677][60143] Updated weights for policy 0, policy_version 64522 (0.0009) +[2023-10-09 06:49:19,732][60144] Updated weights for policy 1, policy_version 65252 (0.0008) +[2023-10-09 06:49:20,049][60143] Updated weights for policy 0, policy_version 64532 (0.0007) +[2023-10-09 06:49:20,099][60144] Updated weights for policy 1, policy_version 65262 (0.0008) +[2023-10-09 06:49:20,425][60143] Updated weights for policy 0, policy_version 64542 (0.0007) +[2023-10-09 06:49:20,458][60144] Updated weights for policy 1, policy_version 65272 (0.0010) +[2023-10-09 06:49:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 132939776. Throughput: 0: 1717.4, 1: 1738.6. Samples: 33233530. 
Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:21,053][59242] Avg episode reward: [(0, '36.410'), (1, '33.370')] +[2023-10-09 06:49:24,384][60144] Updated weights for policy 1, policy_version 65282 (0.0008) +[2023-10-09 06:49:24,476][60143] Updated weights for policy 0, policy_version 64552 (0.0008) +[2023-10-09 06:49:24,758][60144] Updated weights for policy 1, policy_version 65292 (0.0008) +[2023-10-09 06:49:24,851][60143] Updated weights for policy 0, policy_version 64562 (0.0009) +[2023-10-09 06:49:25,127][60144] Updated weights for policy 1, policy_version 65302 (0.0007) +[2023-10-09 06:49:25,210][60143] Updated weights for policy 0, policy_version 64572 (0.0008) +[2023-10-09 06:49:25,490][60144] Updated weights for policy 1, policy_version 65312 (0.0010) +[2023-10-09 06:49:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 133005312. Throughput: 0: 1702.5, 1: 1733.5. Samples: 33253962. Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:26,053][59242] Avg episode reward: [(0, '35.780'), (1, '34.470')] +[2023-10-09 06:49:29,166][60143] Updated weights for policy 0, policy_version 64582 (0.0009) +[2023-10-09 06:49:29,528][60143] Updated weights for policy 0, policy_version 64592 (0.0008) +[2023-10-09 06:49:29,608][60144] Updated weights for policy 1, policy_version 65322 (0.0007) +[2023-10-09 06:49:29,897][60143] Updated weights for policy 0, policy_version 64602 (0.0009) +[2023-10-09 06:49:29,974][60144] Updated weights for policy 1, policy_version 65332 (0.0008) +[2023-10-09 06:49:30,342][60144] Updated weights for policy 1, policy_version 65342 (0.0009) +[2023-10-09 06:49:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 133070848. Throughput: 0: 1685.6, 1: 1701.0. Samples: 33272844. 
Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:31,053][59242] Avg episode reward: [(0, '35.400'), (1, '32.910')] +[2023-10-09 06:49:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000065344_66912256.pth... +[2023-10-09 06:49:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000064608_66158592.pth... +[2023-10-09 06:49:31,099][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000063008_64520192.pth +[2023-10-09 06:49:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000063712_65241088.pth +[2023-10-09 06:49:34,014][60143] Updated weights for policy 0, policy_version 64612 (0.0007) +[2023-10-09 06:49:34,244][60144] Updated weights for policy 1, policy_version 65352 (0.0007) +[2023-10-09 06:49:34,387][60143] Updated weights for policy 0, policy_version 64622 (0.0008) +[2023-10-09 06:49:34,613][60144] Updated weights for policy 1, policy_version 65362 (0.0007) +[2023-10-09 06:49:34,755][60143] Updated weights for policy 0, policy_version 64632 (0.0008) +[2023-10-09 06:49:34,979][60144] Updated weights for policy 1, policy_version 65372 (0.0007) +[2023-10-09 06:49:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 133136384. Throughput: 0: 1715.5, 1: 1729.5. Samples: 33284844. Policy #0 lag: (min: 2.0, avg: 5.8, max: 34.0) +[2023-10-09 06:49:36,053][59242] Avg episode reward: [(0, '37.800'), (1, '33.760')] +[2023-10-09 06:49:36,054][59934] Saving new best policy, reward=37.800! 
+[2023-10-09 06:49:38,687][60143] Updated weights for policy 0, policy_version 64642 (0.0008) +[2023-10-09 06:49:38,752][60144] Updated weights for policy 1, policy_version 65382 (0.0009) +[2023-10-09 06:49:39,053][60143] Updated weights for policy 0, policy_version 64652 (0.0008) +[2023-10-09 06:49:39,114][60144] Updated weights for policy 1, policy_version 65392 (0.0009) +[2023-10-09 06:49:39,423][60143] Updated weights for policy 0, policy_version 64662 (0.0010) +[2023-10-09 06:49:39,482][60144] Updated weights for policy 1, policy_version 65402 (0.0008) +[2023-10-09 06:49:39,790][60143] Updated weights for policy 0, policy_version 64672 (0.0010) +[2023-10-09 06:49:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 133201920. Throughput: 0: 1690.6, 1: 1706.4. Samples: 33303888. Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:49:41,053][59242] Avg episode reward: [(0, '35.220'), (1, '33.990')] +[2023-10-09 06:49:43,549][60144] Updated weights for policy 1, policy_version 65412 (0.0008) +[2023-10-09 06:49:43,853][60143] Updated weights for policy 0, policy_version 64682 (0.0008) +[2023-10-09 06:49:43,917][60144] Updated weights for policy 1, policy_version 65422 (0.0008) +[2023-10-09 06:49:44,222][60143] Updated weights for policy 0, policy_version 64692 (0.0008) +[2023-10-09 06:49:44,280][60144] Updated weights for policy 1, policy_version 65432 (0.0007) +[2023-10-09 06:49:44,583][60143] Updated weights for policy 0, policy_version 64702 (0.0009) +[2023-10-09 06:49:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 133267456. Throughput: 0: 1694.0, 1: 1702.9. Samples: 33324462. 
Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:49:46,053][59242] Avg episode reward: [(0, '34.830'), (1, '34.300')] +[2023-10-09 06:49:48,429][60144] Updated weights for policy 1, policy_version 65442 (0.0009) +[2023-10-09 06:49:48,634][60143] Updated weights for policy 0, policy_version 64712 (0.0009) +[2023-10-09 06:49:48,800][60144] Updated weights for policy 1, policy_version 65452 (0.0007) +[2023-10-09 06:49:49,000][60143] Updated weights for policy 0, policy_version 64722 (0.0007) +[2023-10-09 06:49:49,160][60144] Updated weights for policy 1, policy_version 65462 (0.0007) +[2023-10-09 06:49:49,372][60143] Updated weights for policy 0, policy_version 64732 (0.0007) +[2023-10-09 06:49:49,527][60144] Updated weights for policy 1, policy_version 65472 (0.0007) +[2023-10-09 06:49:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 133332992. Throughput: 0: 1698.5, 1: 1721.6. Samples: 33335532. Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:49:51,053][59242] Avg episode reward: [(0, '34.370'), (1, '34.360')] +[2023-10-09 06:49:53,393][60143] Updated weights for policy 0, policy_version 64742 (0.0007) +[2023-10-09 06:49:53,587][60144] Updated weights for policy 1, policy_version 65482 (0.0007) +[2023-10-09 06:49:53,754][60143] Updated weights for policy 0, policy_version 64752 (0.0007) +[2023-10-09 06:49:53,953][60144] Updated weights for policy 1, policy_version 65492 (0.0007) +[2023-10-09 06:49:54,121][60143] Updated weights for policy 0, policy_version 64762 (0.0008) +[2023-10-09 06:49:54,316][60144] Updated weights for policy 1, policy_version 65502 (0.0008) +[2023-10-09 06:49:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 133398528. Throughput: 0: 1675.3, 1: 1699.8. Samples: 33354802. 
Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:49:56,052][59242] Avg episode reward: [(0, '33.670'), (1, '34.200')] +[2023-10-09 06:49:58,126][60144] Updated weights for policy 1, policy_version 65512 (0.0008) +[2023-10-09 06:49:58,147][60143] Updated weights for policy 0, policy_version 64772 (0.0009) +[2023-10-09 06:49:58,495][60144] Updated weights for policy 1, policy_version 65522 (0.0008) +[2023-10-09 06:49:58,526][60143] Updated weights for policy 0, policy_version 64782 (0.0007) +[2023-10-09 06:49:58,874][60144] Updated weights for policy 1, policy_version 65532 (0.0010) +[2023-10-09 06:49:58,887][60143] Updated weights for policy 0, policy_version 64792 (0.0007) +[2023-10-09 06:50:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 133464064. Throughput: 0: 1703.7, 1: 1722.7. Samples: 33376454. Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:50:01,053][59242] Avg episode reward: [(0, '35.700'), (1, '33.400')] +[2023-10-09 06:50:02,795][60144] Updated weights for policy 1, policy_version 65542 (0.0009) +[2023-10-09 06:50:02,941][60143] Updated weights for policy 0, policy_version 64802 (0.0009) +[2023-10-09 06:50:03,158][60144] Updated weights for policy 1, policy_version 65552 (0.0009) +[2023-10-09 06:50:03,315][60143] Updated weights for policy 0, policy_version 64812 (0.0007) +[2023-10-09 06:50:03,520][60144] Updated weights for policy 1, policy_version 65562 (0.0009) +[2023-10-09 06:50:03,685][60143] Updated weights for policy 0, policy_version 64822 (0.0007) +[2023-10-09 06:50:04,050][60143] Updated weights for policy 0, policy_version 64832 (0.0009) +[2023-10-09 06:50:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 133529600. Throughput: 0: 1693.1, 1: 1709.3. Samples: 33386640. 
Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:50:06,053][59242] Avg episode reward: [(0, '35.700'), (1, '33.230')] +[2023-10-09 06:50:07,522][60144] Updated weights for policy 1, policy_version 65572 (0.0007) +[2023-10-09 06:50:07,789][60143] Updated weights for policy 0, policy_version 64842 (0.0007) +[2023-10-09 06:50:07,885][60144] Updated weights for policy 1, policy_version 65582 (0.0009) +[2023-10-09 06:50:08,150][60143] Updated weights for policy 0, policy_version 64852 (0.0008) +[2023-10-09 06:50:08,250][60144] Updated weights for policy 1, policy_version 65592 (0.0008) +[2023-10-09 06:50:08,525][60143] Updated weights for policy 0, policy_version 64862 (0.0007) +[2023-10-09 06:50:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 133595136. Throughput: 0: 1697.5, 1: 1699.8. Samples: 33406840. Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:50:11,053][59242] Avg episode reward: [(0, '34.360'), (1, '31.970')] +[2023-10-09 06:50:12,279][60144] Updated weights for policy 1, policy_version 65602 (0.0008) +[2023-10-09 06:50:12,562][60143] Updated weights for policy 0, policy_version 64872 (0.0007) +[2023-10-09 06:50:12,646][60144] Updated weights for policy 1, policy_version 65612 (0.0008) +[2023-10-09 06:50:12,926][60143] Updated weights for policy 0, policy_version 64882 (0.0007) +[2023-10-09 06:50:13,019][60144] Updated weights for policy 1, policy_version 65622 (0.0008) +[2023-10-09 06:50:13,296][60143] Updated weights for policy 0, policy_version 64892 (0.0008) +[2023-10-09 06:50:13,376][60144] Updated weights for policy 1, policy_version 65632 (0.0009) +[2023-10-09 06:50:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 133660672. Throughput: 0: 1711.6, 1: 1729.9. Samples: 33427712. 
Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:50:16,053][59242] Avg episode reward: [(0, '34.260'), (1, '32.150')] +[2023-10-09 06:50:17,376][60144] Updated weights for policy 1, policy_version 65642 (0.0009) +[2023-10-09 06:50:17,481][60143] Updated weights for policy 0, policy_version 64902 (0.0009) +[2023-10-09 06:50:17,745][60144] Updated weights for policy 1, policy_version 65652 (0.0008) +[2023-10-09 06:50:17,867][60143] Updated weights for policy 0, policy_version 64912 (0.0008) +[2023-10-09 06:50:18,111][60144] Updated weights for policy 1, policy_version 65662 (0.0009) +[2023-10-09 06:50:18,236][60143] Updated weights for policy 0, policy_version 64922 (0.0008) +[2023-10-09 06:50:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 133726208. Throughput: 0: 1678.7, 1: 1697.5. Samples: 33436772. Policy #0 lag: (min: 4.0, avg: 4.5, max: 19.0) +[2023-10-09 06:50:21,053][59242] Avg episode reward: [(0, '35.460'), (1, '32.130')] +[2023-10-09 06:50:22,099][60144] Updated weights for policy 1, policy_version 65672 (0.0009) +[2023-10-09 06:50:22,196][60143] Updated weights for policy 0, policy_version 64932 (0.0007) +[2023-10-09 06:50:22,460][60144] Updated weights for policy 1, policy_version 65682 (0.0007) +[2023-10-09 06:50:22,567][60143] Updated weights for policy 0, policy_version 64942 (0.0007) +[2023-10-09 06:50:22,837][60144] Updated weights for policy 1, policy_version 65692 (0.0009) +[2023-10-09 06:50:22,944][60143] Updated weights for policy 0, policy_version 64952 (0.0010) +[2023-10-09 06:50:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 133791744. Throughput: 0: 1702.4, 1: 1723.7. Samples: 33458060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:26,052][59242] Avg episode reward: [(0, '35.150'), (1, '32.610')] +[2023-10-09 06:50:26,787][60144] Updated weights for policy 1, policy_version 65702 (0.0008) +[2023-10-09 06:50:26,918][60143] Updated weights for policy 0, policy_version 64962 (0.0010) +[2023-10-09 06:50:27,155][60144] Updated weights for policy 1, policy_version 65712 (0.0009) +[2023-10-09 06:50:27,295][60143] Updated weights for policy 0, policy_version 64972 (0.0008) +[2023-10-09 06:50:27,520][60144] Updated weights for policy 1, policy_version 65722 (0.0010) +[2023-10-09 06:50:27,666][60143] Updated weights for policy 0, policy_version 64982 (0.0009) +[2023-10-09 06:50:28,040][60143] Updated weights for policy 0, policy_version 64992 (0.0008) +[2023-10-09 06:50:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 133857280. Throughput: 0: 1708.2, 1: 1726.9. Samples: 33479042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:31,053][59242] Avg episode reward: [(0, '35.200'), (1, '30.430')] +[2023-10-09 06:50:31,561][60144] Updated weights for policy 1, policy_version 65732 (0.0009) +[2023-10-09 06:50:31,937][60144] Updated weights for policy 1, policy_version 65742 (0.0008) +[2023-10-09 06:50:31,968][60143] Updated weights for policy 0, policy_version 65002 (0.0010) +[2023-10-09 06:50:32,292][60144] Updated weights for policy 1, policy_version 65752 (0.0009) +[2023-10-09 06:50:32,340][60143] Updated weights for policy 0, policy_version 65012 (0.0008) +[2023-10-09 06:50:32,713][60143] Updated weights for policy 0, policy_version 65022 (0.0007) +[2023-10-09 06:50:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 133922816. Throughput: 0: 1694.4, 1: 1702.8. Samples: 33488408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:36,053][59242] Avg episode reward: [(0, '34.420'), (1, '31.570')] +[2023-10-09 06:50:36,225][60144] Updated weights for policy 1, policy_version 65762 (0.0008) +[2023-10-09 06:50:36,597][60144] Updated weights for policy 1, policy_version 65772 (0.0009) +[2023-10-09 06:50:36,627][60143] Updated weights for policy 0, policy_version 65032 (0.0008) +[2023-10-09 06:50:36,961][60144] Updated weights for policy 1, policy_version 65782 (0.0010) +[2023-10-09 06:50:36,998][60143] Updated weights for policy 0, policy_version 65042 (0.0007) +[2023-10-09 06:50:37,333][60144] Updated weights for policy 1, policy_version 65792 (0.0009) +[2023-10-09 06:50:37,368][60143] Updated weights for policy 0, policy_version 65052 (0.0008) +[2023-10-09 06:50:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 133988352. Throughput: 0: 1710.0, 1: 1723.4. Samples: 33509306. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:41,052][59242] Avg episode reward: [(0, '34.390'), (1, '32.730')] +[2023-10-09 06:50:41,334][60143] Updated weights for policy 0, policy_version 65062 (0.0008) +[2023-10-09 06:50:41,553][60144] Updated weights for policy 1, policy_version 65802 (0.0007) +[2023-10-09 06:50:41,703][60143] Updated weights for policy 0, policy_version 65072 (0.0007) +[2023-10-09 06:50:41,920][60144] Updated weights for policy 1, policy_version 65812 (0.0009) +[2023-10-09 06:50:42,078][60143] Updated weights for policy 0, policy_version 65082 (0.0009) +[2023-10-09 06:50:42,277][60144] Updated weights for policy 1, policy_version 65822 (0.0007) +[2023-10-09 06:50:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 134053888. Throughput: 0: 1704.0, 1: 1711.3. Samples: 33530144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:46,052][59242] Avg episode reward: [(0, '33.170'), (1, '32.550')] +[2023-10-09 06:50:46,063][60143] Updated weights for policy 0, policy_version 65092 (0.0008) +[2023-10-09 06:50:46,346][60144] Updated weights for policy 1, policy_version 65832 (0.0007) +[2023-10-09 06:50:46,433][60143] Updated weights for policy 0, policy_version 65102 (0.0007) +[2023-10-09 06:50:46,710][60144] Updated weights for policy 1, policy_version 65842 (0.0008) +[2023-10-09 06:50:46,804][60143] Updated weights for policy 0, policy_version 65112 (0.0008) +[2023-10-09 06:50:47,080][60144] Updated weights for policy 1, policy_version 65852 (0.0008) +[2023-10-09 06:50:51,018][60143] Updated weights for policy 0, policy_version 65122 (0.0008) +[2023-10-09 06:50:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 134119424. Throughput: 0: 1692.2, 1: 1701.7. Samples: 33539366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:51,053][59242] Avg episode reward: [(0, '33.890'), (1, '32.240')] +[2023-10-09 06:50:51,113][60144] Updated weights for policy 1, policy_version 65862 (0.0007) +[2023-10-09 06:50:51,394][60143] Updated weights for policy 0, policy_version 65132 (0.0009) +[2023-10-09 06:50:51,483][60144] Updated weights for policy 1, policy_version 65872 (0.0008) +[2023-10-09 06:50:51,759][60143] Updated weights for policy 0, policy_version 65142 (0.0007) +[2023-10-09 06:50:51,845][60144] Updated weights for policy 1, policy_version 65882 (0.0008) +[2023-10-09 06:50:52,125][60143] Updated weights for policy 0, policy_version 65152 (0.0007) +[2023-10-09 06:50:55,793][60144] Updated weights for policy 1, policy_version 65892 (0.0009) +[2023-10-09 06:50:56,039][60143] Updated weights for policy 0, policy_version 65162 (0.0008) +[2023-10-09 06:50:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 134184960. 
Throughput: 0: 1703.2, 1: 1714.0. Samples: 33560610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:50:56,052][59242] Avg episode reward: [(0, '34.770'), (1, '31.430')] +[2023-10-09 06:50:56,167][60144] Updated weights for policy 1, policy_version 65902 (0.0007) +[2023-10-09 06:50:56,396][60143] Updated weights for policy 0, policy_version 65172 (0.0008) +[2023-10-09 06:50:56,522][60144] Updated weights for policy 1, policy_version 65912 (0.0008) +[2023-10-09 06:50:56,768][60143] Updated weights for policy 0, policy_version 65182 (0.0008) +[2023-10-09 06:51:00,551][60144] Updated weights for policy 1, policy_version 65922 (0.0008) +[2023-10-09 06:51:00,563][60143] Updated weights for policy 0, policy_version 65192 (0.0008) +[2023-10-09 06:51:00,903][60144] Updated weights for policy 1, policy_version 65932 (0.0009) +[2023-10-09 06:51:00,932][60143] Updated weights for policy 0, policy_version 65202 (0.0007) +[2023-10-09 06:51:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 134250496. Throughput: 0: 1707.3, 1: 1710.0. Samples: 33581492. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:01,053][59242] Avg episode reward: [(0, '32.660'), (1, '31.860')] +[2023-10-09 06:51:01,275][60144] Updated weights for policy 1, policy_version 65942 (0.0009) +[2023-10-09 06:51:01,302][60143] Updated weights for policy 0, policy_version 65212 (0.0007) +[2023-10-09 06:51:01,643][60144] Updated weights for policy 1, policy_version 65952 (0.0009) +[2023-10-09 06:51:05,208][60143] Updated weights for policy 0, policy_version 65222 (0.0010) +[2023-10-09 06:51:05,574][60143] Updated weights for policy 0, policy_version 65232 (0.0010) +[2023-10-09 06:51:05,664][60144] Updated weights for policy 1, policy_version 65962 (0.0007) +[2023-10-09 06:51:05,951][60143] Updated weights for policy 0, policy_version 65242 (0.0008) +[2023-10-09 06:51:06,037][60144] Updated weights for policy 1, policy_version 65972 (0.0008) +[2023-10-09 06:51:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 134316032. Throughput: 0: 1717.1, 1: 1713.0. Samples: 33591128. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:06,053][59242] Avg episode reward: [(0, '32.570'), (1, '32.830')] +[2023-10-09 06:51:06,408][60144] Updated weights for policy 1, policy_version 65982 (0.0008) +[2023-10-09 06:51:10,179][60143] Updated weights for policy 0, policy_version 65252 (0.0008) +[2023-10-09 06:51:10,281][60144] Updated weights for policy 1, policy_version 65992 (0.0009) +[2023-10-09 06:51:10,545][60143] Updated weights for policy 0, policy_version 65262 (0.0007) +[2023-10-09 06:51:10,641][60144] Updated weights for policy 1, policy_version 66002 (0.0008) +[2023-10-09 06:51:10,908][60143] Updated weights for policy 0, policy_version 65272 (0.0010) +[2023-10-09 06:51:11,002][60144] Updated weights for policy 1, policy_version 66012 (0.0008) +[2023-10-09 06:51:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 134381568. 
Throughput: 0: 1712.9, 1: 1709.2. Samples: 33612056. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:11,052][59242] Avg episode reward: [(0, '33.020'), (1, '33.190')] +[2023-10-09 06:51:14,931][60143] Updated weights for policy 0, policy_version 65282 (0.0008) +[2023-10-09 06:51:15,019][60144] Updated weights for policy 1, policy_version 66022 (0.0007) +[2023-10-09 06:51:15,306][60143] Updated weights for policy 0, policy_version 65292 (0.0008) +[2023-10-09 06:51:15,382][60144] Updated weights for policy 1, policy_version 66032 (0.0008) +[2023-10-09 06:51:15,669][60143] Updated weights for policy 0, policy_version 65302 (0.0011) +[2023-10-09 06:51:15,749][60144] Updated weights for policy 1, policy_version 66042 (0.0007) +[2023-10-09 06:51:16,037][60143] Updated weights for policy 0, policy_version 65312 (0.0010) +[2023-10-09 06:51:16,052][59242] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134512640. Throughput: 0: 1701.0, 1: 1690.5. Samples: 33631660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:16,052][59242] Avg episode reward: [(0, '33.900'), (1, '31.560')] +[2023-10-09 06:51:19,702][60144] Updated weights for policy 1, policy_version 66052 (0.0007) +[2023-10-09 06:51:20,034][60143] Updated weights for policy 0, policy_version 65322 (0.0008) +[2023-10-09 06:51:20,075][60144] Updated weights for policy 1, policy_version 66062 (0.0009) +[2023-10-09 06:51:20,406][60143] Updated weights for policy 0, policy_version 65332 (0.0008) +[2023-10-09 06:51:20,440][60144] Updated weights for policy 1, policy_version 66072 (0.0008) +[2023-10-09 06:51:20,786][60143] Updated weights for policy 0, policy_version 65342 (0.0009) +[2023-10-09 06:51:21,052][59242] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134578176. Throughput: 0: 1709.0, 1: 1712.1. Samples: 33642360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:21,053][59242] Avg episode reward: [(0, '33.690'), (1, '31.500')] +[2023-10-09 06:51:24,514][60144] Updated weights for policy 1, policy_version 66082 (0.0007) +[2023-10-09 06:51:24,865][60143] Updated weights for policy 0, policy_version 65352 (0.0007) +[2023-10-09 06:51:24,880][60144] Updated weights for policy 1, policy_version 66092 (0.0007) +[2023-10-09 06:51:25,230][60143] Updated weights for policy 0, policy_version 65362 (0.0009) +[2023-10-09 06:51:25,243][60144] Updated weights for policy 1, policy_version 66102 (0.0009) +[2023-10-09 06:51:25,595][60143] Updated weights for policy 0, policy_version 65372 (0.0008) +[2023-10-09 06:51:25,604][60144] Updated weights for policy 1, policy_version 66112 (0.0007) +[2023-10-09 06:51:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 134643712. Throughput: 0: 1713.5, 1: 1710.9. Samples: 33663404. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:26,053][59242] Avg episode reward: [(0, '33.450'), (1, '32.470')] +[2023-10-09 06:51:29,576][60144] Updated weights for policy 1, policy_version 66122 (0.0009) +[2023-10-09 06:51:29,613][60143] Updated weights for policy 0, policy_version 65382 (0.0009) +[2023-10-09 06:51:29,945][60144] Updated weights for policy 1, policy_version 66132 (0.0009) +[2023-10-09 06:51:29,986][60143] Updated weights for policy 0, policy_version 65392 (0.0009) +[2023-10-09 06:51:30,315][60144] Updated weights for policy 1, policy_version 66142 (0.0008) +[2023-10-09 06:51:30,358][60143] Updated weights for policy 0, policy_version 65402 (0.0009) +[2023-10-09 06:51:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134709248. Throughput: 0: 1689.2, 1: 1688.7. Samples: 33682152. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:31,053][59242] Avg episode reward: [(0, '31.440'), (1, '33.000')] +[2023-10-09 06:51:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000065408_66977792.pth... +[2023-10-09 06:51:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000066144_67731456.pth... +[2023-10-09 06:51:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000063808_65339392.pth +[2023-10-09 06:51:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000064544_66093056.pth +[2023-10-09 06:51:34,314][60143] Updated weights for policy 0, policy_version 65412 (0.0009) +[2023-10-09 06:51:34,427][60144] Updated weights for policy 1, policy_version 66152 (0.0008) +[2023-10-09 06:51:34,696][60143] Updated weights for policy 0, policy_version 65422 (0.0009) +[2023-10-09 06:51:34,802][60144] Updated weights for policy 1, policy_version 66162 (0.0008) +[2023-10-09 06:51:35,076][60143] Updated weights for policy 0, policy_version 65432 (0.0008) +[2023-10-09 06:51:35,171][60144] Updated weights for policy 1, policy_version 66172 (0.0010) +[2023-10-09 06:51:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134774784. Throughput: 0: 1718.2, 1: 1718.6. Samples: 33694022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:36,053][59242] Avg episode reward: [(0, '30.930'), (1, '32.280')] +[2023-10-09 06:51:39,087][60143] Updated weights for policy 0, policy_version 65442 (0.0009) +[2023-10-09 06:51:39,092][60144] Updated weights for policy 1, policy_version 66182 (0.0007) +[2023-10-09 06:51:39,453][60143] Updated weights for policy 0, policy_version 65452 (0.0008) +[2023-10-09 06:51:39,453][60144] Updated weights for policy 1, policy_version 66192 (0.0008) +[2023-10-09 06:51:39,818][60144] Updated weights for policy 1, policy_version 66202 (0.0009) +[2023-10-09 06:51:39,825][60143] Updated weights for policy 0, policy_version 65462 (0.0007) +[2023-10-09 06:51:40,197][60143] Updated weights for policy 0, policy_version 65472 (0.0007) +[2023-10-09 06:51:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134840320. Throughput: 0: 1704.7, 1: 1699.5. Samples: 33713798. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:41,052][59242] Avg episode reward: [(0, '32.640'), (1, '33.360')] +[2023-10-09 06:51:43,846][60144] Updated weights for policy 1, policy_version 66212 (0.0007) +[2023-10-09 06:51:44,158][60143] Updated weights for policy 0, policy_version 65482 (0.0008) +[2023-10-09 06:51:44,212][60144] Updated weights for policy 1, policy_version 66222 (0.0007) +[2023-10-09 06:51:44,520][60143] Updated weights for policy 0, policy_version 65492 (0.0008) +[2023-10-09 06:51:44,576][60144] Updated weights for policy 1, policy_version 66232 (0.0007) +[2023-10-09 06:51:44,895][60143] Updated weights for policy 0, policy_version 65502 (0.0008) +[2023-10-09 06:51:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 134905856. Throughput: 0: 1690.0, 1: 1692.2. Samples: 33733690. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:51:46,053][59242] Avg episode reward: [(0, '34.300'), (1, '33.940')] +[2023-10-09 06:51:48,411][60144] Updated weights for policy 1, policy_version 66242 (0.0007) +[2023-10-09 06:51:48,777][60144] Updated weights for policy 1, policy_version 66252 (0.0009) +[2023-10-09 06:51:48,889][60143] Updated weights for policy 0, policy_version 65512 (0.0008) +[2023-10-09 06:51:49,140][60144] Updated weights for policy 1, policy_version 66262 (0.0007) +[2023-10-09 06:51:49,255][60143] Updated weights for policy 0, policy_version 65522 (0.0009) +[2023-10-09 06:51:49,502][60144] Updated weights for policy 1, policy_version 66272 (0.0008) +[2023-10-09 06:51:49,628][60143] Updated weights for policy 0, policy_version 65532 (0.0008) +[2023-10-09 06:51:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 134971392. Throughput: 0: 1712.4, 1: 1714.2. Samples: 33745326. Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:51:51,053][59242] Avg episode reward: [(0, '34.430'), (1, '36.020')] +[2023-10-09 06:51:51,054][60003] Saving new best policy, reward=36.020! +[2023-10-09 06:51:53,536][60144] Updated weights for policy 1, policy_version 66282 (0.0009) +[2023-10-09 06:51:53,769][60143] Updated weights for policy 0, policy_version 65542 (0.0008) +[2023-10-09 06:51:53,913][60144] Updated weights for policy 1, policy_version 66292 (0.0008) +[2023-10-09 06:51:54,149][60143] Updated weights for policy 0, policy_version 65552 (0.0009) +[2023-10-09 06:51:54,275][60144] Updated weights for policy 1, policy_version 66302 (0.0009) +[2023-10-09 06:51:54,516][60143] Updated weights for policy 0, policy_version 65562 (0.0009) +[2023-10-09 06:51:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 135036928. Throughput: 0: 1686.2, 1: 1692.3. Samples: 33764088. 
Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:51:56,053][59242] Avg episode reward: [(0, '33.950'), (1, '35.740')] +[2023-10-09 06:51:58,084][60144] Updated weights for policy 1, policy_version 66312 (0.0007) +[2023-10-09 06:51:58,428][60143] Updated weights for policy 0, policy_version 65572 (0.0009) +[2023-10-09 06:51:58,462][60144] Updated weights for policy 1, policy_version 66322 (0.0007) +[2023-10-09 06:51:58,793][60143] Updated weights for policy 0, policy_version 65582 (0.0007) +[2023-10-09 06:51:58,834][60144] Updated weights for policy 1, policy_version 66332 (0.0007) +[2023-10-09 06:51:59,169][60143] Updated weights for policy 0, policy_version 65592 (0.0008) +[2023-10-09 06:52:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 135102464. Throughput: 0: 1696.1, 1: 1713.6. Samples: 33785100. Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:01,052][59242] Avg episode reward: [(0, '34.530'), (1, '34.870')] +[2023-10-09 06:52:02,810][60144] Updated weights for policy 1, policy_version 66342 (0.0007) +[2023-10-09 06:52:03,030][60143] Updated weights for policy 0, policy_version 65602 (0.0008) +[2023-10-09 06:52:03,176][60144] Updated weights for policy 1, policy_version 66352 (0.0007) +[2023-10-09 06:52:03,395][60143] Updated weights for policy 0, policy_version 65612 (0.0008) +[2023-10-09 06:52:03,546][60144] Updated weights for policy 1, policy_version 66362 (0.0010) +[2023-10-09 06:52:03,780][60143] Updated weights for policy 0, policy_version 65622 (0.0008) +[2023-10-09 06:52:04,138][60143] Updated weights for policy 0, policy_version 65632 (0.0008) +[2023-10-09 06:52:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 135168000. Throughput: 0: 1699.5, 1: 1699.8. Samples: 33795330. 
Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:06,053][59242] Avg episode reward: [(0, '34.620'), (1, '35.230')] +[2023-10-09 06:52:07,482][60144] Updated weights for policy 1, policy_version 66372 (0.0008) +[2023-10-09 06:52:07,856][60144] Updated weights for policy 1, policy_version 66382 (0.0009) +[2023-10-09 06:52:08,093][60143] Updated weights for policy 0, policy_version 65642 (0.0010) +[2023-10-09 06:52:08,217][60144] Updated weights for policy 1, policy_version 66392 (0.0009) +[2023-10-09 06:52:08,467][60143] Updated weights for policy 0, policy_version 65652 (0.0008) +[2023-10-09 06:52:08,838][60143] Updated weights for policy 0, policy_version 65662 (0.0009) +[2023-10-09 06:52:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 135233536. Throughput: 0: 1686.4, 1: 1696.1. Samples: 33815618. Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:11,052][59242] Avg episode reward: [(0, '33.820'), (1, '34.200')] +[2023-10-09 06:52:12,204][60144] Updated weights for policy 1, policy_version 66402 (0.0007) +[2023-10-09 06:52:12,573][60144] Updated weights for policy 1, policy_version 66412 (0.0007) +[2023-10-09 06:52:12,937][60144] Updated weights for policy 1, policy_version 66422 (0.0009) +[2023-10-09 06:52:12,967][60143] Updated weights for policy 0, policy_version 65672 (0.0008) +[2023-10-09 06:52:13,298][60144] Updated weights for policy 1, policy_version 66432 (0.0007) +[2023-10-09 06:52:13,338][60143] Updated weights for policy 0, policy_version 65682 (0.0009) +[2023-10-09 06:52:13,698][60143] Updated weights for policy 0, policy_version 65692 (0.0010) +[2023-10-09 06:52:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 135299072. Throughput: 0: 1712.0, 1: 1728.0. Samples: 33836956. 
Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:16,053][59242] Avg episode reward: [(0, '33.250'), (1, '33.880')] +[2023-10-09 06:52:17,287][60144] Updated weights for policy 1, policy_version 66442 (0.0010) +[2023-10-09 06:52:17,656][60144] Updated weights for policy 1, policy_version 66452 (0.0009) +[2023-10-09 06:52:17,778][60143] Updated weights for policy 0, policy_version 65702 (0.0009) +[2023-10-09 06:52:18,022][60144] Updated weights for policy 1, policy_version 66462 (0.0010) +[2023-10-09 06:52:18,144][60143] Updated weights for policy 0, policy_version 65712 (0.0007) +[2023-10-09 06:52:18,510][60143] Updated weights for policy 0, policy_version 65722 (0.0009) +[2023-10-09 06:52:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 135364608. Throughput: 0: 1687.3, 1: 1699.2. Samples: 33846414. Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:21,053][59242] Avg episode reward: [(0, '34.070'), (1, '33.810')] +[2023-10-09 06:52:21,892][60144] Updated weights for policy 1, policy_version 66472 (0.0010) +[2023-10-09 06:52:22,262][60144] Updated weights for policy 1, policy_version 66482 (0.0011) +[2023-10-09 06:52:22,475][60143] Updated weights for policy 0, policy_version 65732 (0.0007) +[2023-10-09 06:52:22,628][60144] Updated weights for policy 1, policy_version 66492 (0.0007) +[2023-10-09 06:52:22,839][60143] Updated weights for policy 0, policy_version 65742 (0.0008) +[2023-10-09 06:52:23,204][60143] Updated weights for policy 0, policy_version 65752 (0.0009) +[2023-10-09 06:52:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135430144. Throughput: 0: 1691.3, 1: 1723.5. Samples: 33867462. 
Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:26,053][59242] Avg episode reward: [(0, '33.950'), (1, '33.180')] +[2023-10-09 06:52:26,594][60144] Updated weights for policy 1, policy_version 66502 (0.0008) +[2023-10-09 06:52:26,965][60144] Updated weights for policy 1, policy_version 66512 (0.0007) +[2023-10-09 06:52:27,294][60143] Updated weights for policy 0, policy_version 65762 (0.0007) +[2023-10-09 06:52:27,338][60144] Updated weights for policy 1, policy_version 66522 (0.0007) +[2023-10-09 06:52:27,666][60143] Updated weights for policy 0, policy_version 65772 (0.0007) +[2023-10-09 06:52:28,041][60143] Updated weights for policy 0, policy_version 65782 (0.0007) +[2023-10-09 06:52:28,403][60143] Updated weights for policy 0, policy_version 65792 (0.0007) +[2023-10-09 06:52:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135495680. Throughput: 0: 1708.7, 1: 1741.4. Samples: 33888944. Policy #0 lag: (min: 5.0, avg: 8.9, max: 37.0) +[2023-10-09 06:52:31,053][59242] Avg episode reward: [(0, '34.430'), (1, '35.080')] +[2023-10-09 06:52:31,119][60144] Updated weights for policy 1, policy_version 66532 (0.0007) +[2023-10-09 06:52:31,482][60144] Updated weights for policy 1, policy_version 66542 (0.0011) +[2023-10-09 06:52:31,848][60144] Updated weights for policy 1, policy_version 66552 (0.0009) +[2023-10-09 06:52:32,363][60143] Updated weights for policy 0, policy_version 65802 (0.0007) +[2023-10-09 06:52:32,735][60143] Updated weights for policy 0, policy_version 65812 (0.0009) +[2023-10-09 06:52:33,097][60143] Updated weights for policy 0, policy_version 65822 (0.0008) +[2023-10-09 06:52:35,900][60144] Updated weights for policy 1, policy_version 66562 (0.0007) +[2023-10-09 06:52:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135561216. Throughput: 0: 1678.8, 1: 1717.3. Samples: 33898154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:52:36,053][59242] Avg episode reward: [(0, '35.300'), (1, '35.590')] +[2023-10-09 06:52:36,267][60144] Updated weights for policy 1, policy_version 66572 (0.0007) +[2023-10-09 06:52:36,636][60144] Updated weights for policy 1, policy_version 66582 (0.0008) +[2023-10-09 06:52:36,999][60144] Updated weights for policy 1, policy_version 66592 (0.0008) +[2023-10-09 06:52:37,041][60143] Updated weights for policy 0, policy_version 65832 (0.0008) +[2023-10-09 06:52:37,423][60143] Updated weights for policy 0, policy_version 65842 (0.0007) +[2023-10-09 06:52:37,797][60143] Updated weights for policy 0, policy_version 65852 (0.0008) +[2023-10-09 06:52:40,941][60144] Updated weights for policy 1, policy_version 66602 (0.0009) +[2023-10-09 06:52:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135626752. Throughput: 0: 1710.1, 1: 1741.5. Samples: 33919408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:52:41,053][59242] Avg episode reward: [(0, '35.700'), (1, '33.920')] +[2023-10-09 06:52:41,304][60144] Updated weights for policy 1, policy_version 66612 (0.0008) +[2023-10-09 06:52:41,641][60143] Updated weights for policy 0, policy_version 65862 (0.0009) +[2023-10-09 06:52:41,665][60144] Updated weights for policy 1, policy_version 66622 (0.0008) +[2023-10-09 06:52:42,017][60143] Updated weights for policy 0, policy_version 65872 (0.0008) +[2023-10-09 06:52:42,390][60143] Updated weights for policy 0, policy_version 65882 (0.0009) +[2023-10-09 06:52:45,736][60144] Updated weights for policy 1, policy_version 66632 (0.0009) +[2023-10-09 06:52:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 135692288. Throughput: 0: 1717.2, 1: 1730.8. Samples: 33940260. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:52:46,053][59242] Avg episode reward: [(0, '35.660'), (1, '33.660')] +[2023-10-09 06:52:46,107][60144] Updated weights for policy 1, policy_version 66642 (0.0009) +[2023-10-09 06:52:46,330][60143] Updated weights for policy 0, policy_version 65892 (0.0009) +[2023-10-09 06:52:46,482][60144] Updated weights for policy 1, policy_version 66652 (0.0009) +[2023-10-09 06:52:46,696][60143] Updated weights for policy 0, policy_version 65902 (0.0008) +[2023-10-09 06:52:47,076][60143] Updated weights for policy 0, policy_version 65912 (0.0007) +[2023-10-09 06:52:50,327][60144] Updated weights for policy 1, policy_version 66662 (0.0009) +[2023-10-09 06:52:50,704][60144] Updated weights for policy 1, policy_version 66672 (0.0010) +[2023-10-09 06:52:51,034][60143] Updated weights for policy 0, policy_version 65922 (0.0008) +[2023-10-09 06:52:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135757824. Throughput: 0: 1698.9, 1: 1728.8. Samples: 33949580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:52:51,053][59242] Avg episode reward: [(0, '35.790'), (1, '34.030')] +[2023-10-09 06:52:51,067][60144] Updated weights for policy 1, policy_version 66682 (0.0009) +[2023-10-09 06:52:51,404][60143] Updated weights for policy 0, policy_version 65932 (0.0009) +[2023-10-09 06:52:51,775][60143] Updated weights for policy 0, policy_version 65942 (0.0009) +[2023-10-09 06:52:52,147][60143] Updated weights for policy 0, policy_version 65952 (0.0008) +[2023-10-09 06:52:55,123][60144] Updated weights for policy 1, policy_version 66692 (0.0009) +[2023-10-09 06:52:55,490][60144] Updated weights for policy 1, policy_version 66702 (0.0011) +[2023-10-09 06:52:55,861][60144] Updated weights for policy 1, policy_version 66712 (0.0007) +[2023-10-09 06:52:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 135823360. 
Throughput: 0: 1714.4, 1: 1735.7. Samples: 33970876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:52:56,053][59242] Avg episode reward: [(0, '35.590'), (1, '32.690')] +[2023-10-09 06:52:56,194][60143] Updated weights for policy 0, policy_version 65962 (0.0008) +[2023-10-09 06:52:56,566][60143] Updated weights for policy 0, policy_version 65972 (0.0008) +[2023-10-09 06:52:56,930][60143] Updated weights for policy 0, policy_version 65982 (0.0008) +[2023-10-09 06:52:59,792][60144] Updated weights for policy 1, policy_version 66722 (0.0008) +[2023-10-09 06:53:00,167][60144] Updated weights for policy 1, policy_version 66732 (0.0010) +[2023-10-09 06:53:00,534][60144] Updated weights for policy 1, policy_version 66742 (0.0007) +[2023-10-09 06:53:00,895][60144] Updated weights for policy 1, policy_version 66752 (0.0007) +[2023-10-09 06:53:01,011][60143] Updated weights for policy 0, policy_version 65992 (0.0009) +[2023-10-09 06:53:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 135921664. Throughput: 0: 1717.4, 1: 1712.9. Samples: 33991320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:53:01,053][59242] Avg episode reward: [(0, '36.690'), (1, '31.380')] +[2023-10-09 06:53:01,377][60143] Updated weights for policy 0, policy_version 66002 (0.0011) +[2023-10-09 06:53:01,754][60143] Updated weights for policy 0, policy_version 66012 (0.0008) +[2023-10-09 06:53:04,817][60144] Updated weights for policy 1, policy_version 66762 (0.0008) +[2023-10-09 06:53:05,181][60144] Updated weights for policy 1, policy_version 66772 (0.0009) +[2023-10-09 06:53:05,551][60144] Updated weights for policy 1, policy_version 66782 (0.0009) +[2023-10-09 06:53:05,861][60143] Updated weights for policy 0, policy_version 66022 (0.0008) +[2023-10-09 06:53:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 135987200. Throughput: 0: 1709.1, 1: 1737.1. Samples: 34001492. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:53:06,053][59242] Avg episode reward: [(0, '36.150'), (1, '31.350')] +[2023-10-09 06:53:06,236][60143] Updated weights for policy 0, policy_version 66032 (0.0008) +[2023-10-09 06:53:06,605][60143] Updated weights for policy 0, policy_version 66042 (0.0007) +[2023-10-09 06:53:09,404][60144] Updated weights for policy 1, policy_version 66792 (0.0008) +[2023-10-09 06:53:09,778][60144] Updated weights for policy 1, policy_version 66802 (0.0007) +[2023-10-09 06:53:10,150][60144] Updated weights for policy 1, policy_version 66812 (0.0009) +[2023-10-09 06:53:10,488][60143] Updated weights for policy 0, policy_version 66052 (0.0009) +[2023-10-09 06:53:10,859][60143] Updated weights for policy 0, policy_version 66062 (0.0009) +[2023-10-09 06:53:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 136052736. Throughput: 0: 1717.1, 1: 1720.3. Samples: 34022144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:53:11,053][59242] Avg episode reward: [(0, '37.560'), (1, '29.250')] +[2023-10-09 06:53:11,224][60143] Updated weights for policy 0, policy_version 66072 (0.0008) +[2023-10-09 06:53:14,026][60144] Updated weights for policy 1, policy_version 66822 (0.0009) +[2023-10-09 06:53:14,390][60144] Updated weights for policy 1, policy_version 66832 (0.0007) +[2023-10-09 06:53:14,762][60144] Updated weights for policy 1, policy_version 66842 (0.0009) +[2023-10-09 06:53:15,409][60143] Updated weights for policy 0, policy_version 66082 (0.0007) +[2023-10-09 06:53:15,780][60143] Updated weights for policy 0, policy_version 66092 (0.0010) +[2023-10-09 06:53:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 136118272. Throughput: 0: 1704.4, 1: 1700.4. Samples: 34042162. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:53:16,053][59242] Avg episode reward: [(0, '37.060'), (1, '30.410')] +[2023-10-09 06:53:16,137][60143] Updated weights for policy 0, policy_version 66102 (0.0009) +[2023-10-09 06:53:16,510][60143] Updated weights for policy 0, policy_version 66112 (0.0009) +[2023-10-09 06:53:18,727][60144] Updated weights for policy 1, policy_version 66852 (0.0010) +[2023-10-09 06:53:19,087][60144] Updated weights for policy 1, policy_version 66862 (0.0008) +[2023-10-09 06:53:19,454][60144] Updated weights for policy 1, policy_version 66872 (0.0007) +[2023-10-09 06:53:20,412][60143] Updated weights for policy 0, policy_version 66122 (0.0010) +[2023-10-09 06:53:20,777][60143] Updated weights for policy 0, policy_version 66132 (0.0010) +[2023-10-09 06:53:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 136183808. Throughput: 0: 1708.6, 1: 1732.0. Samples: 34052978. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:21,053][59242] Avg episode reward: [(0, '36.800'), (1, '31.510')] +[2023-10-09 06:53:21,155][60143] Updated weights for policy 0, policy_version 66142 (0.0010) +[2023-10-09 06:53:23,145][60144] Updated weights for policy 1, policy_version 66882 (0.0007) +[2023-10-09 06:53:23,527][60144] Updated weights for policy 1, policy_version 66892 (0.0008) +[2023-10-09 06:53:23,901][60144] Updated weights for policy 1, policy_version 66902 (0.0009) +[2023-10-09 06:53:24,263][60144] Updated weights for policy 1, policy_version 66912 (0.0011) +[2023-10-09 06:53:25,288][60143] Updated weights for policy 0, policy_version 66152 (0.0008) +[2023-10-09 06:53:25,665][60143] Updated weights for policy 0, policy_version 66162 (0.0010) +[2023-10-09 06:53:26,034][60143] Updated weights for policy 0, policy_version 66172 (0.0009) +[2023-10-09 06:53:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 136249344. 
Throughput: 0: 1703.2, 1: 1706.8. Samples: 34072862. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:26,052][59242] Avg episode reward: [(0, '35.950'), (1, '30.570')] +[2023-10-09 06:53:28,148][60144] Updated weights for policy 1, policy_version 66922 (0.0007) +[2023-10-09 06:53:28,514][60144] Updated weights for policy 1, policy_version 66932 (0.0007) +[2023-10-09 06:53:28,887][60144] Updated weights for policy 1, policy_version 66942 (0.0007) +[2023-10-09 06:53:30,179][60143] Updated weights for policy 0, policy_version 66182 (0.0008) +[2023-10-09 06:53:30,567][60143] Updated weights for policy 0, policy_version 66192 (0.0009) +[2023-10-09 06:53:30,937][60143] Updated weights for policy 0, policy_version 66202 (0.0009) +[2023-10-09 06:53:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 136314880. Throughput: 0: 1684.4, 1: 1724.4. Samples: 34093654. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:31,053][59242] Avg episode reward: [(0, '35.990'), (1, '30.770')] +[2023-10-09 06:53:31,059][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000066944_68550656.pth... +[2023-10-09 06:53:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000065344_66912256.pth +[2023-10-09 06:53:31,155][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000066208_67796992.pth... 
+[2023-10-09 06:53:31,196][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000064608_66158592.pth +[2023-10-09 06:53:32,843][60144] Updated weights for policy 1, policy_version 66952 (0.0009) +[2023-10-09 06:53:33,198][60144] Updated weights for policy 1, policy_version 66962 (0.0008) +[2023-10-09 06:53:33,575][60144] Updated weights for policy 1, policy_version 66972 (0.0008) +[2023-10-09 06:53:34,982][60143] Updated weights for policy 0, policy_version 66212 (0.0009) +[2023-10-09 06:53:35,353][60143] Updated weights for policy 0, policy_version 66222 (0.0010) +[2023-10-09 06:53:35,721][60143] Updated weights for policy 0, policy_version 66232 (0.0010) +[2023-10-09 06:53:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 136413184. Throughput: 0: 1696.2, 1: 1726.6. Samples: 34103604. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:36,053][59242] Avg episode reward: [(0, '34.840'), (1, '30.270')] +[2023-10-09 06:53:37,583][60144] Updated weights for policy 1, policy_version 66982 (0.0008) +[2023-10-09 06:53:37,948][60144] Updated weights for policy 1, policy_version 66992 (0.0008) +[2023-10-09 06:53:38,320][60144] Updated weights for policy 1, policy_version 67002 (0.0008) +[2023-10-09 06:53:39,837][60143] Updated weights for policy 0, policy_version 66242 (0.0008) +[2023-10-09 06:53:40,207][60143] Updated weights for policy 0, policy_version 66252 (0.0010) +[2023-10-09 06:53:40,585][60143] Updated weights for policy 0, policy_version 66262 (0.0007) +[2023-10-09 06:53:40,952][60143] Updated weights for policy 0, policy_version 66272 (0.0011) +[2023-10-09 06:53:41,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 136478720. Throughput: 0: 1694.7, 1: 1722.8. Samples: 34124664. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:41,053][59242] Avg episode reward: [(0, '35.520'), (1, '30.860')] +[2023-10-09 06:53:42,299][60144] Updated weights for policy 1, policy_version 67012 (0.0007) +[2023-10-09 06:53:42,666][60144] Updated weights for policy 1, policy_version 67022 (0.0009) +[2023-10-09 06:53:43,032][60144] Updated weights for policy 1, policy_version 67032 (0.0010) +[2023-10-09 06:53:44,962][60143] Updated weights for policy 0, policy_version 66282 (0.0009) +[2023-10-09 06:53:45,334][60143] Updated weights for policy 0, policy_version 66292 (0.0010) +[2023-10-09 06:53:45,703][60143] Updated weights for policy 0, policy_version 66302 (0.0008) +[2023-10-09 06:53:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 136544256. Throughput: 0: 1669.1, 1: 1746.6. Samples: 34145026. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:46,052][59242] Avg episode reward: [(0, '35.040'), (1, '32.510')] +[2023-10-09 06:53:46,923][60144] Updated weights for policy 1, policy_version 67042 (0.0007) +[2023-10-09 06:53:47,287][60144] Updated weights for policy 1, policy_version 67052 (0.0007) +[2023-10-09 06:53:47,654][60144] Updated weights for policy 1, policy_version 67062 (0.0007) +[2023-10-09 06:53:48,024][60144] Updated weights for policy 1, policy_version 67072 (0.0007) +[2023-10-09 06:53:49,632][60143] Updated weights for policy 0, policy_version 66312 (0.0009) +[2023-10-09 06:53:49,993][60143] Updated weights for policy 0, policy_version 66322 (0.0007) +[2023-10-09 06:53:50,370][60143] Updated weights for policy 0, policy_version 66332 (0.0008) +[2023-10-09 06:53:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 136609792. Throughput: 0: 1696.0, 1: 1722.8. Samples: 34155338. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:51,053][59242] Avg episode reward: [(0, '33.960'), (1, '30.860')] +[2023-10-09 06:53:52,024][60144] Updated weights for policy 1, policy_version 67082 (0.0008) +[2023-10-09 06:53:52,392][60144] Updated weights for policy 1, policy_version 67092 (0.0009) +[2023-10-09 06:53:52,757][60144] Updated weights for policy 1, policy_version 67102 (0.0008) +[2023-10-09 06:53:54,164][60143] Updated weights for policy 0, policy_version 66342 (0.0008) +[2023-10-09 06:53:54,543][60143] Updated weights for policy 0, policy_version 66352 (0.0009) +[2023-10-09 06:53:54,917][60143] Updated weights for policy 0, policy_version 66362 (0.0008) +[2023-10-09 06:53:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 136675328. Throughput: 0: 1685.6, 1: 1733.1. Samples: 34175986. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 06:53:56,053][59242] Avg episode reward: [(0, '34.330'), (1, '29.600')] +[2023-10-09 06:53:56,826][60144] Updated weights for policy 1, policy_version 67112 (0.0007) +[2023-10-09 06:53:57,189][60144] Updated weights for policy 1, policy_version 67122 (0.0008) +[2023-10-09 06:53:57,571][60144] Updated weights for policy 1, policy_version 67132 (0.0008) +[2023-10-09 06:53:58,840][60143] Updated weights for policy 0, policy_version 66372 (0.0008) +[2023-10-09 06:53:59,205][60143] Updated weights for policy 0, policy_version 66382 (0.0008) +[2023-10-09 06:53:59,573][60143] Updated weights for policy 0, policy_version 66392 (0.0007) +[2023-10-09 06:54:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 136740864. Throughput: 0: 1681.7, 1: 1744.0. Samples: 34196322. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:01,053][59242] Avg episode reward: [(0, '36.440'), (1, '29.590')] +[2023-10-09 06:54:01,500][60144] Updated weights for policy 1, policy_version 67142 (0.0007) +[2023-10-09 06:54:01,868][60144] Updated weights for policy 1, policy_version 67152 (0.0010) +[2023-10-09 06:54:02,232][60144] Updated weights for policy 1, policy_version 67162 (0.0011) +[2023-10-09 06:54:03,526][60143] Updated weights for policy 0, policy_version 66402 (0.0008) +[2023-10-09 06:54:03,902][60143] Updated weights for policy 0, policy_version 66412 (0.0009) +[2023-10-09 06:54:04,265][60143] Updated weights for policy 0, policy_version 66422 (0.0008) +[2023-10-09 06:54:04,633][60143] Updated weights for policy 0, policy_version 66432 (0.0009) +[2023-10-09 06:54:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 136806400. Throughput: 0: 1709.0, 1: 1713.2. Samples: 34206974. Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:06,053][59242] Avg episode reward: [(0, '35.500'), (1, '29.750')] +[2023-10-09 06:54:06,220][60144] Updated weights for policy 1, policy_version 67172 (0.0007) +[2023-10-09 06:54:06,595][60144] Updated weights for policy 1, policy_version 67182 (0.0007) +[2023-10-09 06:54:06,971][60144] Updated weights for policy 1, policy_version 67192 (0.0008) +[2023-10-09 06:54:08,566][60143] Updated weights for policy 0, policy_version 66442 (0.0009) +[2023-10-09 06:54:08,939][60143] Updated weights for policy 0, policy_version 66452 (0.0008) +[2023-10-09 06:54:09,307][60143] Updated weights for policy 0, policy_version 66462 (0.0010) +[2023-10-09 06:54:10,719][60144] Updated weights for policy 1, policy_version 67202 (0.0007) +[2023-10-09 06:54:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 136871936. Throughput: 0: 1685.4, 1: 1738.9. Samples: 34226954. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:11,052][59242] Avg episode reward: [(0, '35.310'), (1, '29.220')] +[2023-10-09 06:54:11,099][60144] Updated weights for policy 1, policy_version 67212 (0.0008) +[2023-10-09 06:54:11,474][60144] Updated weights for policy 1, policy_version 67222 (0.0008) +[2023-10-09 06:54:11,834][60144] Updated weights for policy 1, policy_version 67232 (0.0007) +[2023-10-09 06:54:13,301][60143] Updated weights for policy 0, policy_version 66472 (0.0009) +[2023-10-09 06:54:13,675][60143] Updated weights for policy 0, policy_version 66482 (0.0007) +[2023-10-09 06:54:14,042][60143] Updated weights for policy 0, policy_version 66492 (0.0009) +[2023-10-09 06:54:15,837][60144] Updated weights for policy 1, policy_version 67242 (0.0010) +[2023-10-09 06:54:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 136937472. Throughput: 0: 1699.3, 1: 1732.4. Samples: 34248084. Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:16,052][59242] Avg episode reward: [(0, '34.460'), (1, '29.740')] +[2023-10-09 06:54:16,210][60144] Updated weights for policy 1, policy_version 67252 (0.0007) +[2023-10-09 06:54:16,574][60144] Updated weights for policy 1, policy_version 67262 (0.0007) +[2023-10-09 06:54:17,976][60143] Updated weights for policy 0, policy_version 66502 (0.0008) +[2023-10-09 06:54:18,348][60143] Updated weights for policy 0, policy_version 66512 (0.0008) +[2023-10-09 06:54:18,728][60143] Updated weights for policy 0, policy_version 66522 (0.0011) +[2023-10-09 06:54:20,387][60144] Updated weights for policy 1, policy_version 67272 (0.0008) +[2023-10-09 06:54:20,760][60144] Updated weights for policy 1, policy_version 67282 (0.0009) +[2023-10-09 06:54:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 137003008. Throughput: 0: 1702.4, 1: 1727.8. Samples: 34257964. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:21,053][59242] Avg episode reward: [(0, '34.800'), (1, '29.310')] +[2023-10-09 06:54:21,122][60144] Updated weights for policy 1, policy_version 67292 (0.0010) +[2023-10-09 06:54:22,785][60143] Updated weights for policy 0, policy_version 66532 (0.0007) +[2023-10-09 06:54:23,163][60143] Updated weights for policy 0, policy_version 66542 (0.0007) +[2023-10-09 06:54:23,524][60143] Updated weights for policy 0, policy_version 66552 (0.0007) +[2023-10-09 06:54:25,144][60144] Updated weights for policy 1, policy_version 67302 (0.0008) +[2023-10-09 06:54:25,523][60144] Updated weights for policy 1, policy_version 67312 (0.0009) +[2023-10-09 06:54:25,884][60144] Updated weights for policy 1, policy_version 67322 (0.0009) +[2023-10-09 06:54:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 137068544. Throughput: 0: 1686.1, 1: 1731.6. Samples: 34278460. Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:26,053][59242] Avg episode reward: [(0, '34.480'), (1, '31.710')] +[2023-10-09 06:54:27,478][60143] Updated weights for policy 0, policy_version 66562 (0.0008) +[2023-10-09 06:54:27,846][60143] Updated weights for policy 0, policy_version 66572 (0.0008) +[2023-10-09 06:54:28,226][60143] Updated weights for policy 0, policy_version 66582 (0.0010) +[2023-10-09 06:54:28,597][60143] Updated weights for policy 0, policy_version 66592 (0.0009) +[2023-10-09 06:54:29,873][60144] Updated weights for policy 1, policy_version 67332 (0.0008) +[2023-10-09 06:54:30,235][60144] Updated weights for policy 1, policy_version 67342 (0.0007) +[2023-10-09 06:54:30,608][60144] Updated weights for policy 1, policy_version 67352 (0.0007) +[2023-10-09 06:54:31,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 137166848. Throughput: 0: 1715.7, 1: 1706.3. Samples: 34299016. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:31,052][59242] Avg episode reward: [(0, '32.380'), (1, '30.430')] +[2023-10-09 06:54:32,645][60143] Updated weights for policy 0, policy_version 66602 (0.0008) +[2023-10-09 06:54:33,005][60143] Updated weights for policy 0, policy_version 66612 (0.0010) +[2023-10-09 06:54:33,377][60143] Updated weights for policy 0, policy_version 66622 (0.0007) +[2023-10-09 06:54:34,610][60144] Updated weights for policy 1, policy_version 67362 (0.0007) +[2023-10-09 06:54:34,983][60144] Updated weights for policy 1, policy_version 67372 (0.0008) +[2023-10-09 06:54:35,362][60144] Updated weights for policy 1, policy_version 67382 (0.0009) +[2023-10-09 06:54:35,726][60144] Updated weights for policy 1, policy_version 67392 (0.0008) +[2023-10-09 06:54:36,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 137232384. Throughput: 0: 1692.2, 1: 1728.0. Samples: 34309246. Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:36,052][59242] Avg episode reward: [(0, '32.620'), (1, '30.030')] +[2023-10-09 06:54:37,356][60143] Updated weights for policy 0, policy_version 66632 (0.0008) +[2023-10-09 06:54:37,718][60143] Updated weights for policy 0, policy_version 66642 (0.0010) +[2023-10-09 06:54:38,090][60143] Updated weights for policy 0, policy_version 66652 (0.0010) +[2023-10-09 06:54:39,462][60144] Updated weights for policy 1, policy_version 67402 (0.0009) +[2023-10-09 06:54:39,821][60144] Updated weights for policy 1, policy_version 67412 (0.0009) +[2023-10-09 06:54:40,189][60144] Updated weights for policy 1, policy_version 67422 (0.0008) +[2023-10-09 06:54:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137297920. Throughput: 0: 1701.0, 1: 1722.8. Samples: 34330054. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 30.0) +[2023-10-09 06:54:41,053][59242] Avg episode reward: [(0, '31.380'), (1, '29.370')] +[2023-10-09 06:54:42,009][60143] Updated weights for policy 0, policy_version 66662 (0.0010) +[2023-10-09 06:54:42,383][60143] Updated weights for policy 0, policy_version 66672 (0.0010) +[2023-10-09 06:54:42,759][60143] Updated weights for policy 0, policy_version 66682 (0.0007) +[2023-10-09 06:54:44,048][60144] Updated weights for policy 1, policy_version 67432 (0.0009) +[2023-10-09 06:54:44,423][60144] Updated weights for policy 1, policy_version 67442 (0.0009) +[2023-10-09 06:54:44,786][60144] Updated weights for policy 1, policy_version 67452 (0.0008) +[2023-10-09 06:54:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137363456. Throughput: 0: 1726.0, 1: 1710.8. Samples: 34350980. Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:54:46,053][59242] Avg episode reward: [(0, '31.400'), (1, '29.520')] +[2023-10-09 06:54:46,761][60143] Updated weights for policy 0, policy_version 66692 (0.0007) +[2023-10-09 06:54:47,131][60143] Updated weights for policy 0, policy_version 66702 (0.0007) +[2023-10-09 06:54:47,497][60143] Updated weights for policy 0, policy_version 66712 (0.0007) +[2023-10-09 06:54:48,777][60144] Updated weights for policy 1, policy_version 67462 (0.0007) +[2023-10-09 06:54:49,142][60144] Updated weights for policy 1, policy_version 67472 (0.0007) +[2023-10-09 06:54:49,507][60144] Updated weights for policy 1, policy_version 67482 (0.0010) +[2023-10-09 06:54:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137428992. Throughput: 0: 1691.4, 1: 1743.2. Samples: 34361528. 
Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:54:51,053][59242] Avg episode reward: [(0, '31.650'), (1, '30.650')] +[2023-10-09 06:54:51,459][60143] Updated weights for policy 0, policy_version 66722 (0.0010) +[2023-10-09 06:54:51,831][60143] Updated weights for policy 0, policy_version 66732 (0.0011) +[2023-10-09 06:54:52,195][60143] Updated weights for policy 0, policy_version 66742 (0.0007) +[2023-10-09 06:54:52,566][60143] Updated weights for policy 0, policy_version 66752 (0.0007) +[2023-10-09 06:54:53,438][60144] Updated weights for policy 1, policy_version 67492 (0.0008) +[2023-10-09 06:54:53,810][60144] Updated weights for policy 1, policy_version 67502 (0.0008) +[2023-10-09 06:54:54,168][60144] Updated weights for policy 1, policy_version 67512 (0.0009) +[2023-10-09 06:54:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 137494528. Throughput: 0: 1718.4, 1: 1719.1. Samples: 34381642. Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:54:56,052][59242] Avg episode reward: [(0, '31.530'), (1, '31.120')] +[2023-10-09 06:54:56,679][60143] Updated weights for policy 0, policy_version 66762 (0.0007) +[2023-10-09 06:54:57,046][60143] Updated weights for policy 0, policy_version 66772 (0.0010) +[2023-10-09 06:54:57,410][60143] Updated weights for policy 0, policy_version 66782 (0.0008) +[2023-10-09 06:54:58,144][60144] Updated weights for policy 1, policy_version 67522 (0.0008) +[2023-10-09 06:54:58,504][60144] Updated weights for policy 1, policy_version 67532 (0.0009) +[2023-10-09 06:54:58,874][60144] Updated weights for policy 1, policy_version 67542 (0.0008) +[2023-10-09 06:54:59,240][60144] Updated weights for policy 1, policy_version 67552 (0.0010) +[2023-10-09 06:55:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 137560064. Throughput: 0: 1719.6, 1: 1723.7. Samples: 34403034. 
Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:01,053][59242] Avg episode reward: [(0, '33.230'), (1, '31.090')] +[2023-10-09 06:55:01,451][60143] Updated weights for policy 0, policy_version 66792 (0.0009) +[2023-10-09 06:55:01,825][60143] Updated weights for policy 0, policy_version 66802 (0.0009) +[2023-10-09 06:55:02,187][60143] Updated weights for policy 0, policy_version 66812 (0.0009) +[2023-10-09 06:55:03,389][60144] Updated weights for policy 1, policy_version 67562 (0.0008) +[2023-10-09 06:55:03,760][60144] Updated weights for policy 1, policy_version 67572 (0.0008) +[2023-10-09 06:55:04,132][60144] Updated weights for policy 1, policy_version 67582 (0.0010) +[2023-10-09 06:55:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137625600. Throughput: 0: 1707.8, 1: 1734.8. Samples: 34412882. Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:06,053][59242] Avg episode reward: [(0, '33.110'), (1, '31.250')] +[2023-10-09 06:55:06,154][60143] Updated weights for policy 0, policy_version 66822 (0.0009) +[2023-10-09 06:55:06,517][60143] Updated weights for policy 0, policy_version 66832 (0.0009) +[2023-10-09 06:55:06,881][60143] Updated weights for policy 0, policy_version 66842 (0.0007) +[2023-10-09 06:55:07,998][60144] Updated weights for policy 1, policy_version 67592 (0.0009) +[2023-10-09 06:55:08,367][60144] Updated weights for policy 1, policy_version 67602 (0.0011) +[2023-10-09 06:55:08,734][60144] Updated weights for policy 1, policy_version 67612 (0.0009) +[2023-10-09 06:55:10,812][60143] Updated weights for policy 0, policy_version 66852 (0.0008) +[2023-10-09 06:55:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137691136. Throughput: 0: 1726.2, 1: 1718.7. Samples: 34433482. 
Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:11,053][59242] Avg episode reward: [(0, '31.360'), (1, '31.120')] +[2023-10-09 06:55:11,193][60143] Updated weights for policy 0, policy_version 66862 (0.0009) +[2023-10-09 06:55:11,564][60143] Updated weights for policy 0, policy_version 66872 (0.0008) +[2023-10-09 06:55:12,632][60144] Updated weights for policy 1, policy_version 67622 (0.0007) +[2023-10-09 06:55:12,994][60144] Updated weights for policy 1, policy_version 67632 (0.0009) +[2023-10-09 06:55:13,372][60144] Updated weights for policy 1, policy_version 67642 (0.0007) +[2023-10-09 06:55:15,535][60143] Updated weights for policy 0, policy_version 66882 (0.0007) +[2023-10-09 06:55:15,906][60143] Updated weights for policy 0, policy_version 66892 (0.0009) +[2023-10-09 06:55:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137756672. Throughput: 0: 1716.8, 1: 1735.7. Samples: 34454382. Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:16,053][59242] Avg episode reward: [(0, '33.300'), (1, '30.550')] +[2023-10-09 06:55:16,277][60143] Updated weights for policy 0, policy_version 66902 (0.0007) +[2023-10-09 06:55:16,643][60143] Updated weights for policy 0, policy_version 66912 (0.0009) +[2023-10-09 06:55:17,283][60144] Updated weights for policy 1, policy_version 67652 (0.0008) +[2023-10-09 06:55:17,644][60144] Updated weights for policy 1, policy_version 67662 (0.0007) +[2023-10-09 06:55:18,011][60144] Updated weights for policy 1, policy_version 67672 (0.0007) +[2023-10-09 06:55:20,513][60143] Updated weights for policy 0, policy_version 66922 (0.0008) +[2023-10-09 06:55:20,891][60143] Updated weights for policy 0, policy_version 66932 (0.0009) +[2023-10-09 06:55:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 137822208. Throughput: 0: 1718.5, 1: 1715.7. Samples: 34463788. 
Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:21,052][59242] Avg episode reward: [(0, '34.920'), (1, '31.370')] +[2023-10-09 06:55:21,267][60143] Updated weights for policy 0, policy_version 66942 (0.0008) +[2023-10-09 06:55:21,960][60144] Updated weights for policy 1, policy_version 67682 (0.0007) +[2023-10-09 06:55:22,327][60144] Updated weights for policy 1, policy_version 67692 (0.0007) +[2023-10-09 06:55:22,695][60144] Updated weights for policy 1, policy_version 67702 (0.0007) +[2023-10-09 06:55:23,067][60144] Updated weights for policy 1, policy_version 67712 (0.0008) +[2023-10-09 06:55:25,309][60143] Updated weights for policy 0, policy_version 66952 (0.0008) +[2023-10-09 06:55:25,689][60143] Updated weights for policy 0, policy_version 66962 (0.0008) +[2023-10-09 06:55:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 137887744. Throughput: 0: 1722.7, 1: 1728.7. Samples: 34485368. Policy #0 lag: (min: 22.0, avg: 25.2, max: 54.0) +[2023-10-09 06:55:26,053][59242] Avg episode reward: [(0, '34.640'), (1, '30.540')] +[2023-10-09 06:55:26,057][60143] Updated weights for policy 0, policy_version 66972 (0.0009) +[2023-10-09 06:55:26,948][60144] Updated weights for policy 1, policy_version 67722 (0.0011) +[2023-10-09 06:55:27,317][60144] Updated weights for policy 1, policy_version 67732 (0.0009) +[2023-10-09 06:55:27,688][60144] Updated weights for policy 1, policy_version 67742 (0.0011) +[2023-10-09 06:55:30,052][60143] Updated weights for policy 0, policy_version 66982 (0.0009) +[2023-10-09 06:55:30,420][60143] Updated weights for policy 0, policy_version 66992 (0.0010) +[2023-10-09 06:55:30,789][60143] Updated weights for policy 0, policy_version 67002 (0.0010) +[2023-10-09 06:55:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 137986048. Throughput: 0: 1693.9, 1: 1751.9. Samples: 34506040. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:31,053][59242] Avg episode reward: [(0, '34.230'), (1, '31.270')] +[2023-10-09 06:55:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000067008_68616192.pth... +[2023-10-09 06:55:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000067744_69369856.pth... +[2023-10-09 06:55:31,100][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000066144_67731456.pth +[2023-10-09 06:55:31,100][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000065408_66977792.pth +[2023-10-09 06:55:31,515][60144] Updated weights for policy 1, policy_version 67752 (0.0010) +[2023-10-09 06:55:31,880][60144] Updated weights for policy 1, policy_version 67762 (0.0009) +[2023-10-09 06:55:32,243][60144] Updated weights for policy 1, policy_version 67772 (0.0007) +[2023-10-09 06:55:34,901][60143] Updated weights for policy 0, policy_version 67012 (0.0008) +[2023-10-09 06:55:35,279][60143] Updated weights for policy 0, policy_version 67022 (0.0008) +[2023-10-09 06:55:35,653][60143] Updated weights for policy 0, policy_version 67032 (0.0008) +[2023-10-09 06:55:36,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 138051584. Throughput: 0: 1710.8, 1: 1720.9. Samples: 34515956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:36,052][59242] Avg episode reward: [(0, '34.270'), (1, '31.400')] +[2023-10-09 06:55:36,081][60144] Updated weights for policy 1, policy_version 67782 (0.0010) +[2023-10-09 06:55:36,449][60144] Updated weights for policy 1, policy_version 67792 (0.0009) +[2023-10-09 06:55:36,815][60144] Updated weights for policy 1, policy_version 67802 (0.0008) +[2023-10-09 06:55:39,592][60143] Updated weights for policy 0, policy_version 67042 (0.0008) +[2023-10-09 06:55:39,961][60143] Updated weights for policy 0, policy_version 67052 (0.0010) +[2023-10-09 06:55:40,329][60143] Updated weights for policy 0, policy_version 67062 (0.0009) +[2023-10-09 06:55:40,703][60143] Updated weights for policy 0, policy_version 67072 (0.0010) +[2023-10-09 06:55:40,867][60144] Updated weights for policy 1, policy_version 67812 (0.0011) +[2023-10-09 06:55:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 138117120. Throughput: 0: 1712.5, 1: 1745.5. Samples: 34537250. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:41,053][59242] Avg episode reward: [(0, '33.750'), (1, '31.920')] +[2023-10-09 06:55:41,230][60144] Updated weights for policy 1, policy_version 67822 (0.0009) +[2023-10-09 06:55:41,594][60144] Updated weights for policy 1, policy_version 67832 (0.0009) +[2023-10-09 06:55:44,714][60143] Updated weights for policy 0, policy_version 67082 (0.0007) +[2023-10-09 06:55:45,075][60143] Updated weights for policy 0, policy_version 67092 (0.0007) +[2023-10-09 06:55:45,439][60143] Updated weights for policy 0, policy_version 67102 (0.0008) +[2023-10-09 06:55:45,617][60144] Updated weights for policy 1, policy_version 67842 (0.0009) +[2023-10-09 06:55:45,977][60144] Updated weights for policy 1, policy_version 67852 (0.0008) +[2023-10-09 06:55:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 138182656. 
Throughput: 0: 1686.0, 1: 1738.4. Samples: 34557132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:46,052][59242] Avg episode reward: [(0, '34.780'), (1, '32.610')] +[2023-10-09 06:55:46,346][60144] Updated weights for policy 1, policy_version 67862 (0.0007) +[2023-10-09 06:55:46,718][60144] Updated weights for policy 1, policy_version 67872 (0.0007) +[2023-10-09 06:55:49,469][60143] Updated weights for policy 0, policy_version 67112 (0.0009) +[2023-10-09 06:55:49,828][60143] Updated weights for policy 0, policy_version 67122 (0.0008) +[2023-10-09 06:55:50,195][60143] Updated weights for policy 0, policy_version 67132 (0.0007) +[2023-10-09 06:55:50,760][60144] Updated weights for policy 1, policy_version 67882 (0.0008) +[2023-10-09 06:55:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 138248192. Throughput: 0: 1713.4, 1: 1728.7. Samples: 34567776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:51,053][59242] Avg episode reward: [(0, '34.470'), (1, '32.450')] +[2023-10-09 06:55:51,123][60144] Updated weights for policy 1, policy_version 67892 (0.0007) +[2023-10-09 06:55:51,487][60144] Updated weights for policy 1, policy_version 67902 (0.0008) +[2023-10-09 06:55:54,239][60143] Updated weights for policy 0, policy_version 67142 (0.0008) +[2023-10-09 06:55:54,620][60143] Updated weights for policy 0, policy_version 67152 (0.0008) +[2023-10-09 06:55:54,987][60143] Updated weights for policy 0, policy_version 67162 (0.0007) +[2023-10-09 06:55:55,502][60144] Updated weights for policy 1, policy_version 67912 (0.0008) +[2023-10-09 06:55:55,873][60144] Updated weights for policy 1, policy_version 67922 (0.0008) +[2023-10-09 06:55:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 138313728. Throughput: 0: 1699.5, 1: 1742.2. Samples: 34588358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:55:56,053][59242] Avg episode reward: [(0, '33.820'), (1, '32.960')] +[2023-10-09 06:55:56,253][60144] Updated weights for policy 1, policy_version 67932 (0.0008) +[2023-10-09 06:55:58,820][60143] Updated weights for policy 0, policy_version 67172 (0.0008) +[2023-10-09 06:55:59,196][60143] Updated weights for policy 0, policy_version 67182 (0.0011) +[2023-10-09 06:55:59,571][60143] Updated weights for policy 0, policy_version 67192 (0.0011) +[2023-10-09 06:55:59,949][60144] Updated weights for policy 1, policy_version 67942 (0.0009) +[2023-10-09 06:56:00,316][60144] Updated weights for policy 1, policy_version 67952 (0.0007) +[2023-10-09 06:56:00,679][60144] Updated weights for policy 1, policy_version 67962 (0.0008) +[2023-10-09 06:56:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 138412032. Throughput: 0: 1689.0, 1: 1727.8. Samples: 34608138. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:56:01,053][59242] Avg episode reward: [(0, '33.590'), (1, '32.060')] +[2023-10-09 06:56:03,524][60143] Updated weights for policy 0, policy_version 67202 (0.0011) +[2023-10-09 06:56:03,894][60143] Updated weights for policy 0, policy_version 67212 (0.0009) +[2023-10-09 06:56:04,261][60143] Updated weights for policy 0, policy_version 67222 (0.0008) +[2023-10-09 06:56:04,513][60144] Updated weights for policy 1, policy_version 67972 (0.0009) +[2023-10-09 06:56:04,626][60143] Updated weights for policy 0, policy_version 67232 (0.0008) +[2023-10-09 06:56:04,876][60144] Updated weights for policy 1, policy_version 67982 (0.0007) +[2023-10-09 06:56:05,240][60144] Updated weights for policy 1, policy_version 67992 (0.0008) +[2023-10-09 06:56:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 138477568. Throughput: 0: 1717.3, 1: 1750.5. Samples: 34619840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:56:06,052][59242] Avg episode reward: [(0, '36.140'), (1, '33.120')] +[2023-10-09 06:56:08,467][60143] Updated weights for policy 0, policy_version 67242 (0.0008) +[2023-10-09 06:56:08,828][60143] Updated weights for policy 0, policy_version 67252 (0.0008) +[2023-10-09 06:56:09,090][60144] Updated weights for policy 1, policy_version 68002 (0.0008) +[2023-10-09 06:56:09,202][60143] Updated weights for policy 0, policy_version 67262 (0.0008) +[2023-10-09 06:56:09,464][60144] Updated weights for policy 1, policy_version 68012 (0.0008) +[2023-10-09 06:56:09,840][60144] Updated weights for policy 1, policy_version 68022 (0.0007) +[2023-10-09 06:56:10,204][60144] Updated weights for policy 1, policy_version 68032 (0.0009) +[2023-10-09 06:56:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 138543104. Throughput: 0: 1688.9, 1: 1734.3. Samples: 34639412. Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:11,054][59242] Avg episode reward: [(0, '36.320'), (1, '34.790')] +[2023-10-09 06:56:13,311][60143] Updated weights for policy 0, policy_version 67272 (0.0008) +[2023-10-09 06:56:13,688][60143] Updated weights for policy 0, policy_version 67282 (0.0007) +[2023-10-09 06:56:14,062][60143] Updated weights for policy 0, policy_version 67292 (0.0008) +[2023-10-09 06:56:14,131][60144] Updated weights for policy 1, policy_version 68042 (0.0008) +[2023-10-09 06:56:14,499][60144] Updated weights for policy 1, policy_version 68052 (0.0009) +[2023-10-09 06:56:14,863][60144] Updated weights for policy 1, policy_version 68062 (0.0009) +[2023-10-09 06:56:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 138608640. Throughput: 0: 1707.6, 1: 1710.6. Samples: 34659858. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:16,053][59242] Avg episode reward: [(0, '35.420'), (1, '34.240')] +[2023-10-09 06:56:18,036][60143] Updated weights for policy 0, policy_version 67302 (0.0010) +[2023-10-09 06:56:18,403][60143] Updated weights for policy 0, policy_version 67312 (0.0011) +[2023-10-09 06:56:18,772][60143] Updated weights for policy 0, policy_version 67322 (0.0008) +[2023-10-09 06:56:18,857][60144] Updated weights for policy 1, policy_version 68072 (0.0009) +[2023-10-09 06:56:19,216][60144] Updated weights for policy 1, policy_version 68082 (0.0007) +[2023-10-09 06:56:19,591][60144] Updated weights for policy 1, policy_version 68092 (0.0011) +[2023-10-09 06:56:21,052][59242] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 138674176. Throughput: 0: 1706.3, 1: 1741.2. Samples: 34671090. Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:21,052][59242] Avg episode reward: [(0, '36.010'), (1, '33.950')] +[2023-10-09 06:56:22,849][60143] Updated weights for policy 0, policy_version 67332 (0.0007) +[2023-10-09 06:56:23,220][60143] Updated weights for policy 0, policy_version 67342 (0.0008) +[2023-10-09 06:56:23,588][60143] Updated weights for policy 0, policy_version 67352 (0.0007) +[2023-10-09 06:56:23,605][60144] Updated weights for policy 1, policy_version 68102 (0.0009) +[2023-10-09 06:56:23,969][60144] Updated weights for policy 1, policy_version 68112 (0.0008) +[2023-10-09 06:56:24,339][60144] Updated weights for policy 1, policy_version 68122 (0.0008) +[2023-10-09 06:56:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 138739712. Throughput: 0: 1689.2, 1: 1716.1. Samples: 34690490. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:26,052][59242] Avg episode reward: [(0, '35.440'), (1, '34.540')] +[2023-10-09 06:56:27,686][60143] Updated weights for policy 0, policy_version 67362 (0.0007) +[2023-10-09 06:56:28,053][60143] Updated weights for policy 0, policy_version 67372 (0.0007) +[2023-10-09 06:56:28,101][60144] Updated weights for policy 1, policy_version 68132 (0.0008) +[2023-10-09 06:56:28,424][60143] Updated weights for policy 0, policy_version 67382 (0.0008) +[2023-10-09 06:56:28,468][60144] Updated weights for policy 1, policy_version 68142 (0.0009) +[2023-10-09 06:56:28,786][60143] Updated weights for policy 0, policy_version 67392 (0.0009) +[2023-10-09 06:56:28,842][60144] Updated weights for policy 1, policy_version 68152 (0.0008) +[2023-10-09 06:56:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 138805248. Throughput: 0: 1714.5, 1: 1726.3. Samples: 34711970. Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:31,053][59242] Avg episode reward: [(0, '34.940'), (1, '34.900')] +[2023-10-09 06:56:32,650][60144] Updated weights for policy 1, policy_version 68162 (0.0007) +[2023-10-09 06:56:32,794][60143] Updated weights for policy 0, policy_version 67402 (0.0009) +[2023-10-09 06:56:33,018][60144] Updated weights for policy 1, policy_version 68172 (0.0007) +[2023-10-09 06:56:33,162][60143] Updated weights for policy 0, policy_version 67412 (0.0008) +[2023-10-09 06:56:33,380][60144] Updated weights for policy 1, policy_version 68182 (0.0007) +[2023-10-09 06:56:33,531][60143] Updated weights for policy 0, policy_version 67422 (0.0009) +[2023-10-09 06:56:33,747][60144] Updated weights for policy 1, policy_version 68192 (0.0009) +[2023-10-09 06:56:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 138870784. Throughput: 0: 1687.1, 1: 1730.8. Samples: 34721582. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:36,053][59242] Avg episode reward: [(0, '33.750'), (1, '33.980')] +[2023-10-09 06:56:37,633][60143] Updated weights for policy 0, policy_version 67432 (0.0009) +[2023-10-09 06:56:37,782][60144] Updated weights for policy 1, policy_version 68202 (0.0008) +[2023-10-09 06:56:38,002][60143] Updated weights for policy 0, policy_version 67442 (0.0009) +[2023-10-09 06:56:38,152][60144] Updated weights for policy 1, policy_version 68212 (0.0008) +[2023-10-09 06:56:38,367][60143] Updated weights for policy 0, policy_version 67452 (0.0007) +[2023-10-09 06:56:38,518][60144] Updated weights for policy 1, policy_version 68222 (0.0007) +[2023-10-09 06:56:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 138936320. Throughput: 0: 1691.9, 1: 1726.0. Samples: 34742162. Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:41,052][59242] Avg episode reward: [(0, '32.790'), (1, '33.610')] +[2023-10-09 06:56:42,289][60143] Updated weights for policy 0, policy_version 67462 (0.0007) +[2023-10-09 06:56:42,592][60144] Updated weights for policy 1, policy_version 68232 (0.0007) +[2023-10-09 06:56:42,664][60143] Updated weights for policy 0, policy_version 67472 (0.0009) +[2023-10-09 06:56:42,978][60144] Updated weights for policy 1, policy_version 68242 (0.0009) +[2023-10-09 06:56:43,045][60143] Updated weights for policy 0, policy_version 67482 (0.0009) +[2023-10-09 06:56:43,347][60144] Updated weights for policy 1, policy_version 68252 (0.0008) +[2023-10-09 06:56:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 139001856. Throughput: 0: 1701.8, 1: 1737.2. Samples: 34762890. 
Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:46,053][59242] Avg episode reward: [(0, '33.250'), (1, '33.080')] +[2023-10-09 06:56:47,140][60143] Updated weights for policy 0, policy_version 67492 (0.0011) +[2023-10-09 06:56:47,242][60144] Updated weights for policy 1, policy_version 68262 (0.0009) +[2023-10-09 06:56:47,518][60143] Updated weights for policy 0, policy_version 67502 (0.0010) +[2023-10-09 06:56:47,608][60144] Updated weights for policy 1, policy_version 68272 (0.0008) +[2023-10-09 06:56:47,885][60143] Updated weights for policy 0, policy_version 67512 (0.0008) +[2023-10-09 06:56:47,967][60144] Updated weights for policy 1, policy_version 68282 (0.0008) +[2023-10-09 06:56:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 139067392. Throughput: 0: 1671.4, 1: 1712.0. Samples: 34772090. Policy #0 lag: (min: 2.0, avg: 4.3, max: 29.0) +[2023-10-09 06:56:51,053][59242] Avg episode reward: [(0, '31.890'), (1, '35.020')] +[2023-10-09 06:56:51,830][60143] Updated weights for policy 0, policy_version 67522 (0.0010) +[2023-10-09 06:56:52,037][60144] Updated weights for policy 1, policy_version 68292 (0.0008) +[2023-10-09 06:56:52,204][60143] Updated weights for policy 0, policy_version 67532 (0.0008) +[2023-10-09 06:56:52,402][60144] Updated weights for policy 1, policy_version 68302 (0.0007) +[2023-10-09 06:56:52,571][60143] Updated weights for policy 0, policy_version 67542 (0.0008) +[2023-10-09 06:56:52,769][60144] Updated weights for policy 1, policy_version 68312 (0.0009) +[2023-10-09 06:56:52,943][60143] Updated weights for policy 0, policy_version 67552 (0.0009) +[2023-10-09 06:56:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 139132928. Throughput: 0: 1697.6, 1: 1723.8. Samples: 34793376. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:56:56,053][59242] Avg episode reward: [(0, '31.530'), (1, '34.720')] +[2023-10-09 06:56:56,748][60144] Updated weights for policy 1, policy_version 68322 (0.0008) +[2023-10-09 06:56:57,014][60143] Updated weights for policy 0, policy_version 67562 (0.0008) +[2023-10-09 06:56:57,120][60144] Updated weights for policy 1, policy_version 68332 (0.0008) +[2023-10-09 06:56:57,376][60143] Updated weights for policy 0, policy_version 67572 (0.0009) +[2023-10-09 06:56:57,484][60144] Updated weights for policy 1, policy_version 68342 (0.0009) +[2023-10-09 06:56:57,744][60143] Updated weights for policy 0, policy_version 67582 (0.0008) +[2023-10-09 06:56:57,846][60144] Updated weights for policy 1, policy_version 68352 (0.0007) +[2023-10-09 06:57:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139198464. Throughput: 0: 1697.0, 1: 1736.3. Samples: 34814358. Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:01,053][59242] Avg episode reward: [(0, '33.410'), (1, '35.120')] +[2023-10-09 06:57:01,635][60143] Updated weights for policy 0, policy_version 67592 (0.0011) +[2023-10-09 06:57:01,893][60144] Updated weights for policy 1, policy_version 68362 (0.0009) +[2023-10-09 06:57:02,003][60143] Updated weights for policy 0, policy_version 67602 (0.0009) +[2023-10-09 06:57:02,258][60144] Updated weights for policy 1, policy_version 68372 (0.0010) +[2023-10-09 06:57:02,375][60143] Updated weights for policy 0, policy_version 67612 (0.0007) +[2023-10-09 06:57:02,615][60144] Updated weights for policy 1, policy_version 68382 (0.0010) +[2023-10-09 06:57:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139264000. Throughput: 0: 1688.3, 1: 1705.0. Samples: 34823788. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:06,053][59242] Avg episode reward: [(0, '34.110'), (1, '35.090')] +[2023-10-09 06:57:06,325][60143] Updated weights for policy 0, policy_version 67622 (0.0008) +[2023-10-09 06:57:06,555][60144] Updated weights for policy 1, policy_version 68392 (0.0009) +[2023-10-09 06:57:06,701][60143] Updated weights for policy 0, policy_version 67632 (0.0007) +[2023-10-09 06:57:06,917][60144] Updated weights for policy 1, policy_version 68402 (0.0009) +[2023-10-09 06:57:07,060][60143] Updated weights for policy 0, policy_version 67642 (0.0007) +[2023-10-09 06:57:07,280][60144] Updated weights for policy 1, policy_version 68412 (0.0008) +[2023-10-09 06:57:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139329536. Throughput: 0: 1707.0, 1: 1725.0. Samples: 34844932. Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:11,053][60143] Updated weights for policy 0, policy_version 67652 (0.0007) +[2023-10-09 06:57:11,053][59242] Avg episode reward: [(0, '33.890'), (1, '35.850')] +[2023-10-09 06:57:11,379][60144] Updated weights for policy 1, policy_version 68422 (0.0008) +[2023-10-09 06:57:11,424][60143] Updated weights for policy 0, policy_version 67662 (0.0008) +[2023-10-09 06:57:11,746][60144] Updated weights for policy 1, policy_version 68432 (0.0009) +[2023-10-09 06:57:11,787][60143] Updated weights for policy 0, policy_version 67672 (0.0008) +[2023-10-09 06:57:12,111][60144] Updated weights for policy 1, policy_version 68442 (0.0010) +[2023-10-09 06:57:15,750][60143] Updated weights for policy 0, policy_version 67682 (0.0008) +[2023-10-09 06:57:15,992][60144] Updated weights for policy 1, policy_version 68452 (0.0007) +[2023-10-09 06:57:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139395072. Throughput: 0: 1712.7, 1: 1716.0. Samples: 34866258. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:16,053][59242] Avg episode reward: [(0, '32.520'), (1, '38.690')] +[2023-10-09 06:57:16,115][60143] Updated weights for policy 0, policy_version 67692 (0.0008) +[2023-10-09 06:57:16,361][60144] Updated weights for policy 1, policy_version 68462 (0.0007) +[2023-10-09 06:57:16,477][60143] Updated weights for policy 0, policy_version 67702 (0.0008) +[2023-10-09 06:57:16,719][60144] Updated weights for policy 1, policy_version 68472 (0.0007) +[2023-10-09 06:57:16,851][60143] Updated weights for policy 0, policy_version 67712 (0.0007) +[2023-10-09 06:57:17,012][60003] Saving new best policy, reward=38.690! +[2023-10-09 06:57:20,684][60144] Updated weights for policy 1, policy_version 68482 (0.0009) +[2023-10-09 06:57:20,898][60143] Updated weights for policy 0, policy_version 67722 (0.0008) +[2023-10-09 06:57:21,051][60144] Updated weights for policy 1, policy_version 68492 (0.0007) +[2023-10-09 06:57:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 139460608. Throughput: 0: 1709.6, 1: 1709.6. Samples: 34875450. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:21,053][59242] Avg episode reward: [(0, '31.470'), (1, '38.360')] +[2023-10-09 06:57:21,263][60143] Updated weights for policy 0, policy_version 67732 (0.0008) +[2023-10-09 06:57:21,418][60144] Updated weights for policy 1, policy_version 68502 (0.0007) +[2023-10-09 06:57:21,642][60143] Updated weights for policy 0, policy_version 67742 (0.0008) +[2023-10-09 06:57:21,786][60144] Updated weights for policy 1, policy_version 68512 (0.0007) +[2023-10-09 06:57:25,555][60143] Updated weights for policy 0, policy_version 67752 (0.0007) +[2023-10-09 06:57:25,679][60144] Updated weights for policy 1, policy_version 68522 (0.0008) +[2023-10-09 06:57:25,919][60143] Updated weights for policy 0, policy_version 67762 (0.0009) +[2023-10-09 06:57:26,035][60144] Updated weights for policy 1, policy_version 68532 (0.0007) +[2023-10-09 06:57:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139526144. Throughput: 0: 1721.4, 1: 1715.6. Samples: 34896830. Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:26,053][59242] Avg episode reward: [(0, '31.960'), (1, '36.850')] +[2023-10-09 06:57:26,296][60143] Updated weights for policy 0, policy_version 67772 (0.0010) +[2023-10-09 06:57:26,399][60144] Updated weights for policy 1, policy_version 68542 (0.0009) +[2023-10-09 06:57:30,379][60144] Updated weights for policy 1, policy_version 68552 (0.0010) +[2023-10-09 06:57:30,389][60143] Updated weights for policy 0, policy_version 67782 (0.0010) +[2023-10-09 06:57:30,753][60144] Updated weights for policy 1, policy_version 68562 (0.0008) +[2023-10-09 06:57:30,766][60143] Updated weights for policy 0, policy_version 67792 (0.0009) +[2023-10-09 06:57:31,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 139591680. Throughput: 0: 1715.1, 1: 1711.8. Samples: 34917098. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:31,053][59242] Avg episode reward: [(0, '31.460'), (1, '36.460')] +[2023-10-09 06:57:31,121][60144] Updated weights for policy 1, policy_version 68572 (0.0007) +[2023-10-09 06:57:31,123][60143] Updated weights for policy 0, policy_version 67802 (0.0010) +[2023-10-09 06:57:31,263][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000068576_70221824.pth... +[2023-10-09 06:57:31,292][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000066944_68550656.pth +[2023-10-09 06:57:31,346][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000067808_69435392.pth... +[2023-10-09 06:57:31,383][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000066208_67796992.pth +[2023-10-09 06:57:35,042][60144] Updated weights for policy 1, policy_version 68582 (0.0009) +[2023-10-09 06:57:35,115][60143] Updated weights for policy 0, policy_version 67812 (0.0007) +[2023-10-09 06:57:35,406][60144] Updated weights for policy 1, policy_version 68592 (0.0009) +[2023-10-09 06:57:35,489][60143] Updated weights for policy 0, policy_version 67822 (0.0007) +[2023-10-09 06:57:35,772][60144] Updated weights for policy 1, policy_version 68602 (0.0008) +[2023-10-09 06:57:35,848][60143] Updated weights for policy 0, policy_version 67832 (0.0008) +[2023-10-09 06:57:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 139689984. Throughput: 0: 1723.0, 1: 1728.0. Samples: 34927386. 
Policy #0 lag: (min: 28.0, avg: 39.3, max: 60.0) +[2023-10-09 06:57:36,053][59242] Avg episode reward: [(0, '30.990'), (1, '35.900')] +[2023-10-09 06:57:39,751][60144] Updated weights for policy 1, policy_version 68612 (0.0007) +[2023-10-09 06:57:39,867][60143] Updated weights for policy 0, policy_version 67842 (0.0007) +[2023-10-09 06:57:40,118][60144] Updated weights for policy 1, policy_version 68622 (0.0008) +[2023-10-09 06:57:40,239][60143] Updated weights for policy 0, policy_version 67852 (0.0010) +[2023-10-09 06:57:40,483][60144] Updated weights for policy 1, policy_version 68632 (0.0009) +[2023-10-09 06:57:40,609][60143] Updated weights for policy 0, policy_version 67862 (0.0009) +[2023-10-09 06:57:40,979][60143] Updated weights for policy 0, policy_version 67872 (0.0010) +[2023-10-09 06:57:41,052][59242] Fps is (10 sec: 19660.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 139788288. Throughput: 0: 1719.7, 1: 1725.5. Samples: 34948410. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:57:41,053][59242] Avg episode reward: [(0, '31.840'), (1, '34.660')] +[2023-10-09 06:57:44,257][60144] Updated weights for policy 1, policy_version 68642 (0.0009) +[2023-10-09 06:57:44,621][60144] Updated weights for policy 1, policy_version 68652 (0.0010) +[2023-10-09 06:57:44,984][60143] Updated weights for policy 0, policy_version 67882 (0.0009) +[2023-10-09 06:57:44,988][60144] Updated weights for policy 1, policy_version 68662 (0.0009) +[2023-10-09 06:57:45,358][60143] Updated weights for policy 0, policy_version 67892 (0.0007) +[2023-10-09 06:57:45,362][60144] Updated weights for policy 1, policy_version 68672 (0.0008) +[2023-10-09 06:57:45,733][60143] Updated weights for policy 0, policy_version 67902 (0.0007) +[2023-10-09 06:57:46,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 139853824. Throughput: 0: 1699.2, 1: 1701.5. Samples: 34967394. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:57:46,053][59242] Avg episode reward: [(0, '32.470'), (1, '34.010')] +[2023-10-09 06:57:49,429][60144] Updated weights for policy 1, policy_version 68682 (0.0012) +[2023-10-09 06:57:49,742][60143] Updated weights for policy 0, policy_version 67912 (0.0007) +[2023-10-09 06:57:49,786][60144] Updated weights for policy 1, policy_version 68692 (0.0007) +[2023-10-09 06:57:50,105][60143] Updated weights for policy 0, policy_version 67922 (0.0008) +[2023-10-09 06:57:50,145][60144] Updated weights for policy 1, policy_version 68702 (0.0008) +[2023-10-09 06:57:50,472][60143] Updated weights for policy 0, policy_version 67932 (0.0011) +[2023-10-09 06:57:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 139919360. Throughput: 0: 1713.8, 1: 1727.5. Samples: 34978650. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:57:51,053][59242] Avg episode reward: [(0, '31.050'), (1, '34.570')] +[2023-10-09 06:57:54,110][60144] Updated weights for policy 1, policy_version 68712 (0.0008) +[2023-10-09 06:57:54,481][60144] Updated weights for policy 1, policy_version 68722 (0.0007) +[2023-10-09 06:57:54,556][60143] Updated weights for policy 0, policy_version 67942 (0.0008) +[2023-10-09 06:57:54,855][60144] Updated weights for policy 1, policy_version 68732 (0.0007) +[2023-10-09 06:57:54,924][60143] Updated weights for policy 0, policy_version 67952 (0.0009) +[2023-10-09 06:57:55,282][60143] Updated weights for policy 0, policy_version 67962 (0.0010) +[2023-10-09 06:57:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 139984896. Throughput: 0: 1701.2, 1: 1718.0. Samples: 34998800. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:57:56,053][59242] Avg episode reward: [(0, '31.840'), (1, '34.790')] +[2023-10-09 06:57:58,793][60144] Updated weights for policy 1, policy_version 68742 (0.0008) +[2023-10-09 06:57:59,162][60144] Updated weights for policy 1, policy_version 68752 (0.0008) +[2023-10-09 06:57:59,202][60143] Updated weights for policy 0, policy_version 67972 (0.0009) +[2023-10-09 06:57:59,530][60144] Updated weights for policy 1, policy_version 68762 (0.0008) +[2023-10-09 06:57:59,567][60143] Updated weights for policy 0, policy_version 67982 (0.0008) +[2023-10-09 06:57:59,938][60143] Updated weights for policy 0, policy_version 67992 (0.0010) +[2023-10-09 06:58:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 140050432. Throughput: 0: 1671.4, 1: 1708.4. Samples: 35018350. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:58:01,053][59242] Avg episode reward: [(0, '32.740'), (1, '33.000')] +[2023-10-09 06:58:03,449][60144] Updated weights for policy 1, policy_version 68772 (0.0007) +[2023-10-09 06:58:03,819][60144] Updated weights for policy 1, policy_version 68782 (0.0008) +[2023-10-09 06:58:03,917][60143] Updated weights for policy 0, policy_version 68002 (0.0009) +[2023-10-09 06:58:04,184][60144] Updated weights for policy 1, policy_version 68792 (0.0008) +[2023-10-09 06:58:04,284][60143] Updated weights for policy 0, policy_version 68012 (0.0007) +[2023-10-09 06:58:04,654][60143] Updated weights for policy 0, policy_version 68022 (0.0010) +[2023-10-09 06:58:05,026][60143] Updated weights for policy 0, policy_version 68032 (0.0010) +[2023-10-09 06:58:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 140115968. Throughput: 0: 1704.1, 1: 1731.8. Samples: 35030066. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:58:06,053][59242] Avg episode reward: [(0, '32.620'), (1, '33.970')] +[2023-10-09 06:58:08,298][60144] Updated weights for policy 1, policy_version 68802 (0.0009) +[2023-10-09 06:58:08,666][60144] Updated weights for policy 1, policy_version 68812 (0.0009) +[2023-10-09 06:58:09,028][60144] Updated weights for policy 1, policy_version 68822 (0.0008) +[2023-10-09 06:58:09,121][60143] Updated weights for policy 0, policy_version 68042 (0.0008) +[2023-10-09 06:58:09,395][60144] Updated weights for policy 1, policy_version 68832 (0.0007) +[2023-10-09 06:58:09,485][60143] Updated weights for policy 0, policy_version 68052 (0.0008) +[2023-10-09 06:58:09,851][60143] Updated weights for policy 0, policy_version 68062 (0.0008) +[2023-10-09 06:58:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 140181504. Throughput: 0: 1680.0, 1: 1711.5. Samples: 35049448. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:58:11,053][59242] Avg episode reward: [(0, '32.100'), (1, '33.360')] +[2023-10-09 06:58:13,465][60144] Updated weights for policy 1, policy_version 68842 (0.0009) +[2023-10-09 06:58:13,817][60143] Updated weights for policy 0, policy_version 68072 (0.0010) +[2023-10-09 06:58:13,841][60144] Updated weights for policy 1, policy_version 68852 (0.0007) +[2023-10-09 06:58:14,178][60143] Updated weights for policy 0, policy_version 68082 (0.0007) +[2023-10-09 06:58:14,208][60144] Updated weights for policy 1, policy_version 68862 (0.0007) +[2023-10-09 06:58:14,540][60143] Updated weights for policy 0, policy_version 68092 (0.0008) +[2023-10-09 06:58:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 140247040. Throughput: 0: 1682.1, 1: 1723.1. Samples: 35070334. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-09 06:58:16,053][59242] Avg episode reward: [(0, '32.200'), (1, '35.110')] +[2023-10-09 06:58:18,236][60144] Updated weights for policy 1, policy_version 68872 (0.0008) +[2023-10-09 06:58:18,604][60144] Updated weights for policy 1, policy_version 68882 (0.0009) +[2023-10-09 06:58:18,639][60143] Updated weights for policy 0, policy_version 68102 (0.0007) +[2023-10-09 06:58:18,966][60144] Updated weights for policy 1, policy_version 68892 (0.0008) +[2023-10-09 06:58:19,020][60143] Updated weights for policy 0, policy_version 68112 (0.0008) +[2023-10-09 06:58:19,395][60143] Updated weights for policy 0, policy_version 68122 (0.0008) +[2023-10-09 06:58:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 140312576. Throughput: 0: 1700.1, 1: 1716.9. Samples: 35081152. Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:21,053][59242] Avg episode reward: [(0, '31.900'), (1, '37.600')] +[2023-10-09 06:58:22,885][60144] Updated weights for policy 1, policy_version 68902 (0.0009) +[2023-10-09 06:58:23,255][60144] Updated weights for policy 1, policy_version 68912 (0.0008) +[2023-10-09 06:58:23,405][60143] Updated weights for policy 0, policy_version 68132 (0.0008) +[2023-10-09 06:58:23,619][60144] Updated weights for policy 1, policy_version 68922 (0.0007) +[2023-10-09 06:58:23,768][60143] Updated weights for policy 0, policy_version 68142 (0.0008) +[2023-10-09 06:58:24,143][60143] Updated weights for policy 0, policy_version 68152 (0.0008) +[2023-10-09 06:58:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 140378112. Throughput: 0: 1674.8, 1: 1705.6. Samples: 35100532. 
Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:26,053][59242] Avg episode reward: [(0, '31.350'), (1, '35.440')] +[2023-10-09 06:58:27,392][60144] Updated weights for policy 1, policy_version 68932 (0.0009) +[2023-10-09 06:58:27,765][60144] Updated weights for policy 1, policy_version 68942 (0.0008) +[2023-10-09 06:58:28,126][60144] Updated weights for policy 1, policy_version 68952 (0.0009) +[2023-10-09 06:58:28,216][60143] Updated weights for policy 0, policy_version 68162 (0.0009) +[2023-10-09 06:58:28,585][60143] Updated weights for policy 0, policy_version 68172 (0.0009) +[2023-10-09 06:58:28,962][60143] Updated weights for policy 0, policy_version 68182 (0.0009) +[2023-10-09 06:58:29,323][60143] Updated weights for policy 0, policy_version 68192 (0.0010) +[2023-10-09 06:58:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 140443648. Throughput: 0: 1692.7, 1: 1734.8. Samples: 35121628. Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:31,053][59242] Avg episode reward: [(0, '30.250'), (1, '34.810')] +[2023-10-09 06:58:32,070][60144] Updated weights for policy 1, policy_version 68962 (0.0008) +[2023-10-09 06:58:32,445][60144] Updated weights for policy 1, policy_version 68972 (0.0007) +[2023-10-09 06:58:32,812][60144] Updated weights for policy 1, policy_version 68982 (0.0007) +[2023-10-09 06:58:33,174][60144] Updated weights for policy 1, policy_version 68992 (0.0008) +[2023-10-09 06:58:33,341][60143] Updated weights for policy 0, policy_version 68202 (0.0010) +[2023-10-09 06:58:33,709][60143] Updated weights for policy 0, policy_version 68212 (0.0010) +[2023-10-09 06:58:34,074][60143] Updated weights for policy 0, policy_version 68222 (0.0010) +[2023-10-09 06:58:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 140509184. Throughput: 0: 1690.9, 1: 1708.0. Samples: 35131600. 
Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:36,053][59242] Avg episode reward: [(0, '30.140'), (1, '34.200')] +[2023-10-09 06:58:37,021][60144] Updated weights for policy 1, policy_version 69002 (0.0008) +[2023-10-09 06:58:37,383][60144] Updated weights for policy 1, policy_version 69012 (0.0010) +[2023-10-09 06:58:37,758][60144] Updated weights for policy 1, policy_version 69022 (0.0007) +[2023-10-09 06:58:38,063][60143] Updated weights for policy 0, policy_version 68232 (0.0009) +[2023-10-09 06:58:38,434][60143] Updated weights for policy 0, policy_version 68242 (0.0008) +[2023-10-09 06:58:38,800][60143] Updated weights for policy 0, policy_version 68252 (0.0010) +[2023-10-09 06:58:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140574720. Throughput: 0: 1679.8, 1: 1726.3. Samples: 35152076. Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:41,053][59242] Avg episode reward: [(0, '30.270'), (1, '33.900')] +[2023-10-09 06:58:41,803][60144] Updated weights for policy 1, policy_version 69032 (0.0009) +[2023-10-09 06:58:42,171][60144] Updated weights for policy 1, policy_version 69042 (0.0007) +[2023-10-09 06:58:42,534][60144] Updated weights for policy 1, policy_version 69052 (0.0007) +[2023-10-09 06:58:42,800][60143] Updated weights for policy 0, policy_version 68262 (0.0011) +[2023-10-09 06:58:43,169][60143] Updated weights for policy 0, policy_version 68272 (0.0010) +[2023-10-09 06:58:43,549][60143] Updated weights for policy 0, policy_version 68282 (0.0011) +[2023-10-09 06:58:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140640256. Throughput: 0: 1708.9, 1: 1739.4. Samples: 35173524. 
Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:46,053][59242] Avg episode reward: [(0, '31.620'), (1, '32.780')] +[2023-10-09 06:58:46,390][60144] Updated weights for policy 1, policy_version 69062 (0.0009) +[2023-10-09 06:58:46,750][60144] Updated weights for policy 1, policy_version 69072 (0.0007) +[2023-10-09 06:58:47,114][60144] Updated weights for policy 1, policy_version 69082 (0.0009) +[2023-10-09 06:58:47,472][60143] Updated weights for policy 0, policy_version 68292 (0.0010) +[2023-10-09 06:58:47,853][60143] Updated weights for policy 0, policy_version 68302 (0.0011) +[2023-10-09 06:58:48,216][60143] Updated weights for policy 0, policy_version 68312 (0.0008) +[2023-10-09 06:58:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140705792. Throughput: 0: 1682.6, 1: 1714.0. Samples: 35182914. Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:51,053][59242] Avg episode reward: [(0, '31.390'), (1, '33.810')] +[2023-10-09 06:58:51,177][60144] Updated weights for policy 1, policy_version 69092 (0.0007) +[2023-10-09 06:58:51,539][60144] Updated weights for policy 1, policy_version 69102 (0.0007) +[2023-10-09 06:58:51,911][60144] Updated weights for policy 1, policy_version 69112 (0.0007) +[2023-10-09 06:58:52,162][60143] Updated weights for policy 0, policy_version 68322 (0.0008) +[2023-10-09 06:58:52,527][60143] Updated weights for policy 0, policy_version 68332 (0.0011) +[2023-10-09 06:58:52,891][60143] Updated weights for policy 0, policy_version 68342 (0.0011) +[2023-10-09 06:58:53,258][60143] Updated weights for policy 0, policy_version 68352 (0.0011) +[2023-10-09 06:58:55,887][60144] Updated weights for policy 1, policy_version 69122 (0.0008) +[2023-10-09 06:58:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140771328. Throughput: 0: 1703.0, 1: 1731.9. Samples: 35204016. 
Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:58:56,053][59242] Avg episode reward: [(0, '32.000'), (1, '32.270')] +[2023-10-09 06:58:56,256][60144] Updated weights for policy 1, policy_version 69132 (0.0007) +[2023-10-09 06:58:56,625][60144] Updated weights for policy 1, policy_version 69142 (0.0008) +[2023-10-09 06:58:56,984][60144] Updated weights for policy 1, policy_version 69152 (0.0008) +[2023-10-09 06:58:57,154][60143] Updated weights for policy 0, policy_version 68362 (0.0010) +[2023-10-09 06:58:57,527][60143] Updated weights for policy 0, policy_version 68372 (0.0009) +[2023-10-09 06:58:57,898][60143] Updated weights for policy 0, policy_version 68382 (0.0008) +[2023-10-09 06:59:00,925][60144] Updated weights for policy 1, policy_version 69162 (0.0009) +[2023-10-09 06:59:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140836864. Throughput: 0: 1718.6, 1: 1727.1. Samples: 35225390. Policy #0 lag: (min: 21.0, avg: 27.7, max: 53.0) +[2023-10-09 06:59:01,053][59242] Avg episode reward: [(0, '31.040'), (1, '34.130')] +[2023-10-09 06:59:01,286][60144] Updated weights for policy 1, policy_version 69172 (0.0009) +[2023-10-09 06:59:01,655][60144] Updated weights for policy 1, policy_version 69182 (0.0010) +[2023-10-09 06:59:01,877][60143] Updated weights for policy 0, policy_version 68392 (0.0008) +[2023-10-09 06:59:02,241][60143] Updated weights for policy 0, policy_version 68402 (0.0007) +[2023-10-09 06:59:02,619][60143] Updated weights for policy 0, policy_version 68412 (0.0008) +[2023-10-09 06:59:05,642][60144] Updated weights for policy 1, policy_version 69192 (0.0010) +[2023-10-09 06:59:06,032][60144] Updated weights for policy 1, policy_version 69202 (0.0011) +[2023-10-09 06:59:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140902400. Throughput: 0: 1694.7, 1: 1722.2. Samples: 35234912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:06,053][59242] Avg episode reward: [(0, '31.130'), (1, '34.480')] +[2023-10-09 06:59:06,396][60144] Updated weights for policy 1, policy_version 69212 (0.0008) +[2023-10-09 06:59:06,721][60143] Updated weights for policy 0, policy_version 68422 (0.0007) +[2023-10-09 06:59:07,094][60143] Updated weights for policy 0, policy_version 68432 (0.0008) +[2023-10-09 06:59:07,461][60143] Updated weights for policy 0, policy_version 68442 (0.0009) +[2023-10-09 06:59:10,170][60144] Updated weights for policy 1, policy_version 69222 (0.0008) +[2023-10-09 06:59:10,535][60144] Updated weights for policy 1, policy_version 69232 (0.0010) +[2023-10-09 06:59:10,913][60144] Updated weights for policy 1, policy_version 69242 (0.0009) +[2023-10-09 06:59:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 140967936. Throughput: 0: 1714.9, 1: 1736.5. Samples: 35255844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:11,053][59242] Avg episode reward: [(0, '31.430'), (1, '34.380')] +[2023-10-09 06:59:11,470][60143] Updated weights for policy 0, policy_version 68452 (0.0008) +[2023-10-09 06:59:11,847][60143] Updated weights for policy 0, policy_version 68462 (0.0008) +[2023-10-09 06:59:12,215][60143] Updated weights for policy 0, policy_version 68472 (0.0009) +[2023-10-09 06:59:14,808][60144] Updated weights for policy 1, policy_version 69252 (0.0008) +[2023-10-09 06:59:15,175][60144] Updated weights for policy 1, policy_version 69262 (0.0008) +[2023-10-09 06:59:15,544][60144] Updated weights for policy 1, policy_version 69272 (0.0008) +[2023-10-09 06:59:16,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 141066240. Throughput: 0: 1718.9, 1: 1711.9. Samples: 35276016. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:16,052][59242] Avg episode reward: [(0, '30.620'), (1, '34.550')] +[2023-10-09 06:59:16,256][60143] Updated weights for policy 0, policy_version 68482 (0.0007) +[2023-10-09 06:59:16,631][60143] Updated weights for policy 0, policy_version 68492 (0.0009) +[2023-10-09 06:59:16,998][60143] Updated weights for policy 0, policy_version 68502 (0.0008) +[2023-10-09 06:59:17,372][60143] Updated weights for policy 0, policy_version 68512 (0.0009) +[2023-10-09 06:59:19,579][60144] Updated weights for policy 1, policy_version 69282 (0.0008) +[2023-10-09 06:59:19,943][60144] Updated weights for policy 1, policy_version 69292 (0.0008) +[2023-10-09 06:59:20,303][60144] Updated weights for policy 1, policy_version 69302 (0.0011) +[2023-10-09 06:59:20,671][60144] Updated weights for policy 1, policy_version 69312 (0.0010) +[2023-10-09 06:59:21,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 141131776. Throughput: 0: 1701.2, 1: 1738.1. Samples: 35286370. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:21,053][59242] Avg episode reward: [(0, '29.860'), (1, '34.680')] +[2023-10-09 06:59:21,277][60143] Updated weights for policy 0, policy_version 68522 (0.0007) +[2023-10-09 06:59:21,638][60143] Updated weights for policy 0, policy_version 68532 (0.0008) +[2023-10-09 06:59:22,014][60143] Updated weights for policy 0, policy_version 68542 (0.0008) +[2023-10-09 06:59:24,698][60144] Updated weights for policy 1, policy_version 69322 (0.0007) +[2023-10-09 06:59:25,071][60144] Updated weights for policy 1, policy_version 69332 (0.0008) +[2023-10-09 06:59:25,433][60144] Updated weights for policy 1, policy_version 69342 (0.0009) +[2023-10-09 06:59:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 141197312. Throughput: 0: 1724.0, 1: 1726.9. Samples: 35307364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:26,052][59242] Avg episode reward: [(0, '30.770'), (1, '34.040')] +[2023-10-09 06:59:26,103][60143] Updated weights for policy 0, policy_version 68552 (0.0011) +[2023-10-09 06:59:26,464][60143] Updated weights for policy 0, policy_version 68562 (0.0009) +[2023-10-09 06:59:26,846][60143] Updated weights for policy 0, policy_version 68572 (0.0008) +[2023-10-09 06:59:29,361][60144] Updated weights for policy 1, policy_version 69352 (0.0009) +[2023-10-09 06:59:29,733][60144] Updated weights for policy 1, policy_version 69362 (0.0010) +[2023-10-09 06:59:30,103][60144] Updated weights for policy 1, policy_version 69372 (0.0008) +[2023-10-09 06:59:30,760][60143] Updated weights for policy 0, policy_version 68582 (0.0007) +[2023-10-09 06:59:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 141262848. Throughput: 0: 1722.0, 1: 1703.3. Samples: 35327660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:31,053][59242] Avg episode reward: [(0, '30.580'), (1, '33.330')] +[2023-10-09 06:59:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000069376_71041024.pth... +[2023-10-09 06:59:31,090][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000067744_69369856.pth +[2023-10-09 06:59:31,119][60143] Updated weights for policy 0, policy_version 68592 (0.0009) +[2023-10-09 06:59:31,489][60143] Updated weights for policy 0, policy_version 68602 (0.0007) +[2023-10-09 06:59:31,707][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000068608_70254592.pth... 
+[2023-10-09 06:59:31,747][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000067008_68616192.pth +[2023-10-09 06:59:34,076][60144] Updated weights for policy 1, policy_version 69382 (0.0007) +[2023-10-09 06:59:34,445][60144] Updated weights for policy 1, policy_version 69392 (0.0008) +[2023-10-09 06:59:34,811][60144] Updated weights for policy 1, policy_version 69402 (0.0009) +[2023-10-09 06:59:35,343][60143] Updated weights for policy 0, policy_version 68612 (0.0010) +[2023-10-09 06:59:35,711][60143] Updated weights for policy 0, policy_version 68622 (0.0009) +[2023-10-09 06:59:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 141328384. Throughput: 0: 1720.9, 1: 1735.7. Samples: 35338462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:36,053][59242] Avg episode reward: [(0, '31.890'), (1, '34.290')] +[2023-10-09 06:59:36,086][60143] Updated weights for policy 0, policy_version 68632 (0.0009) +[2023-10-09 06:59:38,685][60144] Updated weights for policy 1, policy_version 69412 (0.0008) +[2023-10-09 06:59:39,045][60144] Updated weights for policy 1, policy_version 69422 (0.0009) +[2023-10-09 06:59:39,420][60144] Updated weights for policy 1, policy_version 69432 (0.0010) +[2023-10-09 06:59:40,213][60143] Updated weights for policy 0, policy_version 68642 (0.0009) +[2023-10-09 06:59:40,570][60143] Updated weights for policy 0, policy_version 68652 (0.0010) +[2023-10-09 06:59:40,942][60143] Updated weights for policy 0, policy_version 68662 (0.0008) +[2023-10-09 06:59:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 141393920. Throughput: 0: 1721.8, 1: 1710.3. Samples: 35358462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:41,053][59242] Avg episode reward: [(0, '31.650'), (1, '36.750')] +[2023-10-09 06:59:41,305][60143] Updated weights for policy 0, policy_version 68672 (0.0010) +[2023-10-09 06:59:43,213][60144] Updated weights for policy 1, policy_version 69442 (0.0011) +[2023-10-09 06:59:43,577][60144] Updated weights for policy 1, policy_version 69452 (0.0010) +[2023-10-09 06:59:43,942][60144] Updated weights for policy 1, policy_version 69462 (0.0010) +[2023-10-09 06:59:44,315][60144] Updated weights for policy 1, policy_version 69472 (0.0010) +[2023-10-09 06:59:45,362][60143] Updated weights for policy 0, policy_version 68682 (0.0007) +[2023-10-09 06:59:45,721][60143] Updated weights for policy 0, policy_version 68692 (0.0007) +[2023-10-09 06:59:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 141459456. Throughput: 0: 1703.2, 1: 1710.0. Samples: 35378988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 06:59:46,053][59242] Avg episode reward: [(0, '31.680'), (1, '35.270')] +[2023-10-09 06:59:46,093][60143] Updated weights for policy 0, policy_version 68702 (0.0007) +[2023-10-09 06:59:48,437][60144] Updated weights for policy 1, policy_version 69482 (0.0009) +[2023-10-09 06:59:48,801][60144] Updated weights for policy 1, policy_version 69492 (0.0010) +[2023-10-09 06:59:49,164][60144] Updated weights for policy 1, policy_version 69502 (0.0011) +[2023-10-09 06:59:49,967][60143] Updated weights for policy 0, policy_version 68712 (0.0010) +[2023-10-09 06:59:50,333][60143] Updated weights for policy 0, policy_version 68722 (0.0009) +[2023-10-09 06:59:50,692][60143] Updated weights for policy 0, policy_version 68732 (0.0007) +[2023-10-09 06:59:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 141557760. Throughput: 0: 1715.6, 1: 1722.1. Samples: 35389608. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 06:59:51,053][59242] Avg episode reward: [(0, '30.960'), (1, '35.350')] +[2023-10-09 06:59:53,280][60144] Updated weights for policy 1, policy_version 69512 (0.0008) +[2023-10-09 06:59:53,660][60144] Updated weights for policy 1, policy_version 69522 (0.0009) +[2023-10-09 06:59:54,028][60144] Updated weights for policy 1, policy_version 69532 (0.0008) +[2023-10-09 06:59:54,804][60143] Updated weights for policy 0, policy_version 68742 (0.0008) +[2023-10-09 06:59:55,170][60143] Updated weights for policy 0, policy_version 68752 (0.0007) +[2023-10-09 06:59:55,531][60143] Updated weights for policy 0, policy_version 68762 (0.0010) +[2023-10-09 06:59:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 141623296. Throughput: 0: 1723.4, 1: 1697.4. Samples: 35409780. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 06:59:56,052][59242] Avg episode reward: [(0, '33.210'), (1, '34.380')] +[2023-10-09 06:59:57,963][60144] Updated weights for policy 1, policy_version 69542 (0.0008) +[2023-10-09 06:59:58,328][60144] Updated weights for policy 1, policy_version 69552 (0.0007) +[2023-10-09 06:59:58,697][60144] Updated weights for policy 1, policy_version 69562 (0.0008) +[2023-10-09 06:59:59,485][60143] Updated weights for policy 0, policy_version 68772 (0.0010) +[2023-10-09 06:59:59,860][60143] Updated weights for policy 0, policy_version 68782 (0.0009) +[2023-10-09 07:00:00,231][60143] Updated weights for policy 0, policy_version 68792 (0.0008) +[2023-10-09 07:00:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 141688832. Throughput: 0: 1698.3, 1: 1721.0. Samples: 35429882. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:01,053][59242] Avg episode reward: [(0, '32.780'), (1, '33.950')] +[2023-10-09 07:00:02,623][60144] Updated weights for policy 1, policy_version 69572 (0.0008) +[2023-10-09 07:00:02,989][60144] Updated weights for policy 1, policy_version 69582 (0.0007) +[2023-10-09 07:00:03,357][60144] Updated weights for policy 1, policy_version 69592 (0.0007) +[2023-10-09 07:00:04,150][60143] Updated weights for policy 0, policy_version 68802 (0.0008) +[2023-10-09 07:00:04,516][60143] Updated weights for policy 0, policy_version 68812 (0.0007) +[2023-10-09 07:00:04,880][60143] Updated weights for policy 0, policy_version 68822 (0.0010) +[2023-10-09 07:00:05,251][60143] Updated weights for policy 0, policy_version 68832 (0.0010) +[2023-10-09 07:00:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 141754368. Throughput: 0: 1727.1, 1: 1698.0. Samples: 35440498. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:06,053][59242] Avg episode reward: [(0, '30.940'), (1, '34.760')] +[2023-10-09 07:00:07,386][60144] Updated weights for policy 1, policy_version 69602 (0.0009) +[2023-10-09 07:00:07,761][60144] Updated weights for policy 1, policy_version 69612 (0.0008) +[2023-10-09 07:00:08,125][60144] Updated weights for policy 1, policy_version 69622 (0.0009) +[2023-10-09 07:00:08,492][60144] Updated weights for policy 1, policy_version 69632 (0.0008) +[2023-10-09 07:00:09,401][60143] Updated weights for policy 0, policy_version 68842 (0.0008) +[2023-10-09 07:00:09,769][60143] Updated weights for policy 0, policy_version 68852 (0.0009) +[2023-10-09 07:00:10,129][60143] Updated weights for policy 0, policy_version 68862 (0.0009) +[2023-10-09 07:00:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 141819904. Throughput: 0: 1707.9, 1: 1704.1. Samples: 35460904. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:11,053][59242] Avg episode reward: [(0, '30.690'), (1, '35.500')] +[2023-10-09 07:00:12,374][60144] Updated weights for policy 1, policy_version 69642 (0.0008) +[2023-10-09 07:00:12,744][60144] Updated weights for policy 1, policy_version 69652 (0.0010) +[2023-10-09 07:00:13,110][60144] Updated weights for policy 1, policy_version 69662 (0.0011) +[2023-10-09 07:00:14,168][60143] Updated weights for policy 0, policy_version 68872 (0.0011) +[2023-10-09 07:00:14,536][60143] Updated weights for policy 0, policy_version 68882 (0.0011) +[2023-10-09 07:00:14,902][60143] Updated weights for policy 0, policy_version 68892 (0.0011) +[2023-10-09 07:00:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 141885440. Throughput: 0: 1685.7, 1: 1730.0. Samples: 35481368. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:16,052][59242] Avg episode reward: [(0, '31.100'), (1, '35.480')] +[2023-10-09 07:00:17,079][60144] Updated weights for policy 1, policy_version 69672 (0.0009) +[2023-10-09 07:00:17,457][60144] Updated weights for policy 1, policy_version 69682 (0.0009) +[2023-10-09 07:00:17,826][60144] Updated weights for policy 1, policy_version 69692 (0.0009) +[2023-10-09 07:00:18,966][60143] Updated weights for policy 0, policy_version 68902 (0.0010) +[2023-10-09 07:00:19,334][60143] Updated weights for policy 0, policy_version 68912 (0.0008) +[2023-10-09 07:00:19,702][60143] Updated weights for policy 0, policy_version 68922 (0.0009) +[2023-10-09 07:00:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 141950976. Throughput: 0: 1713.1, 1: 1699.1. Samples: 35492010. 
Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:21,052][59242] Avg episode reward: [(0, '30.820'), (1, '35.330')] +[2023-10-09 07:00:21,759][60144] Updated weights for policy 1, policy_version 69702 (0.0007) +[2023-10-09 07:00:22,114][60144] Updated weights for policy 1, policy_version 69712 (0.0007) +[2023-10-09 07:00:22,488][60144] Updated weights for policy 1, policy_version 69722 (0.0007) +[2023-10-09 07:00:23,732][60143] Updated weights for policy 0, policy_version 68932 (0.0009) +[2023-10-09 07:00:24,096][60143] Updated weights for policy 0, policy_version 68942 (0.0009) +[2023-10-09 07:00:24,474][60143] Updated weights for policy 0, policy_version 68952 (0.0007) +[2023-10-09 07:00:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142016512. Throughput: 0: 1694.9, 1: 1729.7. Samples: 35512564. Policy #0 lag: (min: 8.0, avg: 30.7, max: 40.0) +[2023-10-09 07:00:26,053][59242] Avg episode reward: [(0, '32.070'), (1, '33.270')] +[2023-10-09 07:00:26,349][60144] Updated weights for policy 1, policy_version 69732 (0.0009) +[2023-10-09 07:00:26,717][60144] Updated weights for policy 1, policy_version 69742 (0.0008) +[2023-10-09 07:00:27,086][60144] Updated weights for policy 1, policy_version 69752 (0.0008) +[2023-10-09 07:00:28,519][60143] Updated weights for policy 0, policy_version 68962 (0.0008) +[2023-10-09 07:00:28,885][60143] Updated weights for policy 0, policy_version 68972 (0.0007) +[2023-10-09 07:00:29,247][60143] Updated weights for policy 0, policy_version 68982 (0.0009) +[2023-10-09 07:00:29,611][60143] Updated weights for policy 0, policy_version 68992 (0.0010) +[2023-10-09 07:00:30,823][60144] Updated weights for policy 1, policy_version 69762 (0.0008) +[2023-10-09 07:00:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142082048. Throughput: 0: 1696.0, 1: 1734.6. Samples: 35533366. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:31,053][59242] Avg episode reward: [(0, '32.090'), (1, '33.730')] +[2023-10-09 07:00:31,187][60144] Updated weights for policy 1, policy_version 69772 (0.0011) +[2023-10-09 07:00:31,550][60144] Updated weights for policy 1, policy_version 69782 (0.0008) +[2023-10-09 07:00:31,913][60144] Updated weights for policy 1, policy_version 69792 (0.0007) +[2023-10-09 07:00:33,453][60143] Updated weights for policy 0, policy_version 69002 (0.0007) +[2023-10-09 07:00:33,818][60143] Updated weights for policy 0, policy_version 69012 (0.0010) +[2023-10-09 07:00:34,196][60143] Updated weights for policy 0, policy_version 69022 (0.0008) +[2023-10-09 07:00:35,844][60144] Updated weights for policy 1, policy_version 69802 (0.0008) +[2023-10-09 07:00:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 142147584. Throughput: 0: 1702.6, 1: 1719.0. Samples: 35543580. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:36,052][59242] Avg episode reward: [(0, '33.490'), (1, '33.220')] +[2023-10-09 07:00:36,217][60144] Updated weights for policy 1, policy_version 69812 (0.0007) +[2023-10-09 07:00:36,600][60144] Updated weights for policy 1, policy_version 69822 (0.0009) +[2023-10-09 07:00:38,273][60143] Updated weights for policy 0, policy_version 69032 (0.0008) +[2023-10-09 07:00:38,639][60143] Updated weights for policy 0, policy_version 69042 (0.0007) +[2023-10-09 07:00:39,015][60143] Updated weights for policy 0, policy_version 69052 (0.0008) +[2023-10-09 07:00:40,705][60144] Updated weights for policy 1, policy_version 69832 (0.0010) +[2023-10-09 07:00:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 142213120. Throughput: 0: 1683.1, 1: 1741.6. Samples: 35563890. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:41,053][59242] Avg episode reward: [(0, '33.340'), (1, '33.030')] +[2023-10-09 07:00:41,080][60144] Updated weights for policy 1, policy_version 69842 (0.0009) +[2023-10-09 07:00:41,441][60144] Updated weights for policy 1, policy_version 69852 (0.0007) +[2023-10-09 07:00:42,889][60143] Updated weights for policy 0, policy_version 69062 (0.0009) +[2023-10-09 07:00:43,259][60143] Updated weights for policy 0, policy_version 69072 (0.0008) +[2023-10-09 07:00:43,625][60143] Updated weights for policy 0, policy_version 69082 (0.0008) +[2023-10-09 07:00:45,517][60144] Updated weights for policy 1, policy_version 69862 (0.0009) +[2023-10-09 07:00:45,873][60144] Updated weights for policy 1, policy_version 69872 (0.0009) +[2023-10-09 07:00:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 142278656. Throughput: 0: 1702.9, 1: 1731.5. Samples: 35584426. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:46,053][59242] Avg episode reward: [(0, '33.080'), (1, '32.540')] +[2023-10-09 07:00:46,248][60144] Updated weights for policy 1, policy_version 69882 (0.0008) +[2023-10-09 07:00:47,625][60143] Updated weights for policy 0, policy_version 69092 (0.0009) +[2023-10-09 07:00:47,998][60143] Updated weights for policy 0, policy_version 69102 (0.0009) +[2023-10-09 07:00:48,370][60143] Updated weights for policy 0, policy_version 69112 (0.0011) +[2023-10-09 07:00:50,096][60144] Updated weights for policy 1, policy_version 69892 (0.0007) +[2023-10-09 07:00:50,461][60144] Updated weights for policy 1, policy_version 69902 (0.0007) +[2023-10-09 07:00:50,833][60144] Updated weights for policy 1, policy_version 69912 (0.0009) +[2023-10-09 07:00:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 142344192. Throughput: 0: 1679.5, 1: 1737.0. Samples: 35594240. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:51,053][59242] Avg episode reward: [(0, '34.170'), (1, '33.600')] +[2023-10-09 07:00:52,228][60143] Updated weights for policy 0, policy_version 69122 (0.0008) +[2023-10-09 07:00:52,595][60143] Updated weights for policy 0, policy_version 69132 (0.0008) +[2023-10-09 07:00:52,964][60143] Updated weights for policy 0, policy_version 69142 (0.0011) +[2023-10-09 07:00:53,322][60143] Updated weights for policy 0, policy_version 69152 (0.0009) +[2023-10-09 07:00:54,627][60144] Updated weights for policy 1, policy_version 69922 (0.0011) +[2023-10-09 07:00:55,005][60144] Updated weights for policy 1, policy_version 69932 (0.0011) +[2023-10-09 07:00:55,374][60144] Updated weights for policy 1, policy_version 69942 (0.0009) +[2023-10-09 07:00:55,737][60144] Updated weights for policy 1, policy_version 69952 (0.0010) +[2023-10-09 07:00:56,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142442496. Throughput: 0: 1695.8, 1: 1741.9. Samples: 35615600. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:00:56,053][59242] Avg episode reward: [(0, '33.250'), (1, '34.370')] +[2023-10-09 07:00:57,376][60143] Updated weights for policy 0, policy_version 69162 (0.0008) +[2023-10-09 07:00:57,761][60143] Updated weights for policy 0, policy_version 69172 (0.0008) +[2023-10-09 07:00:58,125][60143] Updated weights for policy 0, policy_version 69182 (0.0010) +[2023-10-09 07:00:59,764][60144] Updated weights for policy 1, policy_version 69962 (0.0011) +[2023-10-09 07:01:00,128][60144] Updated weights for policy 1, policy_version 69972 (0.0010) +[2023-10-09 07:01:00,495][60144] Updated weights for policy 1, policy_version 69982 (0.0009) +[2023-10-09 07:01:01,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 142508032. Throughput: 0: 1716.9, 1: 1705.4. Samples: 35635370. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:01:01,053][59242] Avg episode reward: [(0, '32.380'), (1, '34.890')] +[2023-10-09 07:01:02,184][60143] Updated weights for policy 0, policy_version 69192 (0.0009) +[2023-10-09 07:01:02,552][60143] Updated weights for policy 0, policy_version 69202 (0.0009) +[2023-10-09 07:01:02,935][60143] Updated weights for policy 0, policy_version 69212 (0.0008) +[2023-10-09 07:01:04,378][60144] Updated weights for policy 1, policy_version 69992 (0.0009) +[2023-10-09 07:01:04,741][60144] Updated weights for policy 1, policy_version 70002 (0.0009) +[2023-10-09 07:01:05,111][60144] Updated weights for policy 1, policy_version 70012 (0.0008) +[2023-10-09 07:01:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 142573568. Throughput: 0: 1685.5, 1: 1737.2. Samples: 35646032. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:01:06,053][59242] Avg episode reward: [(0, '32.820'), (1, '32.910')] +[2023-10-09 07:01:06,851][60143] Updated weights for policy 0, policy_version 69222 (0.0008) +[2023-10-09 07:01:07,212][60143] Updated weights for policy 0, policy_version 69232 (0.0008) +[2023-10-09 07:01:07,577][60143] Updated weights for policy 0, policy_version 69242 (0.0010) +[2023-10-09 07:01:09,047][60144] Updated weights for policy 1, policy_version 70022 (0.0010) +[2023-10-09 07:01:09,419][60144] Updated weights for policy 1, policy_version 70032 (0.0007) +[2023-10-09 07:01:09,783][60144] Updated weights for policy 1, policy_version 70042 (0.0009) +[2023-10-09 07:01:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142639104. Throughput: 0: 1706.7, 1: 1718.2. Samples: 35666686. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:01:11,053][59242] Avg episode reward: [(0, '33.640'), (1, '32.700')] +[2023-10-09 07:01:11,546][60143] Updated weights for policy 0, policy_version 69252 (0.0008) +[2023-10-09 07:01:11,907][60143] Updated weights for policy 0, policy_version 69262 (0.0012) +[2023-10-09 07:01:12,280][60143] Updated weights for policy 0, policy_version 69272 (0.0009) +[2023-10-09 07:01:13,820][60144] Updated weights for policy 1, policy_version 70052 (0.0009) +[2023-10-09 07:01:14,183][60144] Updated weights for policy 1, policy_version 70062 (0.0008) +[2023-10-09 07:01:14,544][60144] Updated weights for policy 1, policy_version 70072 (0.0008) +[2023-10-09 07:01:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142704640. Throughput: 0: 1719.1, 1: 1702.8. Samples: 35687350. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) +[2023-10-09 07:01:16,053][59242] Avg episode reward: [(0, '32.720'), (1, '32.520')] +[2023-10-09 07:01:16,289][60143] Updated weights for policy 0, policy_version 69282 (0.0008) +[2023-10-09 07:01:16,653][60143] Updated weights for policy 0, policy_version 69292 (0.0009) +[2023-10-09 07:01:17,013][60143] Updated weights for policy 0, policy_version 69302 (0.0010) +[2023-10-09 07:01:17,383][60143] Updated weights for policy 0, policy_version 69312 (0.0007) +[2023-10-09 07:01:18,634][60144] Updated weights for policy 1, policy_version 70082 (0.0009) +[2023-10-09 07:01:19,005][60144] Updated weights for policy 1, policy_version 70092 (0.0008) +[2023-10-09 07:01:19,368][60144] Updated weights for policy 1, policy_version 70102 (0.0007) +[2023-10-09 07:01:19,736][60144] Updated weights for policy 1, policy_version 70112 (0.0008) +[2023-10-09 07:01:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142770176. Throughput: 0: 1697.0, 1: 1732.3. Samples: 35697896. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:21,053][59242] Avg episode reward: [(0, '32.750'), (1, '30.840')] +[2023-10-09 07:01:21,528][60143] Updated weights for policy 0, policy_version 69322 (0.0007) +[2023-10-09 07:01:21,897][60143] Updated weights for policy 0, policy_version 69332 (0.0008) +[2023-10-09 07:01:22,269][60143] Updated weights for policy 0, policy_version 69342 (0.0008) +[2023-10-09 07:01:23,687][60144] Updated weights for policy 1, policy_version 70122 (0.0010) +[2023-10-09 07:01:24,057][60144] Updated weights for policy 1, policy_version 70132 (0.0011) +[2023-10-09 07:01:24,424][60144] Updated weights for policy 1, policy_version 70142 (0.0010) +[2023-10-09 07:01:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142835712. Throughput: 0: 1720.8, 1: 1704.7. Samples: 35718034. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:26,053][59242] Avg episode reward: [(0, '31.680'), (1, '30.600')] +[2023-10-09 07:01:26,102][60143] Updated weights for policy 0, policy_version 69352 (0.0010) +[2023-10-09 07:01:26,484][60143] Updated weights for policy 0, policy_version 69362 (0.0010) +[2023-10-09 07:01:26,849][60143] Updated weights for policy 0, policy_version 69372 (0.0008) +[2023-10-09 07:01:28,089][60144] Updated weights for policy 1, policy_version 70152 (0.0009) +[2023-10-09 07:01:28,477][60144] Updated weights for policy 1, policy_version 70162 (0.0011) +[2023-10-09 07:01:28,843][60144] Updated weights for policy 1, policy_version 70172 (0.0011) +[2023-10-09 07:01:30,878][60143] Updated weights for policy 0, policy_version 69382 (0.0007) +[2023-10-09 07:01:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142901248. Throughput: 0: 1724.9, 1: 1716.3. Samples: 35739278. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:31,053][59242] Avg episode reward: [(0, '31.470'), (1, '30.440')] +[2023-10-09 07:01:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000070176_71860224.pth... +[2023-10-09 07:01:31,094][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000068576_70221824.pth +[2023-10-09 07:01:31,261][60143] Updated weights for policy 0, policy_version 69392 (0.0008) +[2023-10-09 07:01:31,627][60143] Updated weights for policy 0, policy_version 69402 (0.0008) +[2023-10-09 07:01:31,848][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000069408_71073792.pth... +[2023-10-09 07:01:31,887][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000067808_69435392.pth +[2023-10-09 07:01:32,987][60144] Updated weights for policy 1, policy_version 70182 (0.0009) +[2023-10-09 07:01:33,353][60144] Updated weights for policy 1, policy_version 70192 (0.0007) +[2023-10-09 07:01:33,710][60144] Updated weights for policy 1, policy_version 70202 (0.0009) +[2023-10-09 07:01:35,787][60143] Updated weights for policy 0, policy_version 69412 (0.0008) +[2023-10-09 07:01:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 142966784. Throughput: 0: 1717.2, 1: 1721.7. Samples: 35748992. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:36,052][59242] Avg episode reward: [(0, '32.920'), (1, '29.850')] +[2023-10-09 07:01:36,153][60143] Updated weights for policy 0, policy_version 69422 (0.0009) +[2023-10-09 07:01:36,532][60143] Updated weights for policy 0, policy_version 69432 (0.0007) +[2023-10-09 07:01:37,650][60144] Updated weights for policy 1, policy_version 70212 (0.0010) +[2023-10-09 07:01:38,017][60144] Updated weights for policy 1, policy_version 70222 (0.0007) +[2023-10-09 07:01:38,395][60144] Updated weights for policy 1, policy_version 70232 (0.0008) +[2023-10-09 07:01:40,322][60143] Updated weights for policy 0, policy_version 69442 (0.0007) +[2023-10-09 07:01:40,699][60143] Updated weights for policy 0, policy_version 69452 (0.0008) +[2023-10-09 07:01:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 143032320. Throughput: 0: 1719.4, 1: 1708.3. Samples: 35769846. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:41,053][59242] Avg episode reward: [(0, '33.450'), (1, '30.590')] +[2023-10-09 07:01:41,070][60143] Updated weights for policy 0, policy_version 69462 (0.0008) +[2023-10-09 07:01:41,438][60143] Updated weights for policy 0, policy_version 69472 (0.0007) +[2023-10-09 07:01:42,362][60144] Updated weights for policy 1, policy_version 70242 (0.0008) +[2023-10-09 07:01:42,730][60144] Updated weights for policy 1, policy_version 70252 (0.0009) +[2023-10-09 07:01:43,093][60144] Updated weights for policy 1, policy_version 70262 (0.0009) +[2023-10-09 07:01:43,458][60144] Updated weights for policy 1, policy_version 70272 (0.0009) +[2023-10-09 07:01:45,496][60143] Updated weights for policy 0, policy_version 69482 (0.0009) +[2023-10-09 07:01:45,869][60143] Updated weights for policy 0, policy_version 69492 (0.0010) +[2023-10-09 07:01:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 143097856. 
Throughput: 0: 1707.9, 1: 1736.7. Samples: 35790374. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:46,053][59242] Avg episode reward: [(0, '33.480'), (1, '30.890')] +[2023-10-09 07:01:46,236][60143] Updated weights for policy 0, policy_version 69502 (0.0007) +[2023-10-09 07:01:47,512][60144] Updated weights for policy 1, policy_version 70282 (0.0009) +[2023-10-09 07:01:47,870][60144] Updated weights for policy 1, policy_version 70292 (0.0010) +[2023-10-09 07:01:48,237][60144] Updated weights for policy 1, policy_version 70302 (0.0010) +[2023-10-09 07:01:50,236][60143] Updated weights for policy 0, policy_version 69512 (0.0010) +[2023-10-09 07:01:50,619][60143] Updated weights for policy 0, policy_version 69522 (0.0010) +[2023-10-09 07:01:50,980][60143] Updated weights for policy 0, policy_version 69532 (0.0011) +[2023-10-09 07:01:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 143163392. Throughput: 0: 1716.6, 1: 1706.3. Samples: 35800062. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:51,053][59242] Avg episode reward: [(0, '33.680'), (1, '31.190')] +[2023-10-09 07:01:52,135][60144] Updated weights for policy 1, policy_version 70312 (0.0009) +[2023-10-09 07:01:52,510][60144] Updated weights for policy 1, policy_version 70322 (0.0007) +[2023-10-09 07:01:52,874][60144] Updated weights for policy 1, policy_version 70332 (0.0007) +[2023-10-09 07:01:55,030][60143] Updated weights for policy 0, policy_version 69542 (0.0010) +[2023-10-09 07:01:55,408][60143] Updated weights for policy 0, policy_version 69552 (0.0008) +[2023-10-09 07:01:55,774][60143] Updated weights for policy 0, policy_version 69562 (0.0008) +[2023-10-09 07:01:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 143261696. Throughput: 0: 1711.7, 1: 1719.7. Samples: 35821098. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:01:56,053][59242] Avg episode reward: [(0, '34.650'), (1, '29.990')] +[2023-10-09 07:01:56,846][60144] Updated weights for policy 1, policy_version 70342 (0.0008) +[2023-10-09 07:01:57,212][60144] Updated weights for policy 1, policy_version 70352 (0.0008) +[2023-10-09 07:01:57,571][60144] Updated weights for policy 1, policy_version 70362 (0.0007) +[2023-10-09 07:01:59,749][60143] Updated weights for policy 0, policy_version 69572 (0.0008) +[2023-10-09 07:02:00,109][60143] Updated weights for policy 0, policy_version 69582 (0.0011) +[2023-10-09 07:02:00,476][60143] Updated weights for policy 0, policy_version 69592 (0.0010) +[2023-10-09 07:02:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 143327232. Throughput: 0: 1690.9, 1: 1733.5. Samples: 35841448. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-09 07:02:01,053][59242] Avg episode reward: [(0, '33.520'), (1, '30.910')] +[2023-10-09 07:02:01,560][60144] Updated weights for policy 1, policy_version 70372 (0.0009) +[2023-10-09 07:02:01,932][60144] Updated weights for policy 1, policy_version 70382 (0.0010) +[2023-10-09 07:02:02,307][60144] Updated weights for policy 1, policy_version 70392 (0.0008) +[2023-10-09 07:02:04,322][60143] Updated weights for policy 0, policy_version 69602 (0.0008) +[2023-10-09 07:02:04,687][60143] Updated weights for policy 0, policy_version 69612 (0.0007) +[2023-10-09 07:02:05,054][60143] Updated weights for policy 0, policy_version 69622 (0.0007) +[2023-10-09 07:02:05,436][60143] Updated weights for policy 0, policy_version 69632 (0.0010) +[2023-10-09 07:02:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 143392768. Throughput: 0: 1721.4, 1: 1704.7. Samples: 35852068. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:06,053][59242] Avg episode reward: [(0, '33.120'), (1, '31.100')] +[2023-10-09 07:02:06,248][60144] Updated weights for policy 1, policy_version 70402 (0.0008) +[2023-10-09 07:02:06,610][60144] Updated weights for policy 1, policy_version 70412 (0.0008) +[2023-10-09 07:02:06,971][60144] Updated weights for policy 1, policy_version 70422 (0.0009) +[2023-10-09 07:02:07,342][60144] Updated weights for policy 1, policy_version 70432 (0.0009) +[2023-10-09 07:02:09,553][60143] Updated weights for policy 0, policy_version 69642 (0.0008) +[2023-10-09 07:02:09,918][60143] Updated weights for policy 0, policy_version 69652 (0.0009) +[2023-10-09 07:02:10,286][60143] Updated weights for policy 0, policy_version 69662 (0.0011) +[2023-10-09 07:02:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 143458304. Throughput: 0: 1707.3, 1: 1733.7. Samples: 35872880. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:11,053][59242] Avg episode reward: [(0, '33.530'), (1, '31.690')] +[2023-10-09 07:02:11,217][60144] Updated weights for policy 1, policy_version 70442 (0.0008) +[2023-10-09 07:02:11,575][60144] Updated weights for policy 1, policy_version 70452 (0.0008) +[2023-10-09 07:02:11,948][60144] Updated weights for policy 1, policy_version 70462 (0.0008) +[2023-10-09 07:02:14,318][60143] Updated weights for policy 0, policy_version 69672 (0.0009) +[2023-10-09 07:02:14,685][60143] Updated weights for policy 0, policy_version 69682 (0.0010) +[2023-10-09 07:02:15,051][60143] Updated weights for policy 0, policy_version 69692 (0.0009) +[2023-10-09 07:02:15,973][60144] Updated weights for policy 1, policy_version 70472 (0.0008) +[2023-10-09 07:02:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 143523840. Throughput: 0: 1682.1, 1: 1729.9. Samples: 35892818. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:16,053][59242] Avg episode reward: [(0, '33.980'), (1, '32.790')] +[2023-10-09 07:02:16,343][60144] Updated weights for policy 1, policy_version 70482 (0.0007) +[2023-10-09 07:02:16,711][60144] Updated weights for policy 1, policy_version 70492 (0.0007) +[2023-10-09 07:02:19,171][60143] Updated weights for policy 0, policy_version 69702 (0.0009) +[2023-10-09 07:02:19,550][60143] Updated weights for policy 0, policy_version 69712 (0.0009) +[2023-10-09 07:02:19,926][60143] Updated weights for policy 0, policy_version 69722 (0.0007) +[2023-10-09 07:02:20,497][60144] Updated weights for policy 1, policy_version 70502 (0.0009) +[2023-10-09 07:02:20,866][60144] Updated weights for policy 1, policy_version 70512 (0.0008) +[2023-10-09 07:02:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 143589376. Throughput: 0: 1713.4, 1: 1716.0. Samples: 35903312. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:21,053][59242] Avg episode reward: [(0, '32.370'), (1, '31.690')] +[2023-10-09 07:02:21,224][60144] Updated weights for policy 1, policy_version 70522 (0.0009) +[2023-10-09 07:02:23,786][60143] Updated weights for policy 0, policy_version 69732 (0.0009) +[2023-10-09 07:02:24,156][60143] Updated weights for policy 0, policy_version 69742 (0.0007) +[2023-10-09 07:02:24,521][60143] Updated weights for policy 0, policy_version 69752 (0.0008) +[2023-10-09 07:02:25,012][60144] Updated weights for policy 1, policy_version 70532 (0.0008) +[2023-10-09 07:02:25,381][60144] Updated weights for policy 1, policy_version 70542 (0.0007) +[2023-10-09 07:02:25,748][60144] Updated weights for policy 1, policy_version 70552 (0.0008) +[2023-10-09 07:02:26,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 143687680. Throughput: 0: 1686.5, 1: 1734.8. Samples: 35923804. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:26,053][59242] Avg episode reward: [(0, '34.130'), (1, '31.440')] +[2023-10-09 07:02:28,427][60143] Updated weights for policy 0, policy_version 69762 (0.0007) +[2023-10-09 07:02:28,803][60143] Updated weights for policy 0, policy_version 69772 (0.0009) +[2023-10-09 07:02:29,171][60143] Updated weights for policy 0, policy_version 69782 (0.0008) +[2023-10-09 07:02:29,537][60143] Updated weights for policy 0, policy_version 69792 (0.0008) +[2023-10-09 07:02:29,707][60144] Updated weights for policy 1, policy_version 70562 (0.0008) +[2023-10-09 07:02:30,079][60144] Updated weights for policy 1, policy_version 70572 (0.0009) +[2023-10-09 07:02:30,435][60144] Updated weights for policy 1, policy_version 70582 (0.0009) +[2023-10-09 07:02:30,798][60144] Updated weights for policy 1, policy_version 70592 (0.0010) +[2023-10-09 07:02:31,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 143753216. Throughput: 0: 1692.8, 1: 1716.0. Samples: 35943772. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:31,052][59242] Avg episode reward: [(0, '34.600'), (1, '31.950')] +[2023-10-09 07:02:33,682][60143] Updated weights for policy 0, policy_version 69802 (0.0009) +[2023-10-09 07:02:34,054][60143] Updated weights for policy 0, policy_version 69812 (0.0007) +[2023-10-09 07:02:34,431][60143] Updated weights for policy 0, policy_version 69822 (0.0008) +[2023-10-09 07:02:34,872][60144] Updated weights for policy 1, policy_version 70602 (0.0011) +[2023-10-09 07:02:35,245][60144] Updated weights for policy 1, policy_version 70612 (0.0008) +[2023-10-09 07:02:35,610][60144] Updated weights for policy 1, policy_version 70622 (0.0008) +[2023-10-09 07:02:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 143818752. Throughput: 0: 1710.4, 1: 1739.5. Samples: 35955304. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:36,053][59242] Avg episode reward: [(0, '34.800'), (1, '31.850')] +[2023-10-09 07:02:38,459][60143] Updated weights for policy 0, policy_version 69832 (0.0008) +[2023-10-09 07:02:38,831][60143] Updated weights for policy 0, policy_version 69842 (0.0008) +[2023-10-09 07:02:39,191][60143] Updated weights for policy 0, policy_version 69852 (0.0007) +[2023-10-09 07:02:39,517][60144] Updated weights for policy 1, policy_version 70632 (0.0009) +[2023-10-09 07:02:39,887][60144] Updated weights for policy 1, policy_version 70642 (0.0010) +[2023-10-09 07:02:40,264][60144] Updated weights for policy 1, policy_version 70652 (0.0008) +[2023-10-09 07:02:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 143884288. Throughput: 0: 1686.2, 1: 1737.9. Samples: 35975182. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:41,053][59242] Avg episode reward: [(0, '35.020'), (1, '32.150')] +[2023-10-09 07:02:43,113][60143] Updated weights for policy 0, policy_version 69862 (0.0009) +[2023-10-09 07:02:43,482][60143] Updated weights for policy 0, policy_version 69872 (0.0009) +[2023-10-09 07:02:43,855][60143] Updated weights for policy 0, policy_version 69882 (0.0009) +[2023-10-09 07:02:44,338][60144] Updated weights for policy 1, policy_version 70662 (0.0008) +[2023-10-09 07:02:44,705][60144] Updated weights for policy 1, policy_version 70672 (0.0009) +[2023-10-09 07:02:45,071][60144] Updated weights for policy 1, policy_version 70682 (0.0008) +[2023-10-09 07:02:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 143949824. Throughput: 0: 1710.8, 1: 1713.8. Samples: 35995554. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:46,053][59242] Avg episode reward: [(0, '34.640'), (1, '32.680')] +[2023-10-09 07:02:47,754][60143] Updated weights for policy 0, policy_version 69892 (0.0007) +[2023-10-09 07:02:48,125][60143] Updated weights for policy 0, policy_version 69902 (0.0009) +[2023-10-09 07:02:48,499][60143] Updated weights for policy 0, policy_version 69912 (0.0007) +[2023-10-09 07:02:49,150][60144] Updated weights for policy 1, policy_version 70692 (0.0008) +[2023-10-09 07:02:49,515][60144] Updated weights for policy 1, policy_version 70702 (0.0008) +[2023-10-09 07:02:49,889][60144] Updated weights for policy 1, policy_version 70712 (0.0007) +[2023-10-09 07:02:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 144015360. Throughput: 0: 1688.2, 1: 1741.3. Samples: 36006398. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:02:51,053][59242] Avg episode reward: [(0, '34.480'), (1, '32.230')] +[2023-10-09 07:02:52,429][60143] Updated weights for policy 0, policy_version 69922 (0.0008) +[2023-10-09 07:02:52,790][60143] Updated weights for policy 0, policy_version 69932 (0.0009) +[2023-10-09 07:02:53,160][60143] Updated weights for policy 0, policy_version 69942 (0.0008) +[2023-10-09 07:02:53,529][60143] Updated weights for policy 0, policy_version 69952 (0.0008) +[2023-10-09 07:02:53,830][60144] Updated weights for policy 1, policy_version 70722 (0.0010) +[2023-10-09 07:02:54,204][60144] Updated weights for policy 1, policy_version 70732 (0.0009) +[2023-10-09 07:02:54,567][60144] Updated weights for policy 1, policy_version 70742 (0.0009) +[2023-10-09 07:02:54,942][60144] Updated weights for policy 1, policy_version 70752 (0.0009) +[2023-10-09 07:02:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 144080896. Throughput: 0: 1693.2, 1: 1724.9. Samples: 36026696. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:02:56,053][59242] Avg episode reward: [(0, '34.760'), (1, '32.610')] +[2023-10-09 07:02:57,374][60143] Updated weights for policy 0, policy_version 69962 (0.0008) +[2023-10-09 07:02:57,744][60143] Updated weights for policy 0, policy_version 69972 (0.0007) +[2023-10-09 07:02:58,109][60143] Updated weights for policy 0, policy_version 69982 (0.0011) +[2023-10-09 07:02:58,801][60144] Updated weights for policy 1, policy_version 70762 (0.0008) +[2023-10-09 07:02:59,165][60144] Updated weights for policy 1, policy_version 70772 (0.0009) +[2023-10-09 07:02:59,530][60144] Updated weights for policy 1, policy_version 70782 (0.0007) +[2023-10-09 07:03:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 144146432. Throughput: 0: 1722.4, 1: 1721.6. Samples: 36047800. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:01,053][59242] Avg episode reward: [(0, '33.730'), (1, '34.530')] +[2023-10-09 07:03:02,080][60143] Updated weights for policy 0, policy_version 69992 (0.0010) +[2023-10-09 07:03:02,457][60143] Updated weights for policy 0, policy_version 70002 (0.0010) +[2023-10-09 07:03:02,818][60143] Updated weights for policy 0, policy_version 70012 (0.0009) +[2023-10-09 07:03:03,486][60144] Updated weights for policy 1, policy_version 70792 (0.0008) +[2023-10-09 07:03:03,859][60144] Updated weights for policy 1, policy_version 70802 (0.0007) +[2023-10-09 07:03:04,226][60144] Updated weights for policy 1, policy_version 70812 (0.0007) +[2023-10-09 07:03:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 144211968. Throughput: 0: 1694.4, 1: 1740.5. Samples: 36057884. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:06,053][59242] Avg episode reward: [(0, '36.360'), (1, '33.610')] +[2023-10-09 07:03:06,982][60143] Updated weights for policy 0, policy_version 70022 (0.0008) +[2023-10-09 07:03:07,357][60143] Updated weights for policy 0, policy_version 70032 (0.0008) +[2023-10-09 07:03:07,724][60143] Updated weights for policy 0, policy_version 70042 (0.0009) +[2023-10-09 07:03:08,223][60144] Updated weights for policy 1, policy_version 70822 (0.0008) +[2023-10-09 07:03:08,591][60144] Updated weights for policy 1, policy_version 70832 (0.0007) +[2023-10-09 07:03:08,958][60144] Updated weights for policy 1, policy_version 70842 (0.0008) +[2023-10-09 07:03:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 144277504. Throughput: 0: 1714.6, 1: 1709.3. Samples: 36077880. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:11,054][59242] Avg episode reward: [(0, '37.690'), (1, '34.650')] +[2023-10-09 07:03:11,867][60143] Updated weights for policy 0, policy_version 70052 (0.0008) +[2023-10-09 07:03:12,277][60143] Updated weights for policy 0, policy_version 70062 (0.0009) +[2023-10-09 07:03:12,646][60143] Updated weights for policy 0, policy_version 70072 (0.0008) +[2023-10-09 07:03:12,915][60144] Updated weights for policy 1, policy_version 70852 (0.0008) +[2023-10-09 07:03:13,291][60144] Updated weights for policy 1, policy_version 70862 (0.0010) +[2023-10-09 07:03:13,648][60144] Updated weights for policy 1, policy_version 70872 (0.0008) +[2023-10-09 07:03:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 144343040. Throughput: 0: 1718.2, 1: 1731.5. Samples: 36099008. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:16,053][59242] Avg episode reward: [(0, '36.930'), (1, '33.900')] +[2023-10-09 07:03:16,598][60143] Updated weights for policy 0, policy_version 70082 (0.0007) +[2023-10-09 07:03:16,959][60143] Updated weights for policy 0, policy_version 70092 (0.0009) +[2023-10-09 07:03:17,319][60143] Updated weights for policy 0, policy_version 70102 (0.0007) +[2023-10-09 07:03:17,458][60144] Updated weights for policy 1, policy_version 70882 (0.0011) +[2023-10-09 07:03:17,690][60143] Updated weights for policy 0, policy_version 70112 (0.0007) +[2023-10-09 07:03:17,818][60144] Updated weights for policy 1, policy_version 70892 (0.0009) +[2023-10-09 07:03:18,180][60144] Updated weights for policy 1, policy_version 70902 (0.0007) +[2023-10-09 07:03:18,546][60144] Updated weights for policy 1, policy_version 70912 (0.0007) +[2023-10-09 07:03:21,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 144408576. Throughput: 0: 1689.6, 1: 1711.6. Samples: 36108362. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:21,053][59242] Avg episode reward: [(0, '36.870'), (1, '33.450')] +[2023-10-09 07:03:21,671][60143] Updated weights for policy 0, policy_version 70122 (0.0008) +[2023-10-09 07:03:22,024][60143] Updated weights for policy 0, policy_version 70132 (0.0007) +[2023-10-09 07:03:22,385][60143] Updated weights for policy 0, policy_version 70142 (0.0008) +[2023-10-09 07:03:22,486][60144] Updated weights for policy 1, policy_version 70922 (0.0007) +[2023-10-09 07:03:22,851][60144] Updated weights for policy 1, policy_version 70932 (0.0008) +[2023-10-09 07:03:23,218][60144] Updated weights for policy 1, policy_version 70942 (0.0007) +[2023-10-09 07:03:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 144474112. Throughput: 0: 1715.1, 1: 1718.8. Samples: 36129710. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:26,052][59242] Avg episode reward: [(0, '36.670'), (1, '33.020')] +[2023-10-09 07:03:26,337][60143] Updated weights for policy 0, policy_version 70152 (0.0007) +[2023-10-09 07:03:26,712][60143] Updated weights for policy 0, policy_version 70162 (0.0008) +[2023-10-09 07:03:27,086][60143] Updated weights for policy 0, policy_version 70172 (0.0007) +[2023-10-09 07:03:27,129][60144] Updated weights for policy 1, policy_version 70952 (0.0007) +[2023-10-09 07:03:27,500][60144] Updated weights for policy 1, policy_version 70962 (0.0007) +[2023-10-09 07:03:27,870][60144] Updated weights for policy 1, policy_version 70972 (0.0008) +[2023-10-09 07:03:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 144539648. Throughput: 0: 1710.8, 1: 1742.3. Samples: 36150944. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:31,053][59242] Avg episode reward: [(0, '34.280'), (1, '32.390')] +[2023-10-09 07:03:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000070976_72679424.pth... +[2023-10-09 07:03:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000069376_71041024.pth +[2023-10-09 07:03:31,107][60143] Updated weights for policy 0, policy_version 70182 (0.0010) +[2023-10-09 07:03:31,471][60143] Updated weights for policy 0, policy_version 70192 (0.0007) +[2023-10-09 07:03:31,784][60144] Updated weights for policy 1, policy_version 70982 (0.0009) +[2023-10-09 07:03:31,838][60143] Updated weights for policy 0, policy_version 70202 (0.0009) +[2023-10-09 07:03:32,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000070208_71892992.pth... 
+[2023-10-09 07:03:32,092][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000068608_70254592.pth +[2023-10-09 07:03:32,153][60144] Updated weights for policy 1, policy_version 70992 (0.0007) +[2023-10-09 07:03:32,519][60144] Updated weights for policy 1, policy_version 71002 (0.0007) +[2023-10-09 07:03:35,795][60143] Updated weights for policy 0, policy_version 70212 (0.0008) +[2023-10-09 07:03:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 144605184. Throughput: 0: 1708.3, 1: 1714.4. Samples: 36160420. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:36,052][59242] Avg episode reward: [(0, '34.920'), (1, '32.300')] +[2023-10-09 07:03:36,164][60143] Updated weights for policy 0, policy_version 70222 (0.0008) +[2023-10-09 07:03:36,454][60144] Updated weights for policy 1, policy_version 71012 (0.0007) +[2023-10-09 07:03:36,531][60143] Updated weights for policy 0, policy_version 70232 (0.0009) +[2023-10-09 07:03:36,823][60144] Updated weights for policy 1, policy_version 71022 (0.0009) +[2023-10-09 07:03:37,179][60144] Updated weights for policy 1, policy_version 71032 (0.0007) +[2023-10-09 07:03:40,475][60143] Updated weights for policy 0, policy_version 70242 (0.0007) +[2023-10-09 07:03:40,836][60143] Updated weights for policy 0, policy_version 70252 (0.0007) +[2023-10-09 07:03:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 144670720. Throughput: 0: 1715.7, 1: 1731.5. Samples: 36181820. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-09 07:03:41,052][59242] Avg episode reward: [(0, '35.400'), (1, '32.350')] +[2023-10-09 07:03:41,149][60144] Updated weights for policy 1, policy_version 71042 (0.0009) +[2023-10-09 07:03:41,214][60143] Updated weights for policy 0, policy_version 70262 (0.0008) +[2023-10-09 07:03:41,518][60144] Updated weights for policy 1, policy_version 71052 (0.0008) +[2023-10-09 07:03:41,587][60143] Updated weights for policy 0, policy_version 70272 (0.0008) +[2023-10-09 07:03:41,879][60144] Updated weights for policy 1, policy_version 71062 (0.0008) +[2023-10-09 07:03:42,245][60144] Updated weights for policy 1, policy_version 71072 (0.0010) +[2023-10-09 07:03:45,508][60143] Updated weights for policy 0, policy_version 70282 (0.0011) +[2023-10-09 07:03:45,876][60143] Updated weights for policy 0, policy_version 70292 (0.0010) +[2023-10-09 07:03:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 144736256. Throughput: 0: 1705.8, 1: 1736.4. Samples: 36202696. 
Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:03:46,053][59242] Avg episode reward: [(0, '34.370'), (1, '31.860')] +[2023-10-09 07:03:46,239][60143] Updated weights for policy 0, policy_version 70302 (0.0007) +[2023-10-09 07:03:46,249][60144] Updated weights for policy 1, policy_version 71082 (0.0008) +[2023-10-09 07:03:46,619][60144] Updated weights for policy 1, policy_version 71092 (0.0009) +[2023-10-09 07:03:46,991][60144] Updated weights for policy 1, policy_version 71102 (0.0008) +[2023-10-09 07:03:50,248][60143] Updated weights for policy 0, policy_version 70312 (0.0010) +[2023-10-09 07:03:50,619][60143] Updated weights for policy 0, policy_version 70322 (0.0010) +[2023-10-09 07:03:50,838][60144] Updated weights for policy 1, policy_version 71112 (0.0008) +[2023-10-09 07:03:50,992][60143] Updated weights for policy 0, policy_version 70332 (0.0008) +[2023-10-09 07:03:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 144801792. Throughput: 0: 1712.3, 1: 1717.8. Samples: 36212236. Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:03:51,052][59242] Avg episode reward: [(0, '34.090'), (1, '32.110')] +[2023-10-09 07:03:51,213][60144] Updated weights for policy 1, policy_version 71122 (0.0009) +[2023-10-09 07:03:51,593][60144] Updated weights for policy 1, policy_version 71132 (0.0007) +[2023-10-09 07:03:54,978][60143] Updated weights for policy 0, policy_version 70342 (0.0009) +[2023-10-09 07:03:55,345][60143] Updated weights for policy 0, policy_version 70352 (0.0009) +[2023-10-09 07:03:55,505][60144] Updated weights for policy 1, policy_version 71142 (0.0008) +[2023-10-09 07:03:55,716][60143] Updated weights for policy 0, policy_version 70362 (0.0008) +[2023-10-09 07:03:55,876][60144] Updated weights for policy 1, policy_version 71152 (0.0007) +[2023-10-09 07:03:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 144900096. 
Throughput: 0: 1711.8, 1: 1740.4. Samples: 36233228. Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:03:56,053][59242] Avg episode reward: [(0, '35.130'), (1, '32.340')] +[2023-10-09 07:03:56,246][60144] Updated weights for policy 1, policy_version 71162 (0.0007) +[2023-10-09 07:03:59,753][60143] Updated weights for policy 0, policy_version 70372 (0.0008) +[2023-10-09 07:04:00,128][60143] Updated weights for policy 0, policy_version 70382 (0.0009) +[2023-10-09 07:04:00,133][60144] Updated weights for policy 1, policy_version 71172 (0.0008) +[2023-10-09 07:04:00,497][60143] Updated weights for policy 0, policy_version 70392 (0.0009) +[2023-10-09 07:04:00,505][60144] Updated weights for policy 1, policy_version 71182 (0.0009) +[2023-10-09 07:04:00,877][60144] Updated weights for policy 1, policy_version 71192 (0.0009) +[2023-10-09 07:04:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 144965632. Throughput: 0: 1692.7, 1: 1727.5. Samples: 36252916. Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:04:01,053][59242] Avg episode reward: [(0, '35.340'), (1, '33.620')] +[2023-10-09 07:04:04,374][60143] Updated weights for policy 0, policy_version 70402 (0.0008) +[2023-10-09 07:04:04,747][60143] Updated weights for policy 0, policy_version 70412 (0.0009) +[2023-10-09 07:04:05,026][60144] Updated weights for policy 1, policy_version 71202 (0.0008) +[2023-10-09 07:04:05,111][60143] Updated weights for policy 0, policy_version 70422 (0.0007) +[2023-10-09 07:04:05,391][60144] Updated weights for policy 1, policy_version 71212 (0.0008) +[2023-10-09 07:04:05,479][60143] Updated weights for policy 0, policy_version 70432 (0.0008) +[2023-10-09 07:04:05,757][60144] Updated weights for policy 1, policy_version 71222 (0.0009) +[2023-10-09 07:04:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 145031168. Throughput: 0: 1720.7, 1: 1735.0. Samples: 36263866. 
Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:04:06,053][59242] Avg episode reward: [(0, '33.910'), (1, '32.330')] +[2023-10-09 07:04:06,122][60144] Updated weights for policy 1, policy_version 71232 (0.0008) +[2023-10-09 07:04:09,576][60143] Updated weights for policy 0, policy_version 70442 (0.0007) +[2023-10-09 07:04:09,947][60143] Updated weights for policy 0, policy_version 70452 (0.0008) +[2023-10-09 07:04:10,005][60144] Updated weights for policy 1, policy_version 71242 (0.0007) +[2023-10-09 07:04:10,321][60143] Updated weights for policy 0, policy_version 70462 (0.0009) +[2023-10-09 07:04:10,380][60144] Updated weights for policy 1, policy_version 71252 (0.0007) +[2023-10-09 07:04:10,744][60144] Updated weights for policy 1, policy_version 71262 (0.0008) +[2023-10-09 07:04:11,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 145129472. Throughput: 0: 1709.9, 1: 1733.6. Samples: 36284672. Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:04:11,053][59242] Avg episode reward: [(0, '34.720'), (1, '32.650')] +[2023-10-09 07:04:14,273][60143] Updated weights for policy 0, policy_version 70472 (0.0010) +[2023-10-09 07:04:14,644][60143] Updated weights for policy 0, policy_version 70482 (0.0007) +[2023-10-09 07:04:14,866][60144] Updated weights for policy 1, policy_version 71272 (0.0008) +[2023-10-09 07:04:15,010][60143] Updated weights for policy 0, policy_version 70492 (0.0007) +[2023-10-09 07:04:15,234][60144] Updated weights for policy 1, policy_version 71282 (0.0008) +[2023-10-09 07:04:15,608][60144] Updated weights for policy 1, policy_version 71292 (0.0008) +[2023-10-09 07:04:16,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 145195008. Throughput: 0: 1685.7, 1: 1708.3. Samples: 36303670. 
Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:04:16,053][59242] Avg episode reward: [(0, '34.350'), (1, '33.390')] +[2023-10-09 07:04:18,914][60143] Updated weights for policy 0, policy_version 70502 (0.0010) +[2023-10-09 07:04:19,287][60143] Updated weights for policy 0, policy_version 70512 (0.0009) +[2023-10-09 07:04:19,492][60144] Updated weights for policy 1, policy_version 71302 (0.0008) +[2023-10-09 07:04:19,655][60143] Updated weights for policy 0, policy_version 70522 (0.0007) +[2023-10-09 07:04:19,850][60144] Updated weights for policy 1, policy_version 71312 (0.0007) +[2023-10-09 07:04:20,222][60144] Updated weights for policy 1, policy_version 71322 (0.0010) +[2023-10-09 07:04:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 145260544. Throughput: 0: 1714.7, 1: 1731.9. Samples: 36315520. Policy #0 lag: (min: 3.0, avg: 5.9, max: 35.0) +[2023-10-09 07:04:21,053][59242] Avg episode reward: [(0, '33.270'), (1, '34.030')] +[2023-10-09 07:04:23,776][60143] Updated weights for policy 0, policy_version 70532 (0.0008) +[2023-10-09 07:04:24,032][60144] Updated weights for policy 1, policy_version 71332 (0.0008) +[2023-10-09 07:04:24,144][60143] Updated weights for policy 0, policy_version 70542 (0.0009) +[2023-10-09 07:04:24,402][60144] Updated weights for policy 1, policy_version 71342 (0.0008) +[2023-10-09 07:04:24,513][60143] Updated weights for policy 0, policy_version 70552 (0.0007) +[2023-10-09 07:04:24,761][60144] Updated weights for policy 1, policy_version 71352 (0.0007) +[2023-10-09 07:04:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 145326080. Throughput: 0: 1691.3, 1: 1717.1. Samples: 36335198. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:26,053][59242] Avg episode reward: [(0, '33.920'), (1, '34.830')] +[2023-10-09 07:04:28,440][60143] Updated weights for policy 0, policy_version 70562 (0.0007) +[2023-10-09 07:04:28,726][60144] Updated weights for policy 1, policy_version 71362 (0.0008) +[2023-10-09 07:04:28,806][60143] Updated weights for policy 0, policy_version 70572 (0.0008) +[2023-10-09 07:04:29,082][60144] Updated weights for policy 1, policy_version 71372 (0.0007) +[2023-10-09 07:04:29,173][60143] Updated weights for policy 0, policy_version 70582 (0.0008) +[2023-10-09 07:04:29,457][60144] Updated weights for policy 1, policy_version 71382 (0.0008) +[2023-10-09 07:04:29,537][60143] Updated weights for policy 0, policy_version 70592 (0.0007) +[2023-10-09 07:04:29,823][60144] Updated weights for policy 1, policy_version 71392 (0.0008) +[2023-10-09 07:04:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 145391616. Throughput: 0: 1688.4, 1: 1706.6. Samples: 36355474. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:31,053][59242] Avg episode reward: [(0, '34.680'), (1, '32.510')] +[2023-10-09 07:04:33,351][60143] Updated weights for policy 0, policy_version 70602 (0.0008) +[2023-10-09 07:04:33,706][60144] Updated weights for policy 1, policy_version 71402 (0.0008) +[2023-10-09 07:04:33,716][60143] Updated weights for policy 0, policy_version 70612 (0.0008) +[2023-10-09 07:04:34,067][60144] Updated weights for policy 1, policy_version 71412 (0.0007) +[2023-10-09 07:04:34,079][60143] Updated weights for policy 0, policy_version 70622 (0.0009) +[2023-10-09 07:04:34,431][60144] Updated weights for policy 1, policy_version 71422 (0.0009) +[2023-10-09 07:04:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 145457152. Throughput: 0: 1705.5, 1: 1729.9. Samples: 36366828. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:36,052][59242] Avg episode reward: [(0, '33.230'), (1, '33.090')] +[2023-10-09 07:04:37,983][60143] Updated weights for policy 0, policy_version 70632 (0.0009) +[2023-10-09 07:04:38,351][60143] Updated weights for policy 0, policy_version 70642 (0.0007) +[2023-10-09 07:04:38,400][60144] Updated weights for policy 1, policy_version 71432 (0.0010) +[2023-10-09 07:04:38,715][60143] Updated weights for policy 0, policy_version 70652 (0.0008) +[2023-10-09 07:04:38,764][60144] Updated weights for policy 1, policy_version 71442 (0.0008) +[2023-10-09 07:04:39,133][60144] Updated weights for policy 1, policy_version 71452 (0.0008) +[2023-10-09 07:04:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 145522688. Throughput: 0: 1693.8, 1: 1707.1. Samples: 36386268. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:41,053][59242] Avg episode reward: [(0, '31.280'), (1, '33.800')] +[2023-10-09 07:04:42,686][60143] Updated weights for policy 0, policy_version 70662 (0.0009) +[2023-10-09 07:04:43,050][60143] Updated weights for policy 0, policy_version 70672 (0.0008) +[2023-10-09 07:04:43,154][60144] Updated weights for policy 1, policy_version 71462 (0.0009) +[2023-10-09 07:04:43,413][60143] Updated weights for policy 0, policy_version 70682 (0.0008) +[2023-10-09 07:04:43,541][60144] Updated weights for policy 1, policy_version 71472 (0.0009) +[2023-10-09 07:04:43,905][60144] Updated weights for policy 1, policy_version 71482 (0.0010) +[2023-10-09 07:04:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 145588224. Throughput: 0: 1715.2, 1: 1719.1. Samples: 36407462. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:46,053][59242] Avg episode reward: [(0, '32.010'), (1, '34.010')] +[2023-10-09 07:04:47,494][60143] Updated weights for policy 0, policy_version 70692 (0.0008) +[2023-10-09 07:04:47,791][60144] Updated weights for policy 1, policy_version 71492 (0.0007) +[2023-10-09 07:04:47,880][60143] Updated weights for policy 0, policy_version 70702 (0.0008) +[2023-10-09 07:04:48,162][60144] Updated weights for policy 1, policy_version 71502 (0.0009) +[2023-10-09 07:04:48,249][60143] Updated weights for policy 0, policy_version 70712 (0.0009) +[2023-10-09 07:04:48,530][60144] Updated weights for policy 1, policy_version 71512 (0.0008) +[2023-10-09 07:04:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 145653760. Throughput: 0: 1691.5, 1: 1715.9. Samples: 36417196. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:51,053][59242] Avg episode reward: [(0, '31.380'), (1, '35.420')] +[2023-10-09 07:04:52,341][60143] Updated weights for policy 0, policy_version 70722 (0.0008) +[2023-10-09 07:04:52,462][60144] Updated weights for policy 1, policy_version 71522 (0.0008) +[2023-10-09 07:04:52,705][60143] Updated weights for policy 0, policy_version 70732 (0.0010) +[2023-10-09 07:04:52,830][60144] Updated weights for policy 1, policy_version 71532 (0.0007) +[2023-10-09 07:04:53,073][60143] Updated weights for policy 0, policy_version 70742 (0.0007) +[2023-10-09 07:04:53,193][60144] Updated weights for policy 1, policy_version 71542 (0.0007) +[2023-10-09 07:04:53,451][60143] Updated weights for policy 0, policy_version 70752 (0.0008) +[2023-10-09 07:04:53,565][60144] Updated weights for policy 1, policy_version 71552 (0.0009) +[2023-10-09 07:04:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 145719296. Throughput: 0: 1701.1, 1: 1706.5. Samples: 36438014. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:04:56,053][59242] Avg episode reward: [(0, '30.900'), (1, '35.430')] +[2023-10-09 07:04:57,456][60144] Updated weights for policy 1, policy_version 71562 (0.0009) +[2023-10-09 07:04:57,527][60143] Updated weights for policy 0, policy_version 70762 (0.0007) +[2023-10-09 07:04:57,820][60144] Updated weights for policy 1, policy_version 71572 (0.0008) +[2023-10-09 07:04:57,891][60143] Updated weights for policy 0, policy_version 70772 (0.0007) +[2023-10-09 07:04:58,190][60144] Updated weights for policy 1, policy_version 71582 (0.0008) +[2023-10-09 07:04:58,258][60143] Updated weights for policy 0, policy_version 70782 (0.0009) +[2023-10-09 07:05:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 145784832. Throughput: 0: 1731.4, 1: 1732.8. Samples: 36459562. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:05:01,053][59242] Avg episode reward: [(0, '33.590'), (1, '35.330')] +[2023-10-09 07:05:02,177][60144] Updated weights for policy 1, policy_version 71592 (0.0008) +[2023-10-09 07:05:02,318][60143] Updated weights for policy 0, policy_version 70792 (0.0009) +[2023-10-09 07:05:02,541][60144] Updated weights for policy 1, policy_version 71602 (0.0010) +[2023-10-09 07:05:02,689][60143] Updated weights for policy 0, policy_version 70802 (0.0007) +[2023-10-09 07:05:02,907][60144] Updated weights for policy 1, policy_version 71612 (0.0008) +[2023-10-09 07:05:03,052][60143] Updated weights for policy 0, policy_version 70812 (0.0008) +[2023-10-09 07:05:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 145850368. Throughput: 0: 1698.0, 1: 1708.0. Samples: 36468790. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:05:06,053][59242] Avg episode reward: [(0, '33.440'), (1, '35.410')] +[2023-10-09 07:05:07,003][60143] Updated weights for policy 0, policy_version 70822 (0.0007) +[2023-10-09 07:05:07,018][60144] Updated weights for policy 1, policy_version 71622 (0.0008) +[2023-10-09 07:05:07,373][60143] Updated weights for policy 0, policy_version 70832 (0.0008) +[2023-10-09 07:05:07,388][60144] Updated weights for policy 1, policy_version 71632 (0.0009) +[2023-10-09 07:05:07,743][60143] Updated weights for policy 0, policy_version 70842 (0.0008) +[2023-10-09 07:05:07,760][60144] Updated weights for policy 1, policy_version 71642 (0.0007) +[2023-10-09 07:05:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 145915904. Throughput: 0: 1712.8, 1: 1719.1. Samples: 36489630. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-09 07:05:11,053][59242] Avg episode reward: [(0, '33.490'), (1, '35.780')] +[2023-10-09 07:05:11,770][60144] Updated weights for policy 1, policy_version 71652 (0.0008) +[2023-10-09 07:05:11,869][60143] Updated weights for policy 0, policy_version 70852 (0.0008) +[2023-10-09 07:05:12,136][60144] Updated weights for policy 1, policy_version 71662 (0.0010) +[2023-10-09 07:05:12,243][60143] Updated weights for policy 0, policy_version 70862 (0.0008) +[2023-10-09 07:05:12,506][60144] Updated weights for policy 1, policy_version 71672 (0.0007) +[2023-10-09 07:05:12,609][60143] Updated weights for policy 0, policy_version 70872 (0.0008) +[2023-10-09 07:05:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 145981440. Throughput: 0: 1725.5, 1: 1732.2. Samples: 36511070. 
Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:16,053][59242] Avg episode reward: [(0, '32.290'), (1, '35.830')] +[2023-10-09 07:05:16,407][60144] Updated weights for policy 1, policy_version 71682 (0.0008) +[2023-10-09 07:05:16,587][60143] Updated weights for policy 0, policy_version 70882 (0.0007) +[2023-10-09 07:05:16,776][60144] Updated weights for policy 1, policy_version 71692 (0.0007) +[2023-10-09 07:05:16,958][60143] Updated weights for policy 0, policy_version 70892 (0.0009) +[2023-10-09 07:05:17,138][60144] Updated weights for policy 1, policy_version 71702 (0.0007) +[2023-10-09 07:05:17,322][60143] Updated weights for policy 0, policy_version 70902 (0.0009) +[2023-10-09 07:05:17,504][60144] Updated weights for policy 1, policy_version 71712 (0.0008) +[2023-10-09 07:05:17,687][60143] Updated weights for policy 0, policy_version 70912 (0.0009) +[2023-10-09 07:05:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 146046976. Throughput: 0: 1701.9, 1: 1707.6. Samples: 36520256. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:21,053][59242] Avg episode reward: [(0, '33.110'), (1, '36.950')] +[2023-10-09 07:05:21,541][60143] Updated weights for policy 0, policy_version 70922 (0.0009) +[2023-10-09 07:05:21,570][60144] Updated weights for policy 1, policy_version 71722 (0.0008) +[2023-10-09 07:05:21,912][60143] Updated weights for policy 0, policy_version 70932 (0.0009) +[2023-10-09 07:05:21,940][60144] Updated weights for policy 1, policy_version 71732 (0.0009) +[2023-10-09 07:05:22,287][60143] Updated weights for policy 0, policy_version 70942 (0.0008) +[2023-10-09 07:05:22,311][60144] Updated weights for policy 1, policy_version 71742 (0.0009) +[2023-10-09 07:05:25,980][60144] Updated weights for policy 1, policy_version 71752 (0.0008) +[2023-10-09 07:05:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 146112512. 
Throughput: 0: 1717.3, 1: 1732.4. Samples: 36541502. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:26,053][59242] Avg episode reward: [(0, '32.850'), (1, '37.070')] +[2023-10-09 07:05:26,341][60144] Updated weights for policy 1, policy_version 71762 (0.0008) +[2023-10-09 07:05:26,366][60143] Updated weights for policy 0, policy_version 70952 (0.0009) +[2023-10-09 07:05:26,703][60144] Updated weights for policy 1, policy_version 71772 (0.0007) +[2023-10-09 07:05:26,741][60143] Updated weights for policy 0, policy_version 70962 (0.0009) +[2023-10-09 07:05:27,109][60143] Updated weights for policy 0, policy_version 70972 (0.0007) +[2023-10-09 07:05:30,720][60144] Updated weights for policy 1, policy_version 71782 (0.0008) +[2023-10-09 07:05:30,938][60143] Updated weights for policy 0, policy_version 70982 (0.0010) +[2023-10-09 07:05:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 146178048. Throughput: 0: 1719.3, 1: 1731.3. Samples: 36562742. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:31,053][59242] Avg episode reward: [(0, '32.300'), (1, '37.930')] +[2023-10-09 07:05:31,108][60144] Updated weights for policy 1, policy_version 71792 (0.0007) +[2023-10-09 07:05:31,304][60143] Updated weights for policy 0, policy_version 70992 (0.0009) +[2023-10-09 07:05:31,462][60144] Updated weights for policy 1, policy_version 71802 (0.0009) +[2023-10-09 07:05:31,675][60143] Updated weights for policy 0, policy_version 71002 (0.0008) +[2023-10-09 07:05:31,677][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000071808_73531392.pth... +[2023-10-09 07:05:31,705][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000070176_71860224.pth +[2023-10-09 07:05:31,895][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000071008_72712192.pth... 
+[2023-10-09 07:05:31,932][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000069408_71073792.pth +[2023-10-09 07:05:35,423][60144] Updated weights for policy 1, policy_version 71812 (0.0008) +[2023-10-09 07:05:35,602][60143] Updated weights for policy 0, policy_version 71012 (0.0008) +[2023-10-09 07:05:35,788][60144] Updated weights for policy 1, policy_version 71822 (0.0008) +[2023-10-09 07:05:35,983][60143] Updated weights for policy 0, policy_version 71022 (0.0008) +[2023-10-09 07:05:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 146243584. Throughput: 0: 1717.3, 1: 1723.4. Samples: 36572030. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:36,053][59242] Avg episode reward: [(0, '32.800'), (1, '36.960')] +[2023-10-09 07:05:36,162][60144] Updated weights for policy 1, policy_version 71832 (0.0008) +[2023-10-09 07:05:36,345][60143] Updated weights for policy 0, policy_version 71032 (0.0009) +[2023-10-09 07:05:40,224][60144] Updated weights for policy 1, policy_version 71842 (0.0008) +[2023-10-09 07:05:40,313][60143] Updated weights for policy 0, policy_version 71042 (0.0007) +[2023-10-09 07:05:40,594][60144] Updated weights for policy 1, policy_version 71852 (0.0008) +[2023-10-09 07:05:40,687][60143] Updated weights for policy 0, policy_version 71052 (0.0008) +[2023-10-09 07:05:40,954][60144] Updated weights for policy 1, policy_version 71862 (0.0009) +[2023-10-09 07:05:41,052][60143] Updated weights for policy 0, policy_version 71062 (0.0007) +[2023-10-09 07:05:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 146309120. Throughput: 0: 1722.1, 1: 1730.6. Samples: 36593388. 
Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:41,053][59242] Avg episode reward: [(0, '31.460'), (1, '35.030')] +[2023-10-09 07:05:41,314][60144] Updated weights for policy 1, policy_version 71872 (0.0007) +[2023-10-09 07:05:41,423][60143] Updated weights for policy 0, policy_version 71072 (0.0007) +[2023-10-09 07:05:45,300][60143] Updated weights for policy 0, policy_version 71082 (0.0007) +[2023-10-09 07:05:45,361][60144] Updated weights for policy 1, policy_version 71882 (0.0008) +[2023-10-09 07:05:45,669][60143] Updated weights for policy 0, policy_version 71092 (0.0009) +[2023-10-09 07:05:45,721][60144] Updated weights for policy 1, policy_version 71892 (0.0009) +[2023-10-09 07:05:46,041][60143] Updated weights for policy 0, policy_version 71102 (0.0008) +[2023-10-09 07:05:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 146374656. Throughput: 0: 1703.6, 1: 1711.6. Samples: 36613244. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:46,052][59242] Avg episode reward: [(0, '33.380'), (1, '36.060')] +[2023-10-09 07:05:46,086][60144] Updated weights for policy 1, policy_version 71902 (0.0007) +[2023-10-09 07:05:49,860][60144] Updated weights for policy 1, policy_version 71912 (0.0009) +[2023-10-09 07:05:50,056][60143] Updated weights for policy 0, policy_version 71112 (0.0008) +[2023-10-09 07:05:50,226][60144] Updated weights for policy 1, policy_version 71922 (0.0008) +[2023-10-09 07:05:50,424][60143] Updated weights for policy 0, policy_version 71122 (0.0008) +[2023-10-09 07:05:50,588][60144] Updated weights for policy 1, policy_version 71932 (0.0009) +[2023-10-09 07:05:50,792][60143] Updated weights for policy 0, policy_version 71132 (0.0009) +[2023-10-09 07:05:51,052][59242] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 146505728. Throughput: 0: 1716.3, 1: 1730.2. Samples: 36623884. 
Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:51,053][59242] Avg episode reward: [(0, '32.990'), (1, '35.350')] +[2023-10-09 07:05:54,640][60144] Updated weights for policy 1, policy_version 71942 (0.0009) +[2023-10-09 07:05:54,827][60143] Updated weights for policy 0, policy_version 71142 (0.0009) +[2023-10-09 07:05:55,004][60144] Updated weights for policy 1, policy_version 71952 (0.0008) +[2023-10-09 07:05:55,187][60143] Updated weights for policy 0, policy_version 71152 (0.0008) +[2023-10-09 07:05:55,376][60144] Updated weights for policy 1, policy_version 71962 (0.0009) +[2023-10-09 07:05:55,553][60143] Updated weights for policy 0, policy_version 71162 (0.0009) +[2023-10-09 07:05:56,052][59242] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 146571264. Throughput: 0: 1722.8, 1: 1722.3. Samples: 36644660. Policy #0 lag: (min: 2.0, avg: 5.0, max: 34.0) +[2023-10-09 07:05:56,052][59242] Avg episode reward: [(0, '31.490'), (1, '33.940')] +[2023-10-09 07:05:59,351][60144] Updated weights for policy 1, policy_version 71972 (0.0008) +[2023-10-09 07:05:59,667][60143] Updated weights for policy 0, policy_version 71172 (0.0011) +[2023-10-09 07:05:59,708][60144] Updated weights for policy 1, policy_version 71982 (0.0008) +[2023-10-09 07:06:00,045][60143] Updated weights for policy 0, policy_version 71182 (0.0009) +[2023-10-09 07:06:00,075][60144] Updated weights for policy 1, policy_version 71992 (0.0009) +[2023-10-09 07:06:00,417][60143] Updated weights for policy 0, policy_version 71192 (0.0009) +[2023-10-09 07:06:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 146636800. Throughput: 0: 1693.5, 1: 1693.3. Samples: 36663478. 
Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:01,053][59242] Avg episode reward: [(0, '31.420'), (1, '33.400')] +[2023-10-09 07:06:04,137][60144] Updated weights for policy 1, policy_version 72002 (0.0007) +[2023-10-09 07:06:04,454][60143] Updated weights for policy 0, policy_version 71202 (0.0010) +[2023-10-09 07:06:04,512][60144] Updated weights for policy 1, policy_version 72012 (0.0008) +[2023-10-09 07:06:04,817][60143] Updated weights for policy 0, policy_version 71212 (0.0008) +[2023-10-09 07:06:04,868][60144] Updated weights for policy 1, policy_version 72022 (0.0007) +[2023-10-09 07:06:05,182][60143] Updated weights for policy 0, policy_version 71222 (0.0009) +[2023-10-09 07:06:05,241][60144] Updated weights for policy 1, policy_version 72032 (0.0007) +[2023-10-09 07:06:05,549][60143] Updated weights for policy 0, policy_version 71232 (0.0009) +[2023-10-09 07:06:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 146702336. Throughput: 0: 1717.1, 1: 1724.2. Samples: 36675114. Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:06,052][59242] Avg episode reward: [(0, '31.410'), (1, '33.200')] +[2023-10-09 07:06:09,147][60144] Updated weights for policy 1, policy_version 72042 (0.0008) +[2023-10-09 07:06:09,480][60143] Updated weights for policy 0, policy_version 71242 (0.0010) +[2023-10-09 07:06:09,522][60144] Updated weights for policy 1, policy_version 72052 (0.0007) +[2023-10-09 07:06:09,857][60143] Updated weights for policy 0, policy_version 71252 (0.0009) +[2023-10-09 07:06:09,884][60144] Updated weights for policy 1, policy_version 72062 (0.0007) +[2023-10-09 07:06:10,229][60143] Updated weights for policy 0, policy_version 71262 (0.0008) +[2023-10-09 07:06:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 146767872. Throughput: 0: 1711.4, 1: 1702.3. Samples: 36695118. 
Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:11,053][59242] Avg episode reward: [(0, '32.060'), (1, '35.580')] +[2023-10-09 07:06:13,734][60144] Updated weights for policy 1, policy_version 72072 (0.0008) +[2023-10-09 07:06:14,107][60144] Updated weights for policy 1, policy_version 72082 (0.0009) +[2023-10-09 07:06:14,295][60143] Updated weights for policy 0, policy_version 71272 (0.0009) +[2023-10-09 07:06:14,471][60144] Updated weights for policy 1, policy_version 72092 (0.0007) +[2023-10-09 07:06:14,658][60143] Updated weights for policy 0, policy_version 71282 (0.0009) +[2023-10-09 07:06:15,035][60143] Updated weights for policy 0, policy_version 71292 (0.0011) +[2023-10-09 07:06:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 146833408. Throughput: 0: 1685.6, 1: 1697.1. Samples: 36714960. Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:16,053][59242] Avg episode reward: [(0, '31.510'), (1, '34.720')] +[2023-10-09 07:06:18,578][60144] Updated weights for policy 1, policy_version 72102 (0.0007) +[2023-10-09 07:06:18,850][60143] Updated weights for policy 0, policy_version 71302 (0.0010) +[2023-10-09 07:06:18,965][60144] Updated weights for policy 1, policy_version 72112 (0.0008) +[2023-10-09 07:06:19,214][60143] Updated weights for policy 0, policy_version 71312 (0.0008) +[2023-10-09 07:06:19,336][60144] Updated weights for policy 1, policy_version 72122 (0.0008) +[2023-10-09 07:06:19,577][60143] Updated weights for policy 0, policy_version 71322 (0.0007) +[2023-10-09 07:06:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 146898944. Throughput: 0: 1718.9, 1: 1716.5. Samples: 36726626. 
Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:21,053][59242] Avg episode reward: [(0, '30.890'), (1, '33.460')] +[2023-10-09 07:06:23,304][60144] Updated weights for policy 1, policy_version 72132 (0.0008) +[2023-10-09 07:06:23,610][60143] Updated weights for policy 0, policy_version 71332 (0.0009) +[2023-10-09 07:06:23,671][60144] Updated weights for policy 1, policy_version 72142 (0.0007) +[2023-10-09 07:06:23,999][60143] Updated weights for policy 0, policy_version 71342 (0.0009) +[2023-10-09 07:06:24,033][60144] Updated weights for policy 1, policy_version 72152 (0.0007) +[2023-10-09 07:06:24,361][60143] Updated weights for policy 0, policy_version 71352 (0.0008) +[2023-10-09 07:06:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 146964480. Throughput: 0: 1685.6, 1: 1691.4. Samples: 36745354. Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:26,053][59242] Avg episode reward: [(0, '30.370'), (1, '33.410')] +[2023-10-09 07:06:28,075][60144] Updated weights for policy 1, policy_version 72162 (0.0007) +[2023-10-09 07:06:28,397][60143] Updated weights for policy 0, policy_version 71362 (0.0008) +[2023-10-09 07:06:28,453][60144] Updated weights for policy 1, policy_version 72172 (0.0010) +[2023-10-09 07:06:28,766][60143] Updated weights for policy 0, policy_version 71372 (0.0007) +[2023-10-09 07:06:28,810][60144] Updated weights for policy 1, policy_version 72182 (0.0009) +[2023-10-09 07:06:29,137][60143] Updated weights for policy 0, policy_version 71382 (0.0009) +[2023-10-09 07:06:29,180][60144] Updated weights for policy 1, policy_version 72192 (0.0008) +[2023-10-09 07:06:29,503][60143] Updated weights for policy 0, policy_version 71392 (0.0008) +[2023-10-09 07:06:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 147030016. Throughput: 0: 1691.9, 1: 1710.0. Samples: 36766326. 
Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:31,053][59242] Avg episode reward: [(0, '30.790'), (1, '33.410')] +[2023-10-09 07:06:33,228][60144] Updated weights for policy 1, policy_version 72202 (0.0007) +[2023-10-09 07:06:33,526][60143] Updated weights for policy 0, policy_version 71402 (0.0010) +[2023-10-09 07:06:33,595][60144] Updated weights for policy 1, policy_version 72212 (0.0007) +[2023-10-09 07:06:33,887][60143] Updated weights for policy 0, policy_version 71412 (0.0008) +[2023-10-09 07:06:33,963][60144] Updated weights for policy 1, policy_version 72222 (0.0008) +[2023-10-09 07:06:34,273][60143] Updated weights for policy 0, policy_version 71422 (0.0007) +[2023-10-09 07:06:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 147095552. Throughput: 0: 1700.6, 1: 1700.7. Samples: 36776942. Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:36,052][59242] Avg episode reward: [(0, '30.640'), (1, '34.080')] +[2023-10-09 07:06:37,859][60144] Updated weights for policy 1, policy_version 72232 (0.0008) +[2023-10-09 07:06:38,223][60144] Updated weights for policy 1, policy_version 72242 (0.0010) +[2023-10-09 07:06:38,310][60143] Updated weights for policy 0, policy_version 71432 (0.0007) +[2023-10-09 07:06:38,586][60144] Updated weights for policy 1, policy_version 72252 (0.0007) +[2023-10-09 07:06:38,669][60143] Updated weights for policy 0, policy_version 71442 (0.0009) +[2023-10-09 07:06:39,045][60143] Updated weights for policy 0, policy_version 71452 (0.0009) +[2023-10-09 07:06:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 147161088. Throughput: 0: 1680.8, 1: 1697.7. Samples: 36796694. 
Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:41,053][59242] Avg episode reward: [(0, '30.780'), (1, '34.480')] +[2023-10-09 07:06:42,403][60144] Updated weights for policy 1, policy_version 72262 (0.0008) +[2023-10-09 07:06:42,770][60144] Updated weights for policy 1, policy_version 72272 (0.0010) +[2023-10-09 07:06:43,117][60143] Updated weights for policy 0, policy_version 71462 (0.0008) +[2023-10-09 07:06:43,136][60144] Updated weights for policy 1, policy_version 72282 (0.0009) +[2023-10-09 07:06:43,488][60143] Updated weights for policy 0, policy_version 71472 (0.0009) +[2023-10-09 07:06:43,860][60143] Updated weights for policy 0, policy_version 71482 (0.0008) +[2023-10-09 07:06:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 147226624. Throughput: 0: 1700.3, 1: 1729.6. Samples: 36817824. Policy #0 lag: (min: 0.0, avg: 17.3, max: 32.0) +[2023-10-09 07:06:46,053][59242] Avg episode reward: [(0, '33.320'), (1, '35.340')] +[2023-10-09 07:06:46,961][60144] Updated weights for policy 1, policy_version 72292 (0.0008) +[2023-10-09 07:06:47,324][60144] Updated weights for policy 1, policy_version 72302 (0.0008) +[2023-10-09 07:06:47,690][60144] Updated weights for policy 1, policy_version 72312 (0.0007) +[2023-10-09 07:06:47,948][60143] Updated weights for policy 0, policy_version 71492 (0.0008) +[2023-10-09 07:06:48,328][60143] Updated weights for policy 0, policy_version 71502 (0.0009) +[2023-10-09 07:06:48,702][60143] Updated weights for policy 0, policy_version 71512 (0.0008) +[2023-10-09 07:06:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 147292160. Throughput: 0: 1689.8, 1: 1700.7. Samples: 36827684. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:06:51,053][59242] Avg episode reward: [(0, '31.730'), (1, '35.500')] +[2023-10-09 07:06:51,627][60144] Updated weights for policy 1, policy_version 72322 (0.0008) +[2023-10-09 07:06:51,991][60144] Updated weights for policy 1, policy_version 72332 (0.0008) +[2023-10-09 07:06:52,367][60144] Updated weights for policy 1, policy_version 72342 (0.0008) +[2023-10-09 07:06:52,655][60143] Updated weights for policy 0, policy_version 71522 (0.0008) +[2023-10-09 07:06:52,740][60144] Updated weights for policy 1, policy_version 72352 (0.0007) +[2023-10-09 07:06:53,020][60143] Updated weights for policy 0, policy_version 71532 (0.0008) +[2023-10-09 07:06:53,398][60143] Updated weights for policy 0, policy_version 71542 (0.0009) +[2023-10-09 07:06:53,762][60143] Updated weights for policy 0, policy_version 71552 (0.0008) +[2023-10-09 07:06:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 147357696. Throughput: 0: 1684.0, 1: 1722.4. Samples: 36848406. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:06:56,052][59242] Avg episode reward: [(0, '30.760'), (1, '35.330')] +[2023-10-09 07:06:56,809][60144] Updated weights for policy 1, policy_version 72362 (0.0008) +[2023-10-09 07:06:57,184][60144] Updated weights for policy 1, policy_version 72372 (0.0007) +[2023-10-09 07:06:57,558][60144] Updated weights for policy 1, policy_version 72382 (0.0007) +[2023-10-09 07:06:57,675][60143] Updated weights for policy 0, policy_version 71562 (0.0007) +[2023-10-09 07:06:58,050][60143] Updated weights for policy 0, policy_version 71572 (0.0008) +[2023-10-09 07:06:58,419][60143] Updated weights for policy 0, policy_version 71582 (0.0010) +[2023-10-09 07:07:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 147423232. Throughput: 0: 1711.9, 1: 1728.3. Samples: 36869768. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:01,053][59242] Avg episode reward: [(0, '32.590'), (1, '33.830')] +[2023-10-09 07:07:01,598][60144] Updated weights for policy 1, policy_version 72392 (0.0010) +[2023-10-09 07:07:01,968][60144] Updated weights for policy 1, policy_version 72402 (0.0010) +[2023-10-09 07:07:02,347][60144] Updated weights for policy 1, policy_version 72412 (0.0008) +[2023-10-09 07:07:02,398][60143] Updated weights for policy 0, policy_version 71592 (0.0007) +[2023-10-09 07:07:02,760][60143] Updated weights for policy 0, policy_version 71602 (0.0010) +[2023-10-09 07:07:03,121][60143] Updated weights for policy 0, policy_version 71612 (0.0010) +[2023-10-09 07:07:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 147488768. Throughput: 0: 1677.3, 1: 1707.2. Samples: 36878932. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:06,053][59242] Avg episode reward: [(0, '33.660'), (1, '33.720')] +[2023-10-09 07:07:06,375][60144] Updated weights for policy 1, policy_version 72422 (0.0009) +[2023-10-09 07:07:06,743][60144] Updated weights for policy 1, policy_version 72432 (0.0009) +[2023-10-09 07:07:07,108][60144] Updated weights for policy 1, policy_version 72442 (0.0009) +[2023-10-09 07:07:07,160][60143] Updated weights for policy 0, policy_version 71622 (0.0009) +[2023-10-09 07:07:07,536][60143] Updated weights for policy 0, policy_version 71632 (0.0008) +[2023-10-09 07:07:07,906][60143] Updated weights for policy 0, policy_version 71642 (0.0010) +[2023-10-09 07:07:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 147554304. Throughput: 0: 1705.0, 1: 1727.9. Samples: 36899832. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:11,053][59242] Avg episode reward: [(0, '33.350'), (1, '34.070')] +[2023-10-09 07:07:11,153][60144] Updated weights for policy 1, policy_version 72452 (0.0009) +[2023-10-09 07:07:11,519][60144] Updated weights for policy 1, policy_version 72462 (0.0009) +[2023-10-09 07:07:11,897][60144] Updated weights for policy 1, policy_version 72472 (0.0008) +[2023-10-09 07:07:11,996][60143] Updated weights for policy 0, policy_version 71652 (0.0009) +[2023-10-09 07:07:12,376][60143] Updated weights for policy 0, policy_version 71662 (0.0009) +[2023-10-09 07:07:12,749][60143] Updated weights for policy 0, policy_version 71672 (0.0010) +[2023-10-09 07:07:15,737][60144] Updated weights for policy 1, policy_version 72482 (0.0007) +[2023-10-09 07:07:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 147619840. Throughput: 0: 1707.6, 1: 1729.1. Samples: 36920978. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:16,053][59242] Avg episode reward: [(0, '32.210'), (1, '34.230')] +[2023-10-09 07:07:16,108][60144] Updated weights for policy 1, policy_version 72492 (0.0007) +[2023-10-09 07:07:16,470][60144] Updated weights for policy 1, policy_version 72502 (0.0007) +[2023-10-09 07:07:16,695][60143] Updated weights for policy 0, policy_version 71682 (0.0009) +[2023-10-09 07:07:16,832][60144] Updated weights for policy 1, policy_version 72512 (0.0009) +[2023-10-09 07:07:17,060][60143] Updated weights for policy 0, policy_version 71692 (0.0007) +[2023-10-09 07:07:17,433][60143] Updated weights for policy 0, policy_version 71702 (0.0009) +[2023-10-09 07:07:17,799][60143] Updated weights for policy 0, policy_version 71712 (0.0007) +[2023-10-09 07:07:20,705][60144] Updated weights for policy 1, policy_version 72522 (0.0008) +[2023-10-09 07:07:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 147685376. 
Throughput: 0: 1686.0, 1: 1721.1. Samples: 36930266. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:21,053][59242] Avg episode reward: [(0, '32.760'), (1, '35.200')] +[2023-10-09 07:07:21,061][60144] Updated weights for policy 1, policy_version 72532 (0.0008) +[2023-10-09 07:07:21,427][60144] Updated weights for policy 1, policy_version 72542 (0.0007) +[2023-10-09 07:07:21,964][60143] Updated weights for policy 0, policy_version 71722 (0.0010) +[2023-10-09 07:07:22,338][60143] Updated weights for policy 0, policy_version 71732 (0.0008) +[2023-10-09 07:07:22,706][60143] Updated weights for policy 0, policy_version 71742 (0.0008) +[2023-10-09 07:07:25,359][60144] Updated weights for policy 1, policy_version 72552 (0.0007) +[2023-10-09 07:07:25,720][60144] Updated weights for policy 1, policy_version 72562 (0.0010) +[2023-10-09 07:07:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 147750912. Throughput: 0: 1707.3, 1: 1735.3. Samples: 36951612. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:26,053][59242] Avg episode reward: [(0, '32.630'), (1, '34.620')] +[2023-10-09 07:07:26,092][60144] Updated weights for policy 1, policy_version 72572 (0.0008) +[2023-10-09 07:07:26,659][60143] Updated weights for policy 0, policy_version 71752 (0.0009) +[2023-10-09 07:07:27,034][60143] Updated weights for policy 0, policy_version 71762 (0.0009) +[2023-10-09 07:07:27,393][60143] Updated weights for policy 0, policy_version 71772 (0.0009) +[2023-10-09 07:07:30,038][60144] Updated weights for policy 1, policy_version 72582 (0.0009) +[2023-10-09 07:07:30,413][60144] Updated weights for policy 1, policy_version 72592 (0.0009) +[2023-10-09 07:07:30,781][60144] Updated weights for policy 1, policy_version 72602 (0.0010) +[2023-10-09 07:07:31,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 147849216. Throughput: 0: 1714.2, 1: 1716.5. Samples: 36972204. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:31,053][59242] Avg episode reward: [(0, '32.810'), (1, '35.640')] +[2023-10-09 07:07:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000072608_74350592.pth... +[2023-10-09 07:07:31,099][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000070976_72679424.pth +[2023-10-09 07:07:31,105][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000072608_74350592.pth +[2023-10-09 07:07:31,327][60143] Updated weights for policy 0, policy_version 71782 (0.0009) +[2023-10-09 07:07:31,692][60143] Updated weights for policy 0, policy_version 71792 (0.0007) +[2023-10-09 07:07:32,065][60143] Updated weights for policy 0, policy_version 71802 (0.0007) +[2023-10-09 07:07:32,277][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000071808_73531392.pth... +[2023-10-09 07:07:32,306][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000070208_71892992.pth +[2023-10-09 07:07:32,310][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000071808_73531392.pth +[2023-10-09 07:07:34,700][60144] Updated weights for policy 1, policy_version 72612 (0.0010) +[2023-10-09 07:07:35,069][60144] Updated weights for policy 1, policy_version 72622 (0.0009) +[2023-10-09 07:07:35,441][60144] Updated weights for policy 1, policy_version 72632 (0.0009) +[2023-10-09 07:07:35,893][60143] Updated weights for policy 0, policy_version 71812 (0.0007) +[2023-10-09 07:07:36,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 147914752. Throughput: 0: 1702.9, 1: 1735.8. Samples: 36982426. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:07:36,052][59242] Avg episode reward: [(0, '32.530'), (1, '37.390')] +[2023-10-09 07:07:36,261][60143] Updated weights for policy 0, policy_version 71822 (0.0008) +[2023-10-09 07:07:36,621][60143] Updated weights for policy 0, policy_version 71832 (0.0007) +[2023-10-09 07:07:39,345][60144] Updated weights for policy 1, policy_version 72642 (0.0009) +[2023-10-09 07:07:39,712][60144] Updated weights for policy 1, policy_version 72652 (0.0010) +[2023-10-09 07:07:40,073][60144] Updated weights for policy 1, policy_version 72662 (0.0008) +[2023-10-09 07:07:40,443][60144] Updated weights for policy 1, policy_version 72672 (0.0009) +[2023-10-09 07:07:40,558][60143] Updated weights for policy 0, policy_version 71842 (0.0008) +[2023-10-09 07:07:40,928][60143] Updated weights for policy 0, policy_version 71852 (0.0009) +[2023-10-09 07:07:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 147980288. Throughput: 0: 1712.8, 1: 1729.7. Samples: 37003318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:07:41,052][59242] Avg episode reward: [(0, '31.990'), (1, '34.330')] +[2023-10-09 07:07:41,308][60143] Updated weights for policy 0, policy_version 71862 (0.0009) +[2023-10-09 07:07:41,679][60143] Updated weights for policy 0, policy_version 71872 (0.0009) +[2023-10-09 07:07:44,352][60144] Updated weights for policy 1, policy_version 72682 (0.0011) +[2023-10-09 07:07:44,720][60144] Updated weights for policy 1, policy_version 72692 (0.0009) +[2023-10-09 07:07:45,101][60144] Updated weights for policy 1, policy_version 72702 (0.0009) +[2023-10-09 07:07:45,670][60143] Updated weights for policy 0, policy_version 71882 (0.0008) +[2023-10-09 07:07:46,024][60143] Updated weights for policy 0, policy_version 71892 (0.0010) +[2023-10-09 07:07:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148045824. 
Throughput: 0: 1703.0, 1: 1708.2. Samples: 37023274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:07:46,053][59242] Avg episode reward: [(0, '33.090'), (1, '33.190')] +[2023-10-09 07:07:46,391][60143] Updated weights for policy 0, policy_version 71902 (0.0009) +[2023-10-09 07:07:49,012][60144] Updated weights for policy 1, policy_version 72712 (0.0008) +[2023-10-09 07:07:49,391][60144] Updated weights for policy 1, policy_version 72722 (0.0009) +[2023-10-09 07:07:49,755][60144] Updated weights for policy 1, policy_version 72732 (0.0008) +[2023-10-09 07:07:50,613][60143] Updated weights for policy 0, policy_version 71912 (0.0010) +[2023-10-09 07:07:50,985][60143] Updated weights for policy 0, policy_version 71922 (0.0008) +[2023-10-09 07:07:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148111360. Throughput: 0: 1707.1, 1: 1741.6. Samples: 37034124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:07:51,053][59242] Avg episode reward: [(0, '34.410'), (1, '34.140')] +[2023-10-09 07:07:51,371][60143] Updated weights for policy 0, policy_version 71932 (0.0009) +[2023-10-09 07:07:53,660][60144] Updated weights for policy 1, policy_version 72742 (0.0007) +[2023-10-09 07:07:54,037][60144] Updated weights for policy 1, policy_version 72752 (0.0008) +[2023-10-09 07:07:54,405][60144] Updated weights for policy 1, policy_version 72762 (0.0008) +[2023-10-09 07:07:55,380][60143] Updated weights for policy 0, policy_version 71942 (0.0010) +[2023-10-09 07:07:55,748][60143] Updated weights for policy 0, policy_version 71952 (0.0008) +[2023-10-09 07:07:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148176896. Throughput: 0: 1712.7, 1: 1716.9. Samples: 37054164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:07:56,053][59242] Avg episode reward: [(0, '34.490'), (1, '33.870')] +[2023-10-09 07:07:56,116][60143] Updated weights for policy 0, policy_version 71962 (0.0010) +[2023-10-09 07:07:58,401][60144] Updated weights for policy 1, policy_version 72772 (0.0009) +[2023-10-09 07:07:58,772][60144] Updated weights for policy 1, policy_version 72782 (0.0007) +[2023-10-09 07:07:59,135][60144] Updated weights for policy 1, policy_version 72792 (0.0009) +[2023-10-09 07:08:00,194][60143] Updated weights for policy 0, policy_version 71972 (0.0010) +[2023-10-09 07:08:00,582][60143] Updated weights for policy 0, policy_version 71982 (0.0009) +[2023-10-09 07:08:00,949][60143] Updated weights for policy 0, policy_version 71992 (0.0010) +[2023-10-09 07:08:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148242432. Throughput: 0: 1706.1, 1: 1716.2. Samples: 37074982. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:08:01,053][59242] Avg episode reward: [(0, '32.940'), (1, '34.160')] +[2023-10-09 07:08:02,915][60144] Updated weights for policy 1, policy_version 72802 (0.0007) +[2023-10-09 07:08:03,281][60144] Updated weights for policy 1, policy_version 72812 (0.0007) +[2023-10-09 07:08:03,653][60144] Updated weights for policy 1, policy_version 72822 (0.0007) +[2023-10-09 07:08:04,019][60144] Updated weights for policy 1, policy_version 72832 (0.0007) +[2023-10-09 07:08:04,913][60143] Updated weights for policy 0, policy_version 72002 (0.0008) +[2023-10-09 07:08:05,286][60143] Updated weights for policy 0, policy_version 72012 (0.0007) +[2023-10-09 07:08:05,650][60143] Updated weights for policy 0, policy_version 72022 (0.0011) +[2023-10-09 07:08:06,024][60143] Updated weights for policy 0, policy_version 72032 (0.0010) +[2023-10-09 07:08:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 148340736. 
Throughput: 0: 1716.6, 1: 1726.0. Samples: 37085184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:08:06,053][59242] Avg episode reward: [(0, '33.340'), (1, '34.100')] +[2023-10-09 07:08:08,006][60144] Updated weights for policy 1, policy_version 72842 (0.0009) +[2023-10-09 07:08:08,367][60144] Updated weights for policy 1, policy_version 72852 (0.0008) +[2023-10-09 07:08:08,746][60144] Updated weights for policy 1, policy_version 72862 (0.0009) +[2023-10-09 07:08:09,931][60143] Updated weights for policy 0, policy_version 72042 (0.0008) +[2023-10-09 07:08:10,309][60143] Updated weights for policy 0, policy_version 72052 (0.0008) +[2023-10-09 07:08:10,694][60143] Updated weights for policy 0, policy_version 72062 (0.0010) +[2023-10-09 07:08:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 148406272. Throughput: 0: 1719.1, 1: 1716.1. Samples: 37106194. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:08:11,053][59242] Avg episode reward: [(0, '31.670'), (1, '32.780')] +[2023-10-09 07:08:12,684][60144] Updated weights for policy 1, policy_version 72872 (0.0007) +[2023-10-09 07:08:13,049][60144] Updated weights for policy 1, policy_version 72882 (0.0008) +[2023-10-09 07:08:13,426][60144] Updated weights for policy 1, policy_version 72892 (0.0008) +[2023-10-09 07:08:14,576][60143] Updated weights for policy 0, policy_version 72072 (0.0009) +[2023-10-09 07:08:14,951][60143] Updated weights for policy 0, policy_version 72082 (0.0010) +[2023-10-09 07:08:15,321][60143] Updated weights for policy 0, policy_version 72092 (0.0008) +[2023-10-09 07:08:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 148471808. Throughput: 0: 1690.1, 1: 1727.9. Samples: 37126012. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:08:16,052][59242] Avg episode reward: [(0, '31.140'), (1, '32.330')] +[2023-10-09 07:08:17,612][60144] Updated weights for policy 1, policy_version 72902 (0.0008) +[2023-10-09 07:08:17,980][60144] Updated weights for policy 1, policy_version 72912 (0.0007) +[2023-10-09 07:08:18,342][60144] Updated weights for policy 1, policy_version 72922 (0.0007) +[2023-10-09 07:08:19,196][60143] Updated weights for policy 0, policy_version 72102 (0.0008) +[2023-10-09 07:08:19,569][60143] Updated weights for policy 0, policy_version 72112 (0.0008) +[2023-10-09 07:08:19,938][60143] Updated weights for policy 0, policy_version 72122 (0.0007) +[2023-10-09 07:08:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 148537344. Throughput: 0: 1720.1, 1: 1708.4. Samples: 37136710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-09 07:08:21,053][59242] Avg episode reward: [(0, '31.220'), (1, '31.410')] +[2023-10-09 07:08:22,249][60144] Updated weights for policy 1, policy_version 72932 (0.0007) +[2023-10-09 07:08:22,613][60144] Updated weights for policy 1, policy_version 72942 (0.0008) +[2023-10-09 07:08:22,981][60144] Updated weights for policy 1, policy_version 72952 (0.0008) +[2023-10-09 07:08:23,985][60143] Updated weights for policy 0, policy_version 72132 (0.0009) +[2023-10-09 07:08:24,350][60143] Updated weights for policy 0, policy_version 72142 (0.0009) +[2023-10-09 07:08:24,724][60143] Updated weights for policy 0, policy_version 72152 (0.0009) +[2023-10-09 07:08:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 148602880. Throughput: 0: 1707.1, 1: 1710.9. Samples: 37157128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:26,052][59242] Avg episode reward: [(0, '32.060'), (1, '31.290')] +[2023-10-09 07:08:26,857][60144] Updated weights for policy 1, policy_version 72962 (0.0008) +[2023-10-09 07:08:27,215][60144] Updated weights for policy 1, policy_version 72972 (0.0008) +[2023-10-09 07:08:27,586][60144] Updated weights for policy 1, policy_version 72982 (0.0007) +[2023-10-09 07:08:27,953][60144] Updated weights for policy 1, policy_version 72992 (0.0008) +[2023-10-09 07:08:28,768][60143] Updated weights for policy 0, policy_version 72162 (0.0007) +[2023-10-09 07:08:29,136][60143] Updated weights for policy 0, policy_version 72172 (0.0008) +[2023-10-09 07:08:29,503][60143] Updated weights for policy 0, policy_version 72182 (0.0007) +[2023-10-09 07:08:29,871][60143] Updated weights for policy 0, policy_version 72192 (0.0007) +[2023-10-09 07:08:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 148668416. Throughput: 0: 1695.6, 1: 1732.2. Samples: 37177524. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:31,053][59242] Avg episode reward: [(0, '31.870'), (1, '32.110')] +[2023-10-09 07:08:31,943][60144] Updated weights for policy 1, policy_version 73002 (0.0008) +[2023-10-09 07:08:32,316][60144] Updated weights for policy 1, policy_version 73012 (0.0008) +[2023-10-09 07:08:32,686][60144] Updated weights for policy 1, policy_version 73022 (0.0007) +[2023-10-09 07:08:33,843][60143] Updated weights for policy 0, policy_version 72202 (0.0009) +[2023-10-09 07:08:34,206][60143] Updated weights for policy 0, policy_version 72212 (0.0009) +[2023-10-09 07:08:34,581][60143] Updated weights for policy 0, policy_version 72222 (0.0007) +[2023-10-09 07:08:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 148733952. Throughput: 0: 1722.5, 1: 1699.9. Samples: 37188130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:36,053][59242] Avg episode reward: [(0, '31.330'), (1, '31.120')] +[2023-10-09 07:08:36,737][60144] Updated weights for policy 1, policy_version 73032 (0.0008) +[2023-10-09 07:08:37,110][60144] Updated weights for policy 1, policy_version 73042 (0.0007) +[2023-10-09 07:08:37,476][60144] Updated weights for policy 1, policy_version 73052 (0.0008) +[2023-10-09 07:08:38,572][60143] Updated weights for policy 0, policy_version 72232 (0.0011) +[2023-10-09 07:08:38,939][60143] Updated weights for policy 0, policy_version 72242 (0.0011) +[2023-10-09 07:08:39,302][60143] Updated weights for policy 0, policy_version 72252 (0.0010) +[2023-10-09 07:08:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 148799488. Throughput: 0: 1690.2, 1: 1730.1. Samples: 37208076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:41,052][59242] Avg episode reward: [(0, '32.340'), (1, '31.340')] +[2023-10-09 07:08:41,515][60144] Updated weights for policy 1, policy_version 73062 (0.0007) +[2023-10-09 07:08:41,905][60144] Updated weights for policy 1, policy_version 73072 (0.0008) +[2023-10-09 07:08:42,273][60144] Updated weights for policy 1, policy_version 73082 (0.0008) +[2023-10-09 07:08:43,268][60143] Updated weights for policy 0, policy_version 72262 (0.0008) +[2023-10-09 07:08:43,632][60143] Updated weights for policy 0, policy_version 72272 (0.0008) +[2023-10-09 07:08:44,017][60143] Updated weights for policy 0, policy_version 72282 (0.0009) +[2023-10-09 07:08:46,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 148865024. Throughput: 0: 1701.5, 1: 1727.8. Samples: 37229300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:46,052][59242] Avg episode reward: [(0, '32.830'), (1, '32.050')] +[2023-10-09 07:08:46,157][60144] Updated weights for policy 1, policy_version 73092 (0.0008) +[2023-10-09 07:08:46,527][60144] Updated weights for policy 1, policy_version 73102 (0.0008) +[2023-10-09 07:08:46,895][60144] Updated weights for policy 1, policy_version 73112 (0.0010) +[2023-10-09 07:08:47,988][60143] Updated weights for policy 0, policy_version 72292 (0.0010) +[2023-10-09 07:08:48,366][60143] Updated weights for policy 0, policy_version 72302 (0.0007) +[2023-10-09 07:08:48,743][60143] Updated weights for policy 0, policy_version 72312 (0.0008) +[2023-10-09 07:08:50,928][60144] Updated weights for policy 1, policy_version 73122 (0.0008) +[2023-10-09 07:08:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148930560. Throughput: 0: 1704.8, 1: 1716.8. Samples: 37239156. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:51,053][59242] Avg episode reward: [(0, '32.660'), (1, '32.110')] +[2023-10-09 07:08:51,292][60144] Updated weights for policy 1, policy_version 73132 (0.0008) +[2023-10-09 07:08:51,657][60144] Updated weights for policy 1, policy_version 73142 (0.0008) +[2023-10-09 07:08:52,021][60144] Updated weights for policy 1, policy_version 73152 (0.0009) +[2023-10-09 07:08:52,537][60143] Updated weights for policy 0, policy_version 72322 (0.0009) +[2023-10-09 07:08:52,909][60143] Updated weights for policy 0, policy_version 72332 (0.0008) +[2023-10-09 07:08:53,289][60143] Updated weights for policy 0, policy_version 72342 (0.0008) +[2023-10-09 07:08:53,663][60143] Updated weights for policy 0, policy_version 72352 (0.0009) +[2023-10-09 07:08:55,901][60144] Updated weights for policy 1, policy_version 73162 (0.0008) +[2023-10-09 07:08:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 148996096. 
Throughput: 0: 1685.1, 1: 1726.8. Samples: 37259730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:08:56,053][59242] Avg episode reward: [(0, '33.470'), (1, '33.130')] +[2023-10-09 07:08:56,270][60144] Updated weights for policy 1, policy_version 73172 (0.0007) +[2023-10-09 07:08:56,630][60144] Updated weights for policy 1, policy_version 73182 (0.0007) +[2023-10-09 07:08:57,695][60143] Updated weights for policy 0, policy_version 72362 (0.0007) +[2023-10-09 07:08:58,072][60143] Updated weights for policy 0, policy_version 72372 (0.0009) +[2023-10-09 07:08:58,439][60143] Updated weights for policy 0, policy_version 72382 (0.0008) +[2023-10-09 07:09:00,372][60144] Updated weights for policy 1, policy_version 73192 (0.0007) +[2023-10-09 07:09:00,738][60144] Updated weights for policy 1, policy_version 73202 (0.0008) +[2023-10-09 07:09:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 149061632. Throughput: 0: 1715.5, 1: 1722.5. Samples: 37280722. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:09:01,053][59242] Avg episode reward: [(0, '34.240'), (1, '32.630')] +[2023-10-09 07:09:01,112][60144] Updated weights for policy 1, policy_version 73212 (0.0008) +[2023-10-09 07:09:02,332][60143] Updated weights for policy 0, policy_version 72392 (0.0008) +[2023-10-09 07:09:02,700][60143] Updated weights for policy 0, policy_version 72402 (0.0009) +[2023-10-09 07:09:03,087][60143] Updated weights for policy 0, policy_version 72412 (0.0010) +[2023-10-09 07:09:05,137][60144] Updated weights for policy 1, policy_version 73222 (0.0008) +[2023-10-09 07:09:05,507][60144] Updated weights for policy 1, policy_version 73232 (0.0007) +[2023-10-09 07:09:05,875][60144] Updated weights for policy 1, policy_version 73242 (0.0009) +[2023-10-09 07:09:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 149127168. Throughput: 0: 1684.4, 1: 1738.0. Samples: 37290714. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:09:06,053][59242] Avg episode reward: [(0, '34.670'), (1, '31.090')] +[2023-10-09 07:09:06,933][60143] Updated weights for policy 0, policy_version 72422 (0.0009) +[2023-10-09 07:09:07,287][60143] Updated weights for policy 0, policy_version 72432 (0.0008) +[2023-10-09 07:09:07,652][60143] Updated weights for policy 0, policy_version 72442 (0.0007) +[2023-10-09 07:09:09,788][60144] Updated weights for policy 1, policy_version 73252 (0.0010) +[2023-10-09 07:09:10,163][60144] Updated weights for policy 1, policy_version 73262 (0.0008) +[2023-10-09 07:09:10,522][60144] Updated weights for policy 1, policy_version 73272 (0.0009) +[2023-10-09 07:09:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 149225472. Throughput: 0: 1698.8, 1: 1742.5. Samples: 37311986. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:09:11,052][59242] Avg episode reward: [(0, '34.370'), (1, '31.990')] +[2023-10-09 07:09:11,717][60143] Updated weights for policy 0, policy_version 72452 (0.0009) +[2023-10-09 07:09:12,083][60143] Updated weights for policy 0, policy_version 72462 (0.0012) +[2023-10-09 07:09:12,452][60143] Updated weights for policy 0, policy_version 72472 (0.0008) +[2023-10-09 07:09:14,326][60144] Updated weights for policy 1, policy_version 73282 (0.0009) +[2023-10-09 07:09:14,692][60144] Updated weights for policy 1, policy_version 73292 (0.0009) +[2023-10-09 07:09:15,055][60144] Updated weights for policy 1, policy_version 73302 (0.0010) +[2023-10-09 07:09:15,423][60144] Updated weights for policy 1, policy_version 73312 (0.0008) +[2023-10-09 07:09:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149291008. Throughput: 0: 1718.2, 1: 1717.9. Samples: 37332150. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:16,053][59242] Avg episode reward: [(0, '33.650'), (1, '32.540')] +[2023-10-09 07:09:16,511][60143] Updated weights for policy 0, policy_version 72482 (0.0008) +[2023-10-09 07:09:16,885][60143] Updated weights for policy 0, policy_version 72492 (0.0008) +[2023-10-09 07:09:17,258][60143] Updated weights for policy 0, policy_version 72502 (0.0008) +[2023-10-09 07:09:17,629][60143] Updated weights for policy 0, policy_version 72512 (0.0009) +[2023-10-09 07:09:19,363][60144] Updated weights for policy 1, policy_version 73322 (0.0008) +[2023-10-09 07:09:19,731][60144] Updated weights for policy 1, policy_version 73332 (0.0009) +[2023-10-09 07:09:20,089][60144] Updated weights for policy 1, policy_version 73342 (0.0011) +[2023-10-09 07:09:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 149356544. Throughput: 0: 1688.8, 1: 1751.4. Samples: 37342940. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:21,052][59242] Avg episode reward: [(0, '34.020'), (1, '35.350')] +[2023-10-09 07:09:21,695][60143] Updated weights for policy 0, policy_version 72522 (0.0009) +[2023-10-09 07:09:22,066][60143] Updated weights for policy 0, policy_version 72532 (0.0007) +[2023-10-09 07:09:22,436][60143] Updated weights for policy 0, policy_version 72542 (0.0007) +[2023-10-09 07:09:24,184][60144] Updated weights for policy 1, policy_version 73352 (0.0009) +[2023-10-09 07:09:24,557][60144] Updated weights for policy 1, policy_version 73362 (0.0007) +[2023-10-09 07:09:24,925][60144] Updated weights for policy 1, policy_version 73372 (0.0007) +[2023-10-09 07:09:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149422080. Throughput: 0: 1720.6, 1: 1729.0. Samples: 37363306. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:26,053][59242] Avg episode reward: [(0, '33.710'), (1, '35.540')] +[2023-10-09 07:09:26,425][60143] Updated weights for policy 0, policy_version 72552 (0.0008) +[2023-10-09 07:09:26,790][60143] Updated weights for policy 0, policy_version 72562 (0.0010) +[2023-10-09 07:09:27,165][60143] Updated weights for policy 0, policy_version 72572 (0.0011) +[2023-10-09 07:09:28,969][60144] Updated weights for policy 1, policy_version 73382 (0.0008) +[2023-10-09 07:09:29,363][60144] Updated weights for policy 1, policy_version 73392 (0.0007) +[2023-10-09 07:09:29,737][60144] Updated weights for policy 1, policy_version 73402 (0.0008) +[2023-10-09 07:09:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149487616. Throughput: 0: 1719.8, 1: 1712.7. Samples: 37383762. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:31,053][59242] Avg episode reward: [(0, '35.860'), (1, '35.610')] +[2023-10-09 07:09:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000073408_75169792.pth... +[2023-10-09 07:09:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000071808_73531392.pth +[2023-10-09 07:09:31,315][60143] Updated weights for policy 0, policy_version 72582 (0.0009) +[2023-10-09 07:09:31,678][60143] Updated weights for policy 0, policy_version 72592 (0.0008) +[2023-10-09 07:09:32,048][60143] Updated weights for policy 0, policy_version 72602 (0.0010) +[2023-10-09 07:09:32,257][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000072608_74350592.pth... 
+[2023-10-09 07:09:32,293][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000071008_72712192.pth +[2023-10-09 07:09:33,641][60144] Updated weights for policy 1, policy_version 73412 (0.0007) +[2023-10-09 07:09:34,016][60144] Updated weights for policy 1, policy_version 73422 (0.0008) +[2023-10-09 07:09:34,381][60144] Updated weights for policy 1, policy_version 73432 (0.0009) +[2023-10-09 07:09:35,960][60143] Updated weights for policy 0, policy_version 72612 (0.0009) +[2023-10-09 07:09:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 149553152. Throughput: 0: 1704.1, 1: 1741.8. Samples: 37394222. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:36,053][59242] Avg episode reward: [(0, '34.400'), (1, '34.720')] +[2023-10-09 07:09:36,336][60143] Updated weights for policy 0, policy_version 72622 (0.0008) +[2023-10-09 07:09:36,707][60143] Updated weights for policy 0, policy_version 72632 (0.0008) +[2023-10-09 07:09:38,123][60144] Updated weights for policy 1, policy_version 73442 (0.0008) +[2023-10-09 07:09:38,498][60144] Updated weights for policy 1, policy_version 73452 (0.0009) +[2023-10-09 07:09:38,875][60144] Updated weights for policy 1, policy_version 73462 (0.0008) +[2023-10-09 07:09:39,243][60144] Updated weights for policy 1, policy_version 73472 (0.0008) +[2023-10-09 07:09:40,802][60143] Updated weights for policy 0, policy_version 72642 (0.0008) +[2023-10-09 07:09:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149618688. Throughput: 0: 1719.6, 1: 1717.4. Samples: 37414392. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:41,053][59242] Avg episode reward: [(0, '33.490'), (1, '35.570')] +[2023-10-09 07:09:41,178][60143] Updated weights for policy 0, policy_version 72652 (0.0009) +[2023-10-09 07:09:41,553][60143] Updated weights for policy 0, policy_version 72662 (0.0007) +[2023-10-09 07:09:41,915][60143] Updated weights for policy 0, policy_version 72672 (0.0009) +[2023-10-09 07:09:43,198][60144] Updated weights for policy 1, policy_version 73482 (0.0009) +[2023-10-09 07:09:43,564][60144] Updated weights for policy 1, policy_version 73492 (0.0008) +[2023-10-09 07:09:43,931][60144] Updated weights for policy 1, policy_version 73502 (0.0009) +[2023-10-09 07:09:45,672][60143] Updated weights for policy 0, policy_version 72682 (0.0009) +[2023-10-09 07:09:46,028][60143] Updated weights for policy 0, policy_version 72692 (0.0009) +[2023-10-09 07:09:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149684224. Throughput: 0: 1718.9, 1: 1726.2. Samples: 37435752. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:46,053][59242] Avg episode reward: [(0, '33.840'), (1, '34.850')] +[2023-10-09 07:09:46,396][60143] Updated weights for policy 0, policy_version 72702 (0.0011) +[2023-10-09 07:09:47,829][60144] Updated weights for policy 1, policy_version 73512 (0.0010) +[2023-10-09 07:09:48,195][60144] Updated weights for policy 1, policy_version 73522 (0.0010) +[2023-10-09 07:09:48,557][60144] Updated weights for policy 1, policy_version 73532 (0.0011) +[2023-10-09 07:09:50,377][60143] Updated weights for policy 0, policy_version 72712 (0.0008) +[2023-10-09 07:09:50,750][60143] Updated weights for policy 0, policy_version 72722 (0.0008) +[2023-10-09 07:09:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149749760. Throughput: 0: 1722.2, 1: 1716.0. Samples: 37445436. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:51,053][59242] Avg episode reward: [(0, '36.120'), (1, '35.730')] +[2023-10-09 07:09:51,112][60143] Updated weights for policy 0, policy_version 72732 (0.0007) +[2023-10-09 07:09:52,559][60144] Updated weights for policy 1, policy_version 73542 (0.0008) +[2023-10-09 07:09:52,928][60144] Updated weights for policy 1, policy_version 73552 (0.0008) +[2023-10-09 07:09:53,296][60144] Updated weights for policy 1, policy_version 73562 (0.0007) +[2023-10-09 07:09:55,146][60143] Updated weights for policy 0, policy_version 72742 (0.0009) +[2023-10-09 07:09:55,519][60143] Updated weights for policy 0, policy_version 72752 (0.0009) +[2023-10-09 07:09:55,892][60143] Updated weights for policy 0, policy_version 72762 (0.0008) +[2023-10-09 07:09:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 149815296. Throughput: 0: 1731.1, 1: 1712.6. Samples: 37466952. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-09 07:09:56,053][59242] Avg episode reward: [(0, '37.240'), (1, '35.870')] +[2023-10-09 07:09:57,347][60144] Updated weights for policy 1, policy_version 73572 (0.0009) +[2023-10-09 07:09:57,711][60144] Updated weights for policy 1, policy_version 73582 (0.0011) +[2023-10-09 07:09:58,074][60144] Updated weights for policy 1, policy_version 73592 (0.0009) +[2023-10-09 07:09:59,805][60143] Updated weights for policy 0, policy_version 72772 (0.0008) +[2023-10-09 07:10:00,177][60143] Updated weights for policy 0, policy_version 72782 (0.0008) +[2023-10-09 07:10:00,555][60143] Updated weights for policy 0, policy_version 72792 (0.0009) +[2023-10-09 07:10:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 149913600. Throughput: 0: 1709.8, 1: 1733.4. Samples: 37487092. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:01,053][59242] Avg episode reward: [(0, '37.600'), (1, '35.850')] +[2023-10-09 07:10:02,092][60144] Updated weights for policy 1, policy_version 73602 (0.0010) +[2023-10-09 07:10:02,461][60144] Updated weights for policy 1, policy_version 73612 (0.0009) +[2023-10-09 07:10:02,828][60144] Updated weights for policy 1, policy_version 73622 (0.0008) +[2023-10-09 07:10:03,187][60144] Updated weights for policy 1, policy_version 73632 (0.0009) +[2023-10-09 07:10:04,453][60143] Updated weights for policy 0, policy_version 72802 (0.0008) +[2023-10-09 07:10:04,835][60143] Updated weights for policy 0, policy_version 72812 (0.0009) +[2023-10-09 07:10:05,198][60143] Updated weights for policy 0, policy_version 72822 (0.0010) +[2023-10-09 07:10:05,573][60143] Updated weights for policy 0, policy_version 72832 (0.0007) +[2023-10-09 07:10:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 149979136. Throughput: 0: 1731.0, 1: 1700.8. Samples: 37497370. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:06,053][59242] Avg episode reward: [(0, '36.900'), (1, '36.940')] +[2023-10-09 07:10:06,938][60144] Updated weights for policy 1, policy_version 73642 (0.0010) +[2023-10-09 07:10:07,302][60144] Updated weights for policy 1, policy_version 73652 (0.0009) +[2023-10-09 07:10:07,675][60144] Updated weights for policy 1, policy_version 73662 (0.0007) +[2023-10-09 07:10:09,613][60143] Updated weights for policy 0, policy_version 72842 (0.0008) +[2023-10-09 07:10:09,967][60143] Updated weights for policy 0, policy_version 72852 (0.0010) +[2023-10-09 07:10:10,337][60143] Updated weights for policy 0, policy_version 72862 (0.0009) +[2023-10-09 07:10:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 150044672. Throughput: 0: 1718.7, 1: 1723.8. Samples: 37518216. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:11,052][59242] Avg episode reward: [(0, '35.030'), (1, '36.150')] +[2023-10-09 07:10:11,700][60144] Updated weights for policy 1, policy_version 73672 (0.0008) +[2023-10-09 07:10:12,073][60144] Updated weights for policy 1, policy_version 73682 (0.0010) +[2023-10-09 07:10:12,436][60144] Updated weights for policy 1, policy_version 73692 (0.0009) +[2023-10-09 07:10:14,061][60143] Updated weights for policy 0, policy_version 72872 (0.0009) +[2023-10-09 07:10:14,433][60143] Updated weights for policy 0, policy_version 72882 (0.0009) +[2023-10-09 07:10:14,806][60143] Updated weights for policy 0, policy_version 72892 (0.0007) +[2023-10-09 07:10:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 150110208. Throughput: 0: 1702.8, 1: 1741.1. Samples: 37538736. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:16,053][59242] Avg episode reward: [(0, '35.720'), (1, '36.480')] +[2023-10-09 07:10:16,563][60144] Updated weights for policy 1, policy_version 73702 (0.0009) +[2023-10-09 07:10:16,949][60144] Updated weights for policy 1, policy_version 73712 (0.0009) +[2023-10-09 07:10:17,313][60144] Updated weights for policy 1, policy_version 73722 (0.0007) +[2023-10-09 07:10:18,738][60143] Updated weights for policy 0, policy_version 72902 (0.0011) +[2023-10-09 07:10:19,111][60143] Updated weights for policy 0, policy_version 72912 (0.0010) +[2023-10-09 07:10:19,486][60143] Updated weights for policy 0, policy_version 72922 (0.0009) +[2023-10-09 07:10:21,021][60144] Updated weights for policy 1, policy_version 73732 (0.0010) +[2023-10-09 07:10:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 150175744. Throughput: 0: 1735.5, 1: 1709.0. Samples: 37549224. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:21,053][59242] Avg episode reward: [(0, '35.270'), (1, '36.720')] +[2023-10-09 07:10:21,395][60144] Updated weights for policy 1, policy_version 73742 (0.0007) +[2023-10-09 07:10:21,763][60144] Updated weights for policy 1, policy_version 73752 (0.0007) +[2023-10-09 07:10:23,382][60143] Updated weights for policy 0, policy_version 72932 (0.0007) +[2023-10-09 07:10:23,767][60143] Updated weights for policy 0, policy_version 72942 (0.0008) +[2023-10-09 07:10:24,146][60143] Updated weights for policy 0, policy_version 72952 (0.0008) +[2023-10-09 07:10:25,570][60144] Updated weights for policy 1, policy_version 73762 (0.0007) +[2023-10-09 07:10:25,932][60144] Updated weights for policy 1, policy_version 73772 (0.0010) +[2023-10-09 07:10:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 150241280. Throughput: 0: 1710.5, 1: 1737.4. Samples: 37569548. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:26,053][59242] Avg episode reward: [(0, '34.790'), (1, '37.300')] +[2023-10-09 07:10:26,309][60144] Updated weights for policy 1, policy_version 73782 (0.0008) +[2023-10-09 07:10:26,665][60144] Updated weights for policy 1, policy_version 73792 (0.0009) +[2023-10-09 07:10:28,140][60143] Updated weights for policy 0, policy_version 72962 (0.0008) +[2023-10-09 07:10:28,516][60143] Updated weights for policy 0, policy_version 72972 (0.0009) +[2023-10-09 07:10:28,874][60143] Updated weights for policy 0, policy_version 72982 (0.0009) +[2023-10-09 07:10:29,248][60143] Updated weights for policy 0, policy_version 72992 (0.0008) +[2023-10-09 07:10:30,619][60144] Updated weights for policy 1, policy_version 73802 (0.0010) +[2023-10-09 07:10:30,987][60144] Updated weights for policy 1, policy_version 73812 (0.0008) +[2023-10-09 07:10:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 150306816. 
Throughput: 0: 1714.6, 1: 1730.3. Samples: 37590772. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:31,053][59242] Avg episode reward: [(0, '35.970'), (1, '37.520')] +[2023-10-09 07:10:31,345][60144] Updated weights for policy 1, policy_version 73822 (0.0007) +[2023-10-09 07:10:33,163][60143] Updated weights for policy 0, policy_version 73002 (0.0008) +[2023-10-09 07:10:33,529][60143] Updated weights for policy 0, policy_version 73012 (0.0008) +[2023-10-09 07:10:33,901][60143] Updated weights for policy 0, policy_version 73022 (0.0010) +[2023-10-09 07:10:35,102][60144] Updated weights for policy 1, policy_version 73832 (0.0010) +[2023-10-09 07:10:35,464][60144] Updated weights for policy 1, policy_version 73842 (0.0010) +[2023-10-09 07:10:35,823][60144] Updated weights for policy 1, policy_version 73852 (0.0010) +[2023-10-09 07:10:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 150405120. Throughput: 0: 1723.1, 1: 1738.5. Samples: 37601208. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:36,053][59242] Avg episode reward: [(0, '34.230'), (1, '36.670')] +[2023-10-09 07:10:37,908][60143] Updated weights for policy 0, policy_version 73032 (0.0010) +[2023-10-09 07:10:38,278][60143] Updated weights for policy 0, policy_version 73042 (0.0008) +[2023-10-09 07:10:38,657][60143] Updated weights for policy 0, policy_version 73052 (0.0007) +[2023-10-09 07:10:39,923][60144] Updated weights for policy 1, policy_version 73862 (0.0007) +[2023-10-09 07:10:40,285][60144] Updated weights for policy 1, policy_version 73872 (0.0008) +[2023-10-09 07:10:40,662][60144] Updated weights for policy 1, policy_version 73882 (0.0008) +[2023-10-09 07:10:41,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 150470656. Throughput: 0: 1702.8, 1: 1740.0. Samples: 37621876. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:41,053][59242] Avg episode reward: [(0, '34.590'), (1, '37.690')] +[2023-10-09 07:10:42,802][60143] Updated weights for policy 0, policy_version 73062 (0.0010) +[2023-10-09 07:10:43,162][60143] Updated weights for policy 0, policy_version 73072 (0.0009) +[2023-10-09 07:10:43,535][60143] Updated weights for policy 0, policy_version 73082 (0.0009) +[2023-10-09 07:10:44,593][60144] Updated weights for policy 1, policy_version 73892 (0.0008) +[2023-10-09 07:10:44,961][60144] Updated weights for policy 1, policy_version 73902 (0.0008) +[2023-10-09 07:10:45,333][60144] Updated weights for policy 1, policy_version 73912 (0.0010) +[2023-10-09 07:10:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 150536192. Throughput: 0: 1722.7, 1: 1719.3. Samples: 37641982. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-09 07:10:46,053][59242] Avg episode reward: [(0, '34.190'), (1, '36.640')] +[2023-10-09 07:10:47,442][60143] Updated weights for policy 0, policy_version 73092 (0.0008) +[2023-10-09 07:10:47,804][60143] Updated weights for policy 0, policy_version 73102 (0.0008) +[2023-10-09 07:10:48,173][60143] Updated weights for policy 0, policy_version 73112 (0.0008) +[2023-10-09 07:10:49,365][60144] Updated weights for policy 1, policy_version 73922 (0.0010) +[2023-10-09 07:10:49,740][60144] Updated weights for policy 1, policy_version 73932 (0.0009) +[2023-10-09 07:10:50,110][60144] Updated weights for policy 1, policy_version 73942 (0.0009) +[2023-10-09 07:10:50,465][60144] Updated weights for policy 1, policy_version 73952 (0.0010) +[2023-10-09 07:10:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 150601728. Throughput: 0: 1701.5, 1: 1746.4. Samples: 37652528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:10:51,053][59242] Avg episode reward: [(0, '34.380'), (1, '35.060')] +[2023-10-09 07:10:52,234][60143] Updated weights for policy 0, policy_version 73122 (0.0009) +[2023-10-09 07:10:52,604][60143] Updated weights for policy 0, policy_version 73132 (0.0011) +[2023-10-09 07:10:52,980][60143] Updated weights for policy 0, policy_version 73142 (0.0008) +[2023-10-09 07:10:53,352][60143] Updated weights for policy 0, policy_version 73152 (0.0008) +[2023-10-09 07:10:54,347][60144] Updated weights for policy 1, policy_version 73962 (0.0008) +[2023-10-09 07:10:54,709][60144] Updated weights for policy 1, policy_version 73972 (0.0007) +[2023-10-09 07:10:55,070][60144] Updated weights for policy 1, policy_version 73982 (0.0009) +[2023-10-09 07:10:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 150667264. Throughput: 0: 1709.5, 1: 1733.3. Samples: 37673140. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:10:56,052][59242] Avg episode reward: [(0, '33.840'), (1, '32.670')] +[2023-10-09 07:10:57,384][60143] Updated weights for policy 0, policy_version 73162 (0.0007) +[2023-10-09 07:10:57,757][60143] Updated weights for policy 0, policy_version 73172 (0.0008) +[2023-10-09 07:10:58,129][60143] Updated weights for policy 0, policy_version 73182 (0.0010) +[2023-10-09 07:10:58,765][60144] Updated weights for policy 1, policy_version 73992 (0.0010) +[2023-10-09 07:10:59,125][60144] Updated weights for policy 1, policy_version 74002 (0.0007) +[2023-10-09 07:10:59,489][60144] Updated weights for policy 1, policy_version 74012 (0.0009) +[2023-10-09 07:11:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 150732800. Throughput: 0: 1725.6, 1: 1722.9. Samples: 37693916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:01,053][59242] Avg episode reward: [(0, '32.330'), (1, '34.720')] +[2023-10-09 07:11:02,067][60143] Updated weights for policy 0, policy_version 73192 (0.0009) +[2023-10-09 07:11:02,450][60143] Updated weights for policy 0, policy_version 73202 (0.0008) +[2023-10-09 07:11:02,812][60143] Updated weights for policy 0, policy_version 73212 (0.0009) +[2023-10-09 07:11:03,458][60144] Updated weights for policy 1, policy_version 74022 (0.0009) +[2023-10-09 07:11:03,845][60144] Updated weights for policy 1, policy_version 74032 (0.0009) +[2023-10-09 07:11:04,203][60144] Updated weights for policy 1, policy_version 74042 (0.0007) +[2023-10-09 07:11:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 150798336. Throughput: 0: 1695.6, 1: 1747.8. Samples: 37704176. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:06,053][59242] Avg episode reward: [(0, '33.420'), (1, '36.280')] +[2023-10-09 07:11:06,740][60143] Updated weights for policy 0, policy_version 73222 (0.0010) +[2023-10-09 07:11:07,117][60143] Updated weights for policy 0, policy_version 73232 (0.0010) +[2023-10-09 07:11:07,483][60143] Updated weights for policy 0, policy_version 73242 (0.0008) +[2023-10-09 07:11:07,988][60144] Updated weights for policy 1, policy_version 74052 (0.0007) +[2023-10-09 07:11:08,345][60144] Updated weights for policy 1, policy_version 74062 (0.0007) +[2023-10-09 07:11:08,717][60144] Updated weights for policy 1, policy_version 74072 (0.0007) +[2023-10-09 07:11:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 150863872. Throughput: 0: 1722.0, 1: 1724.8. Samples: 37724652. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:11,053][59242] Avg episode reward: [(0, '33.600'), (1, '36.920')] +[2023-10-09 07:11:11,615][60143] Updated weights for policy 0, policy_version 73252 (0.0009) +[2023-10-09 07:11:12,006][60143] Updated weights for policy 0, policy_version 73262 (0.0009) +[2023-10-09 07:11:12,375][60143] Updated weights for policy 0, policy_version 73272 (0.0008) +[2023-10-09 07:11:12,572][60144] Updated weights for policy 1, policy_version 74082 (0.0007) +[2023-10-09 07:11:12,946][60144] Updated weights for policy 1, policy_version 74092 (0.0008) +[2023-10-09 07:11:13,314][60144] Updated weights for policy 1, policy_version 74102 (0.0007) +[2023-10-09 07:11:13,685][60144] Updated weights for policy 1, policy_version 74112 (0.0007) +[2023-10-09 07:11:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 150929408. Throughput: 0: 1716.7, 1: 1730.4. Samples: 37745892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:16,052][59242] Avg episode reward: [(0, '32.300'), (1, '36.810')] +[2023-10-09 07:11:16,137][60143] Updated weights for policy 0, policy_version 73282 (0.0008) +[2023-10-09 07:11:16,511][60143] Updated weights for policy 0, policy_version 73292 (0.0007) +[2023-10-09 07:11:16,880][60143] Updated weights for policy 0, policy_version 73302 (0.0008) +[2023-10-09 07:11:17,253][60143] Updated weights for policy 0, policy_version 73312 (0.0007) +[2023-10-09 07:11:17,713][60144] Updated weights for policy 1, policy_version 74122 (0.0010) +[2023-10-09 07:11:18,074][60144] Updated weights for policy 1, policy_version 74132 (0.0011) +[2023-10-09 07:11:18,447][60144] Updated weights for policy 1, policy_version 74142 (0.0010) +[2023-10-09 07:11:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 150994944. Throughput: 0: 1707.1, 1: 1715.5. Samples: 37755222. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:21,053][59242] Avg episode reward: [(0, '32.970'), (1, '36.120')] +[2023-10-09 07:11:21,298][60143] Updated weights for policy 0, policy_version 73322 (0.0008) +[2023-10-09 07:11:21,667][60143] Updated weights for policy 0, policy_version 73332 (0.0008) +[2023-10-09 07:11:22,034][60143] Updated weights for policy 0, policy_version 73342 (0.0008) +[2023-10-09 07:11:22,649][60144] Updated weights for policy 1, policy_version 74152 (0.0010) +[2023-10-09 07:11:23,016][60144] Updated weights for policy 1, policy_version 74162 (0.0010) +[2023-10-09 07:11:23,396][60144] Updated weights for policy 1, policy_version 74172 (0.0010) +[2023-10-09 07:11:25,855][60143] Updated weights for policy 0, policy_version 73352 (0.0007) +[2023-10-09 07:11:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 151060480. Throughput: 0: 1725.5, 1: 1711.6. Samples: 37776546. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:26,052][59242] Avg episode reward: [(0, '31.510'), (1, '36.010')] +[2023-10-09 07:11:26,226][60143] Updated weights for policy 0, policy_version 73362 (0.0007) +[2023-10-09 07:11:26,594][60143] Updated weights for policy 0, policy_version 73372 (0.0008) +[2023-10-09 07:11:27,257][60144] Updated weights for policy 1, policy_version 74182 (0.0008) +[2023-10-09 07:11:27,623][60144] Updated weights for policy 1, policy_version 74192 (0.0007) +[2023-10-09 07:11:27,979][60144] Updated weights for policy 1, policy_version 74202 (0.0011) +[2023-10-09 07:11:30,559][60143] Updated weights for policy 0, policy_version 73382 (0.0007) +[2023-10-09 07:11:30,932][60143] Updated weights for policy 0, policy_version 73392 (0.0008) +[2023-10-09 07:11:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 151126016. Throughput: 0: 1720.6, 1: 1738.0. Samples: 37797618. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:31,053][59242] Avg episode reward: [(0, '32.080'), (1, '36.070')] +[2023-10-09 07:11:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000074208_75988992.pth... +[2023-10-09 07:11:31,093][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000072608_74350592.pth +[2023-10-09 07:11:31,307][60143] Updated weights for policy 0, policy_version 73402 (0.0007) +[2023-10-09 07:11:31,524][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000073408_75169792.pth... +[2023-10-09 07:11:31,554][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000071808_73531392.pth +[2023-10-09 07:11:31,853][60144] Updated weights for policy 1, policy_version 74212 (0.0008) +[2023-10-09 07:11:32,228][60144] Updated weights for policy 1, policy_version 74222 (0.0009) +[2023-10-09 07:11:32,609][60144] Updated weights for policy 1, policy_version 74232 (0.0011) +[2023-10-09 07:11:35,166][60143] Updated weights for policy 0, policy_version 73412 (0.0007) +[2023-10-09 07:11:35,534][60143] Updated weights for policy 0, policy_version 73422 (0.0009) +[2023-10-09 07:11:35,906][60143] Updated weights for policy 0, policy_version 73432 (0.0010) +[2023-10-09 07:11:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 151191552. Throughput: 0: 1725.3, 1: 1713.4. Samples: 37807270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:11:36,053][59242] Avg episode reward: [(0, '32.760'), (1, '35.420')] +[2023-10-09 07:11:36,488][60144] Updated weights for policy 1, policy_version 74242 (0.0008) +[2023-10-09 07:11:36,857][60144] Updated weights for policy 1, policy_version 74252 (0.0008) +[2023-10-09 07:11:37,225][60144] Updated weights for policy 1, policy_version 74262 (0.0007) +[2023-10-09 07:11:37,591][60144] Updated weights for policy 1, policy_version 74272 (0.0007) +[2023-10-09 07:11:39,914][60143] Updated weights for policy 0, policy_version 73442 (0.0008) +[2023-10-09 07:11:40,283][60143] Updated weights for policy 0, policy_version 73452 (0.0010) +[2023-10-09 07:11:40,653][60143] Updated weights for policy 0, policy_version 73462 (0.0009) +[2023-10-09 07:11:41,022][60143] Updated weights for policy 0, policy_version 73472 (0.0007) +[2023-10-09 07:11:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 151289856. Throughput: 0: 1724.8, 1: 1729.5. Samples: 37828584. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:11:41,053][59242] Avg episode reward: [(0, '30.700'), (1, '36.270')] +[2023-10-09 07:11:41,530][60144] Updated weights for policy 1, policy_version 74282 (0.0008) +[2023-10-09 07:11:41,893][60144] Updated weights for policy 1, policy_version 74292 (0.0008) +[2023-10-09 07:11:42,260][60144] Updated weights for policy 1, policy_version 74302 (0.0009) +[2023-10-09 07:11:44,950][60143] Updated weights for policy 0, policy_version 73482 (0.0007) +[2023-10-09 07:11:45,322][60143] Updated weights for policy 0, policy_version 73492 (0.0007) +[2023-10-09 07:11:45,687][60143] Updated weights for policy 0, policy_version 73502 (0.0009) +[2023-10-09 07:11:46,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 151355392. Throughput: 0: 1706.2, 1: 1741.9. Samples: 37849082. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:11:46,052][59242] Avg episode reward: [(0, '31.200'), (1, '36.260')] +[2023-10-09 07:11:46,225][60144] Updated weights for policy 1, policy_version 74312 (0.0008) +[2023-10-09 07:11:46,587][60144] Updated weights for policy 1, policy_version 74322 (0.0008) +[2023-10-09 07:11:46,967][60144] Updated weights for policy 1, policy_version 74332 (0.0008) +[2023-10-09 07:11:49,619][60143] Updated weights for policy 0, policy_version 73512 (0.0008) +[2023-10-09 07:11:49,986][60143] Updated weights for policy 0, policy_version 73522 (0.0008) +[2023-10-09 07:11:50,353][60143] Updated weights for policy 0, policy_version 73532 (0.0010) +[2023-10-09 07:11:50,955][60144] Updated weights for policy 1, policy_version 74342 (0.0008) +[2023-10-09 07:11:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 151420928. Throughput: 0: 1730.4, 1: 1719.0. Samples: 37859398. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:11:51,053][59242] Avg episode reward: [(0, '32.080'), (1, '36.610')] +[2023-10-09 07:11:51,353][60144] Updated weights for policy 1, policy_version 74352 (0.0008) +[2023-10-09 07:11:51,718][60144] Updated weights for policy 1, policy_version 74362 (0.0009) +[2023-10-09 07:11:54,475][60143] Updated weights for policy 0, policy_version 73542 (0.0009) +[2023-10-09 07:11:54,846][60143] Updated weights for policy 0, policy_version 73552 (0.0007) +[2023-10-09 07:11:55,219][60143] Updated weights for policy 0, policy_version 73562 (0.0007) +[2023-10-09 07:11:55,597][60144] Updated weights for policy 1, policy_version 74372 (0.0008) +[2023-10-09 07:11:55,965][60144] Updated weights for policy 1, policy_version 74382 (0.0010) +[2023-10-09 07:11:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 151486464. Throughput: 0: 1724.8, 1: 1740.0. Samples: 37880566. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:11:56,053][59242] Avg episode reward: [(0, '31.860'), (1, '37.050')] +[2023-10-09 07:11:56,335][60144] Updated weights for policy 1, policy_version 74392 (0.0008) +[2023-10-09 07:11:59,131][60143] Updated weights for policy 0, policy_version 73572 (0.0008) +[2023-10-09 07:11:59,525][60143] Updated weights for policy 0, policy_version 73582 (0.0010) +[2023-10-09 07:11:59,898][60143] Updated weights for policy 0, policy_version 73592 (0.0008) +[2023-10-09 07:12:00,095][60144] Updated weights for policy 1, policy_version 74402 (0.0007) +[2023-10-09 07:12:00,461][60144] Updated weights for policy 1, policy_version 74412 (0.0010) +[2023-10-09 07:12:00,823][60144] Updated weights for policy 1, policy_version 74422 (0.0008) +[2023-10-09 07:12:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 151552000. Throughput: 0: 1697.2, 1: 1731.4. Samples: 37900182. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:12:01,053][59242] Avg episode reward: [(0, '31.210'), (1, '35.640')] +[2023-10-09 07:12:01,190][60144] Updated weights for policy 1, policy_version 74432 (0.0009) +[2023-10-09 07:12:03,918][60143] Updated weights for policy 0, policy_version 73602 (0.0008) +[2023-10-09 07:12:04,286][60143] Updated weights for policy 0, policy_version 73612 (0.0011) +[2023-10-09 07:12:04,652][60143] Updated weights for policy 0, policy_version 73622 (0.0010) +[2023-10-09 07:12:05,021][60143] Updated weights for policy 0, policy_version 73632 (0.0008) +[2023-10-09 07:12:05,082][60144] Updated weights for policy 1, policy_version 74442 (0.0008) +[2023-10-09 07:12:05,439][60144] Updated weights for policy 1, policy_version 74452 (0.0009) +[2023-10-09 07:12:05,800][60144] Updated weights for policy 1, policy_version 74462 (0.0010) +[2023-10-09 07:12:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 151650304. 
Throughput: 0: 1723.1, 1: 1749.0. Samples: 37911468. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:12:06,053][59242] Avg episode reward: [(0, '32.920'), (1, '36.420')] +[2023-10-09 07:12:08,881][60143] Updated weights for policy 0, policy_version 73642 (0.0010) +[2023-10-09 07:12:09,248][60143] Updated weights for policy 0, policy_version 73652 (0.0010) +[2023-10-09 07:12:09,627][60143] Updated weights for policy 0, policy_version 73662 (0.0010) +[2023-10-09 07:12:09,861][60144] Updated weights for policy 1, policy_version 74472 (0.0010) +[2023-10-09 07:12:10,239][60144] Updated weights for policy 1, policy_version 74482 (0.0009) +[2023-10-09 07:12:10,611][60144] Updated weights for policy 1, policy_version 74492 (0.0007) +[2023-10-09 07:12:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 151715840. Throughput: 0: 1697.5, 1: 1758.2. Samples: 37932056. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:12:11,053][59242] Avg episode reward: [(0, '31.530'), (1, '36.310')] +[2023-10-09 07:12:13,644][60143] Updated weights for policy 0, policy_version 73672 (0.0007) +[2023-10-09 07:12:14,017][60143] Updated weights for policy 0, policy_version 73682 (0.0009) +[2023-10-09 07:12:14,380][60143] Updated weights for policy 0, policy_version 73692 (0.0008) +[2023-10-09 07:12:14,458][60144] Updated weights for policy 1, policy_version 74502 (0.0008) +[2023-10-09 07:12:14,820][60144] Updated weights for policy 1, policy_version 74512 (0.0010) +[2023-10-09 07:12:15,192][60144] Updated weights for policy 1, policy_version 74522 (0.0009) +[2023-10-09 07:12:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 151781376. Throughput: 0: 1699.3, 1: 1724.4. Samples: 37951686. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:12:16,053][59242] Avg episode reward: [(0, '32.090'), (1, '36.140')] +[2023-10-09 07:12:18,331][60143] Updated weights for policy 0, policy_version 73702 (0.0008) +[2023-10-09 07:12:18,697][60143] Updated weights for policy 0, policy_version 73712 (0.0009) +[2023-10-09 07:12:19,069][60143] Updated weights for policy 0, policy_version 73722 (0.0009) +[2023-10-09 07:12:19,117][60144] Updated weights for policy 1, policy_version 74532 (0.0009) +[2023-10-09 07:12:19,485][60144] Updated weights for policy 1, policy_version 74542 (0.0007) +[2023-10-09 07:12:19,853][60144] Updated weights for policy 1, policy_version 74552 (0.0007) +[2023-10-09 07:12:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 151846912. Throughput: 0: 1714.1, 1: 1756.6. Samples: 37963452. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 07:12:21,053][59242] Avg episode reward: [(0, '32.350'), (1, '35.450')] +[2023-10-09 07:12:23,122][60143] Updated weights for policy 0, policy_version 73732 (0.0007) +[2023-10-09 07:12:23,489][60143] Updated weights for policy 0, policy_version 73742 (0.0007) +[2023-10-09 07:12:23,814][60144] Updated weights for policy 1, policy_version 74562 (0.0009) +[2023-10-09 07:12:23,854][60143] Updated weights for policy 0, policy_version 73752 (0.0008) +[2023-10-09 07:12:24,184][60144] Updated weights for policy 1, policy_version 74572 (0.0009) +[2023-10-09 07:12:24,550][60144] Updated weights for policy 1, policy_version 74582 (0.0010) +[2023-10-09 07:12:24,921][60144] Updated weights for policy 1, policy_version 74592 (0.0011) +[2023-10-09 07:12:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 151912448. Throughput: 0: 1700.3, 1: 1735.7. Samples: 37983204. 
Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:26,053][59242] Avg episode reward: [(0, '33.350'), (1, '36.010')] +[2023-10-09 07:12:27,813][60143] Updated weights for policy 0, policy_version 73762 (0.0007) +[2023-10-09 07:12:28,183][60143] Updated weights for policy 0, policy_version 73772 (0.0009) +[2023-10-09 07:12:28,549][60143] Updated weights for policy 0, policy_version 73782 (0.0009) +[2023-10-09 07:12:28,783][60144] Updated weights for policy 1, policy_version 74602 (0.0008) +[2023-10-09 07:12:28,914][60143] Updated weights for policy 0, policy_version 73792 (0.0008) +[2023-10-09 07:12:29,151][60144] Updated weights for policy 1, policy_version 74612 (0.0011) +[2023-10-09 07:12:29,516][60144] Updated weights for policy 1, policy_version 74622 (0.0010) +[2023-10-09 07:12:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 151977984. Throughput: 0: 1719.7, 1: 1729.2. Samples: 38004282. Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:31,053][59242] Avg episode reward: [(0, '32.760'), (1, '36.280')] +[2023-10-09 07:12:32,733][60143] Updated weights for policy 0, policy_version 73802 (0.0009) +[2023-10-09 07:12:33,097][60143] Updated weights for policy 0, policy_version 73812 (0.0010) +[2023-10-09 07:12:33,383][60144] Updated weights for policy 1, policy_version 74632 (0.0010) +[2023-10-09 07:12:33,469][60143] Updated weights for policy 0, policy_version 73822 (0.0010) +[2023-10-09 07:12:33,744][60144] Updated weights for policy 1, policy_version 74642 (0.0007) +[2023-10-09 07:12:34,108][60144] Updated weights for policy 1, policy_version 74652 (0.0010) +[2023-10-09 07:12:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 152043520. Throughput: 0: 1698.9, 1: 1749.4. Samples: 38014574. 
Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:36,053][59242] Avg episode reward: [(0, '32.110'), (1, '35.160')] +[2023-10-09 07:12:37,596][60143] Updated weights for policy 0, policy_version 73832 (0.0009) +[2023-10-09 07:12:37,974][60143] Updated weights for policy 0, policy_version 73842 (0.0009) +[2023-10-09 07:12:38,063][60144] Updated weights for policy 1, policy_version 74662 (0.0009) +[2023-10-09 07:12:38,339][60143] Updated weights for policy 0, policy_version 73852 (0.0007) +[2023-10-09 07:12:38,417][60144] Updated weights for policy 1, policy_version 74672 (0.0008) +[2023-10-09 07:12:38,782][60144] Updated weights for policy 1, policy_version 74682 (0.0007) +[2023-10-09 07:12:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 152109056. Throughput: 0: 1700.5, 1: 1731.1. Samples: 38034988. Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:41,053][59242] Avg episode reward: [(0, '33.440'), (1, '34.940')] +[2023-10-09 07:12:42,452][60143] Updated weights for policy 0, policy_version 73862 (0.0009) +[2023-10-09 07:12:42,749][60144] Updated weights for policy 1, policy_version 74692 (0.0007) +[2023-10-09 07:12:42,822][60143] Updated weights for policy 0, policy_version 73872 (0.0009) +[2023-10-09 07:12:43,161][60144] Updated weights for policy 1, policy_version 74702 (0.0007) +[2023-10-09 07:12:43,193][60143] Updated weights for policy 0, policy_version 73882 (0.0008) +[2023-10-09 07:12:43,525][60144] Updated weights for policy 1, policy_version 74712 (0.0007) +[2023-10-09 07:12:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 152174592. Throughput: 0: 1719.7, 1: 1736.3. Samples: 38055704. 
Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:46,053][59242] Avg episode reward: [(0, '30.320'), (1, '33.530')] +[2023-10-09 07:12:47,345][60144] Updated weights for policy 1, policy_version 74722 (0.0009) +[2023-10-09 07:12:47,477][60143] Updated weights for policy 0, policy_version 73892 (0.0007) +[2023-10-09 07:12:47,698][60144] Updated weights for policy 1, policy_version 74732 (0.0010) +[2023-10-09 07:12:47,874][60143] Updated weights for policy 0, policy_version 73902 (0.0009) +[2023-10-09 07:12:48,060][60144] Updated weights for policy 1, policy_version 74742 (0.0007) +[2023-10-09 07:12:48,234][60143] Updated weights for policy 0, policy_version 73912 (0.0008) +[2023-10-09 07:12:48,421][60144] Updated weights for policy 1, policy_version 74752 (0.0008) +[2023-10-09 07:12:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 152240128. Throughput: 0: 1688.9, 1: 1722.6. Samples: 38064986. Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:51,052][59242] Avg episode reward: [(0, '31.220'), (1, '32.630')] +[2023-10-09 07:12:52,072][60143] Updated weights for policy 0, policy_version 73922 (0.0010) +[2023-10-09 07:12:52,323][60144] Updated weights for policy 1, policy_version 74762 (0.0008) +[2023-10-09 07:12:52,444][60143] Updated weights for policy 0, policy_version 73932 (0.0007) +[2023-10-09 07:12:52,690][60144] Updated weights for policy 1, policy_version 74772 (0.0009) +[2023-10-09 07:12:52,812][60143] Updated weights for policy 0, policy_version 73942 (0.0008) +[2023-10-09 07:12:53,060][60144] Updated weights for policy 1, policy_version 74782 (0.0009) +[2023-10-09 07:12:53,177][60143] Updated weights for policy 0, policy_version 73952 (0.0010) +[2023-10-09 07:12:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 152305664. Throughput: 0: 1705.6, 1: 1718.0. Samples: 38086114. 
Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:12:56,053][59242] Avg episode reward: [(0, '32.550'), (1, '33.390')] +[2023-10-09 07:12:57,169][60144] Updated weights for policy 1, policy_version 74792 (0.0008) +[2023-10-09 07:12:57,215][60143] Updated weights for policy 0, policy_version 73962 (0.0007) +[2023-10-09 07:12:57,544][60144] Updated weights for policy 1, policy_version 74802 (0.0009) +[2023-10-09 07:12:57,581][60143] Updated weights for policy 0, policy_version 73972 (0.0009) +[2023-10-09 07:12:57,904][60144] Updated weights for policy 1, policy_version 74812 (0.0007) +[2023-10-09 07:12:57,950][60143] Updated weights for policy 0, policy_version 73982 (0.0009) +[2023-10-09 07:13:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 152371200. Throughput: 0: 1709.3, 1: 1746.7. Samples: 38107204. Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:13:01,053][59242] Avg episode reward: [(0, '31.170'), (1, '34.600')] +[2023-10-09 07:13:01,860][60144] Updated weights for policy 1, policy_version 74822 (0.0008) +[2023-10-09 07:13:01,863][60143] Updated weights for policy 0, policy_version 73992 (0.0007) +[2023-10-09 07:13:02,224][60144] Updated weights for policy 1, policy_version 74832 (0.0007) +[2023-10-09 07:13:02,234][60143] Updated weights for policy 0, policy_version 74002 (0.0007) +[2023-10-09 07:13:02,596][60144] Updated weights for policy 1, policy_version 74842 (0.0009) +[2023-10-09 07:13:02,600][60143] Updated weights for policy 0, policy_version 74012 (0.0007) +[2023-10-09 07:13:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 152436736. Throughput: 0: 1693.7, 1: 1709.4. Samples: 38116592. 
Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:13:06,053][59242] Avg episode reward: [(0, '32.170'), (1, '35.250')] +[2023-10-09 07:13:06,524][60144] Updated weights for policy 1, policy_version 74852 (0.0009) +[2023-10-09 07:13:06,579][60143] Updated weights for policy 0, policy_version 74022 (0.0007) +[2023-10-09 07:13:06,889][60144] Updated weights for policy 1, policy_version 74862 (0.0009) +[2023-10-09 07:13:06,941][60143] Updated weights for policy 0, policy_version 74032 (0.0008) +[2023-10-09 07:13:07,256][60144] Updated weights for policy 1, policy_version 74872 (0.0008) +[2023-10-09 07:13:07,314][60143] Updated weights for policy 0, policy_version 74042 (0.0008) +[2023-10-09 07:13:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 152502272. Throughput: 0: 1706.6, 1: 1731.9. Samples: 38137938. Policy #0 lag: (min: 9.0, avg: 21.7, max: 41.0) +[2023-10-09 07:13:11,052][59242] Avg episode reward: [(0, '31.620'), (1, '33.990')] +[2023-10-09 07:13:11,258][60144] Updated weights for policy 1, policy_version 74882 (0.0009) +[2023-10-09 07:13:11,355][60143] Updated weights for policy 0, policy_version 74052 (0.0008) +[2023-10-09 07:13:11,627][60144] Updated weights for policy 1, policy_version 74892 (0.0007) +[2023-10-09 07:13:11,722][60143] Updated weights for policy 0, policy_version 74062 (0.0009) +[2023-10-09 07:13:11,995][60144] Updated weights for policy 1, policy_version 74902 (0.0008) +[2023-10-09 07:13:12,093][60143] Updated weights for policy 0, policy_version 74072 (0.0007) +[2023-10-09 07:13:12,351][60144] Updated weights for policy 1, policy_version 74912 (0.0007) +[2023-10-09 07:13:16,034][60143] Updated weights for policy 0, policy_version 74082 (0.0007) +[2023-10-09 07:13:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 152567808. Throughput: 0: 1704.8, 1: 1739.9. Samples: 38159294. 
Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:16,052][59242] Avg episode reward: [(0, '31.320'), (1, '34.480')] +[2023-10-09 07:13:16,190][60144] Updated weights for policy 1, policy_version 74922 (0.0007) +[2023-10-09 07:13:16,401][60143] Updated weights for policy 0, policy_version 74092 (0.0008) +[2023-10-09 07:13:16,554][60144] Updated weights for policy 1, policy_version 74932 (0.0007) +[2023-10-09 07:13:16,766][60143] Updated weights for policy 0, policy_version 74102 (0.0010) +[2023-10-09 07:13:16,917][60144] Updated weights for policy 1, policy_version 74942 (0.0007) +[2023-10-09 07:13:17,140][60143] Updated weights for policy 0, policy_version 74112 (0.0009) +[2023-10-09 07:13:20,896][60144] Updated weights for policy 1, policy_version 74952 (0.0008) +[2023-10-09 07:13:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 152633344. Throughput: 0: 1698.9, 1: 1719.9. Samples: 38168420. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:21,053][59242] Avg episode reward: [(0, '32.320'), (1, '34.490')] +[2023-10-09 07:13:21,131][60143] Updated weights for policy 0, policy_version 74122 (0.0010) +[2023-10-09 07:13:21,265][60144] Updated weights for policy 1, policy_version 74962 (0.0009) +[2023-10-09 07:13:21,506][60143] Updated weights for policy 0, policy_version 74132 (0.0007) +[2023-10-09 07:13:21,632][60144] Updated weights for policy 1, policy_version 74972 (0.0010) +[2023-10-09 07:13:21,865][60143] Updated weights for policy 0, policy_version 74142 (0.0008) +[2023-10-09 07:13:25,337][60144] Updated weights for policy 1, policy_version 74982 (0.0007) +[2023-10-09 07:13:25,697][60144] Updated weights for policy 1, policy_version 74992 (0.0008) +[2023-10-09 07:13:25,942][60143] Updated weights for policy 0, policy_version 74152 (0.0009) +[2023-10-09 07:13:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 152698880. 
Throughput: 0: 1698.5, 1: 1737.6. Samples: 38189614. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:26,053][59242] Avg episode reward: [(0, '31.550'), (1, '33.270')] +[2023-10-09 07:13:26,065][60144] Updated weights for policy 1, policy_version 75002 (0.0007) +[2023-10-09 07:13:26,306][60143] Updated weights for policy 0, policy_version 74162 (0.0008) +[2023-10-09 07:13:26,683][60143] Updated weights for policy 0, policy_version 74172 (0.0010) +[2023-10-09 07:13:30,090][60144] Updated weights for policy 1, policy_version 75012 (0.0007) +[2023-10-09 07:13:30,485][60144] Updated weights for policy 1, policy_version 75022 (0.0009) +[2023-10-09 07:13:30,628][60143] Updated weights for policy 0, policy_version 74182 (0.0009) +[2023-10-09 07:13:30,845][60144] Updated weights for policy 1, policy_version 75032 (0.0007) +[2023-10-09 07:13:30,993][60143] Updated weights for policy 0, policy_version 74192 (0.0008) +[2023-10-09 07:13:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 152764416. Throughput: 0: 1700.6, 1: 1728.5. Samples: 38210014. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:31,053][59242] Avg episode reward: [(0, '31.750'), (1, '33.650')] +[2023-10-09 07:13:31,134][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000075040_76840960.pth... +[2023-10-09 07:13:31,163][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000073408_75169792.pth +[2023-10-09 07:13:31,369][60143] Updated weights for policy 0, policy_version 74202 (0.0008) +[2023-10-09 07:13:31,584][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000074208_75988992.pth... 
+[2023-10-09 07:13:31,620][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000072608_74350592.pth +[2023-10-09 07:13:34,713][60144] Updated weights for policy 1, policy_version 75042 (0.0010) +[2023-10-09 07:13:35,079][60144] Updated weights for policy 1, policy_version 75052 (0.0009) +[2023-10-09 07:13:35,418][60143] Updated weights for policy 0, policy_version 74212 (0.0010) +[2023-10-09 07:13:35,443][60144] Updated weights for policy 1, policy_version 75062 (0.0009) +[2023-10-09 07:13:35,806][60143] Updated weights for policy 0, policy_version 74222 (0.0009) +[2023-10-09 07:13:35,810][60144] Updated weights for policy 1, policy_version 75072 (0.0007) +[2023-10-09 07:13:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 152862720. Throughput: 0: 1703.7, 1: 1743.7. Samples: 38220122. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:36,053][59242] Avg episode reward: [(0, '32.560'), (1, '34.480')] +[2023-10-09 07:13:36,184][60143] Updated weights for policy 0, policy_version 74232 (0.0008) +[2023-10-09 07:13:39,698][60144] Updated weights for policy 1, policy_version 75082 (0.0009) +[2023-10-09 07:13:40,058][60144] Updated weights for policy 1, policy_version 75092 (0.0007) +[2023-10-09 07:13:40,204][60143] Updated weights for policy 0, policy_version 74242 (0.0007) +[2023-10-09 07:13:40,418][60144] Updated weights for policy 1, policy_version 75102 (0.0010) +[2023-10-09 07:13:40,571][60143] Updated weights for policy 0, policy_version 74252 (0.0009) +[2023-10-09 07:13:40,941][60143] Updated weights for policy 0, policy_version 74262 (0.0007) +[2023-10-09 07:13:41,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 152928256. Throughput: 0: 1708.0, 1: 1740.6. Samples: 38241300. 
Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:41,052][59242] Avg episode reward: [(0, '33.000'), (1, '32.760')] +[2023-10-09 07:13:41,304][60143] Updated weights for policy 0, policy_version 74272 (0.0007) +[2023-10-09 07:13:44,165][60144] Updated weights for policy 1, policy_version 75112 (0.0008) +[2023-10-09 07:13:44,536][60144] Updated weights for policy 1, policy_version 75122 (0.0008) +[2023-10-09 07:13:44,900][60144] Updated weights for policy 1, policy_version 75132 (0.0007) +[2023-10-09 07:13:45,246][60143] Updated weights for policy 0, policy_version 74282 (0.0009) +[2023-10-09 07:13:45,622][60143] Updated weights for policy 0, policy_version 74292 (0.0011) +[2023-10-09 07:13:45,984][60143] Updated weights for policy 0, policy_version 74302 (0.0008) +[2023-10-09 07:13:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 152993792. Throughput: 0: 1694.1, 1: 1723.9. Samples: 38261012. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:46,053][59242] Avg episode reward: [(0, '31.570'), (1, '32.800')] +[2023-10-09 07:13:48,955][60144] Updated weights for policy 1, policy_version 75142 (0.0009) +[2023-10-09 07:13:49,325][60144] Updated weights for policy 1, policy_version 75152 (0.0009) +[2023-10-09 07:13:49,697][60144] Updated weights for policy 1, policy_version 75162 (0.0007) +[2023-10-09 07:13:50,058][60143] Updated weights for policy 0, policy_version 74312 (0.0010) +[2023-10-09 07:13:50,426][60143] Updated weights for policy 0, policy_version 74322 (0.0009) +[2023-10-09 07:13:50,790][60143] Updated weights for policy 0, policy_version 74332 (0.0008) +[2023-10-09 07:13:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 153092096. Throughput: 0: 1701.4, 1: 1759.0. Samples: 38272310. 
Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:51,052][59242] Avg episode reward: [(0, '30.750'), (1, '33.160')] +[2023-10-09 07:13:53,454][60144] Updated weights for policy 1, policy_version 75172 (0.0008) +[2023-10-09 07:13:53,821][60144] Updated weights for policy 1, policy_version 75182 (0.0010) +[2023-10-09 07:13:54,186][60144] Updated weights for policy 1, policy_version 75192 (0.0011) +[2023-10-09 07:13:54,873][60143] Updated weights for policy 0, policy_version 74342 (0.0008) +[2023-10-09 07:13:55,238][60143] Updated weights for policy 0, policy_version 74352 (0.0007) +[2023-10-09 07:13:55,612][60143] Updated weights for policy 0, policy_version 74362 (0.0010) +[2023-10-09 07:13:56,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 153157632. Throughput: 0: 1705.8, 1: 1727.4. Samples: 38292432. Policy #0 lag: (min: 22.0, avg: 24.9, max: 54.0) +[2023-10-09 07:13:56,052][59242] Avg episode reward: [(0, '31.080'), (1, '34.130')] +[2023-10-09 07:13:58,241][60144] Updated weights for policy 1, policy_version 75202 (0.0010) +[2023-10-09 07:13:58,610][60144] Updated weights for policy 1, policy_version 75212 (0.0007) +[2023-10-09 07:13:58,980][60144] Updated weights for policy 1, policy_version 75222 (0.0010) +[2023-10-09 07:13:59,338][60144] Updated weights for policy 1, policy_version 75232 (0.0008) +[2023-10-09 07:13:59,449][60143] Updated weights for policy 0, policy_version 74372 (0.0007) +[2023-10-09 07:13:59,817][60143] Updated weights for policy 0, policy_version 74382 (0.0007) +[2023-10-09 07:14:00,177][60143] Updated weights for policy 0, policy_version 74392 (0.0008) +[2023-10-09 07:14:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 153223168. Throughput: 0: 1682.0, 1: 1723.0. Samples: 38312516. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:01,053][59242] Avg episode reward: [(0, '30.060'), (1, '34.560')] +[2023-10-09 07:14:03,310][60144] Updated weights for policy 1, policy_version 75242 (0.0009) +[2023-10-09 07:14:03,672][60144] Updated weights for policy 1, policy_version 75252 (0.0008) +[2023-10-09 07:14:04,037][60144] Updated weights for policy 1, policy_version 75262 (0.0011) +[2023-10-09 07:14:04,233][60143] Updated weights for policy 0, policy_version 74402 (0.0008) +[2023-10-09 07:14:04,601][60143] Updated weights for policy 0, policy_version 74412 (0.0010) +[2023-10-09 07:14:04,970][60143] Updated weights for policy 0, policy_version 74422 (0.0010) +[2023-10-09 07:14:05,338][60143] Updated weights for policy 0, policy_version 74432 (0.0009) +[2023-10-09 07:14:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 153288704. Throughput: 0: 1712.8, 1: 1737.6. Samples: 38323688. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:06,052][59242] Avg episode reward: [(0, '31.630'), (1, '34.180')] +[2023-10-09 07:14:07,923][60144] Updated weights for policy 1, policy_version 75272 (0.0010) +[2023-10-09 07:14:08,292][60144] Updated weights for policy 1, policy_version 75282 (0.0008) +[2023-10-09 07:14:08,662][60144] Updated weights for policy 1, policy_version 75292 (0.0007) +[2023-10-09 07:14:09,505][60143] Updated weights for policy 0, policy_version 74442 (0.0009) +[2023-10-09 07:14:09,879][60143] Updated weights for policy 0, policy_version 74452 (0.0008) +[2023-10-09 07:14:10,253][60143] Updated weights for policy 0, policy_version 74462 (0.0009) +[2023-10-09 07:14:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 153354240. Throughput: 0: 1704.8, 1: 1729.1. Samples: 38344140. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:11,053][59242] Avg episode reward: [(0, '30.860'), (1, '34.590')] +[2023-10-09 07:14:12,544][60144] Updated weights for policy 1, policy_version 75302 (0.0007) +[2023-10-09 07:14:12,907][60144] Updated weights for policy 1, policy_version 75312 (0.0007) +[2023-10-09 07:14:13,279][60144] Updated weights for policy 1, policy_version 75322 (0.0007) +[2023-10-09 07:14:14,112][60143] Updated weights for policy 0, policy_version 74472 (0.0011) +[2023-10-09 07:14:14,490][60143] Updated weights for policy 0, policy_version 74482 (0.0011) +[2023-10-09 07:14:14,856][60143] Updated weights for policy 0, policy_version 74492 (0.0010) +[2023-10-09 07:14:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 153419776. Throughput: 0: 1689.1, 1: 1744.7. Samples: 38364534. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:16,053][59242] Avg episode reward: [(0, '31.570'), (1, '36.080')] +[2023-10-09 07:14:17,139][60144] Updated weights for policy 1, policy_version 75332 (0.0008) +[2023-10-09 07:14:17,535][60144] Updated weights for policy 1, policy_version 75342 (0.0008) +[2023-10-09 07:14:17,893][60144] Updated weights for policy 1, policy_version 75352 (0.0009) +[2023-10-09 07:14:18,830][60143] Updated weights for policy 0, policy_version 74502 (0.0010) +[2023-10-09 07:14:19,198][60143] Updated weights for policy 0, policy_version 74512 (0.0008) +[2023-10-09 07:14:19,568][60143] Updated weights for policy 0, policy_version 74522 (0.0009) +[2023-10-09 07:14:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 153485312. Throughput: 0: 1723.9, 1: 1725.6. Samples: 38375350. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:21,053][59242] Avg episode reward: [(0, '30.600'), (1, '35.520')] +[2023-10-09 07:14:21,833][60144] Updated weights for policy 1, policy_version 75362 (0.0008) +[2023-10-09 07:14:22,204][60144] Updated weights for policy 1, policy_version 75372 (0.0008) +[2023-10-09 07:14:22,568][60144] Updated weights for policy 1, policy_version 75382 (0.0009) +[2023-10-09 07:14:22,940][60144] Updated weights for policy 1, policy_version 75392 (0.0007) +[2023-10-09 07:14:23,608][60143] Updated weights for policy 0, policy_version 74532 (0.0007) +[2023-10-09 07:14:23,995][60143] Updated weights for policy 0, policy_version 74542 (0.0008) +[2023-10-09 07:14:24,375][60143] Updated weights for policy 0, policy_version 74552 (0.0009) +[2023-10-09 07:14:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 153550848. Throughput: 0: 1694.6, 1: 1730.0. Samples: 38395408. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:26,053][59242] Avg episode reward: [(0, '30.820'), (1, '35.430')] +[2023-10-09 07:14:26,871][60144] Updated weights for policy 1, policy_version 75402 (0.0010) +[2023-10-09 07:14:27,237][60144] Updated weights for policy 1, policy_version 75412 (0.0010) +[2023-10-09 07:14:27,611][60144] Updated weights for policy 1, policy_version 75422 (0.0008) +[2023-10-09 07:14:28,136][60143] Updated weights for policy 0, policy_version 74562 (0.0009) +[2023-10-09 07:14:28,505][60143] Updated weights for policy 0, policy_version 74572 (0.0007) +[2023-10-09 07:14:28,867][60143] Updated weights for policy 0, policy_version 74582 (0.0008) +[2023-10-09 07:14:29,235][60143] Updated weights for policy 0, policy_version 74592 (0.0010) +[2023-10-09 07:14:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 153616384. Throughput: 0: 1708.1, 1: 1747.3. Samples: 38416508. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:31,053][59242] Avg episode reward: [(0, '32.260'), (1, '36.130')] +[2023-10-09 07:14:31,674][60144] Updated weights for policy 1, policy_version 75432 (0.0008) +[2023-10-09 07:14:32,044][60144] Updated weights for policy 1, policy_version 75442 (0.0007) +[2023-10-09 07:14:32,402][60144] Updated weights for policy 1, policy_version 75452 (0.0010) +[2023-10-09 07:14:33,203][60143] Updated weights for policy 0, policy_version 74602 (0.0007) +[2023-10-09 07:14:33,571][60143] Updated weights for policy 0, policy_version 74612 (0.0009) +[2023-10-09 07:14:33,934][60143] Updated weights for policy 0, policy_version 74622 (0.0008) +[2023-10-09 07:14:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 153681920. Throughput: 0: 1711.1, 1: 1712.6. Samples: 38426376. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:36,053][59242] Avg episode reward: [(0, '32.770'), (1, '35.140')] +[2023-10-09 07:14:36,429][60144] Updated weights for policy 1, policy_version 75462 (0.0010) +[2023-10-09 07:14:36,805][60144] Updated weights for policy 1, policy_version 75472 (0.0011) +[2023-10-09 07:14:37,173][60144] Updated weights for policy 1, policy_version 75482 (0.0009) +[2023-10-09 07:14:37,927][60143] Updated weights for policy 0, policy_version 74632 (0.0007) +[2023-10-09 07:14:38,297][60143] Updated weights for policy 0, policy_version 74642 (0.0009) +[2023-10-09 07:14:38,672][60143] Updated weights for policy 0, policy_version 74652 (0.0009) +[2023-10-09 07:14:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 153747456. Throughput: 0: 1696.4, 1: 1735.9. Samples: 38446886. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:41,053][59242] Avg episode reward: [(0, '32.070'), (1, '34.620')] +[2023-10-09 07:14:41,205][60144] Updated weights for policy 1, policy_version 75492 (0.0007) +[2023-10-09 07:14:41,570][60144] Updated weights for policy 1, policy_version 75502 (0.0008) +[2023-10-09 07:14:41,931][60144] Updated weights for policy 1, policy_version 75512 (0.0010) +[2023-10-09 07:14:42,530][60143] Updated weights for policy 0, policy_version 74662 (0.0009) +[2023-10-09 07:14:42,891][60143] Updated weights for policy 0, policy_version 74672 (0.0009) +[2023-10-09 07:14:43,257][60143] Updated weights for policy 0, policy_version 74682 (0.0007) +[2023-10-09 07:14:45,617][60144] Updated weights for policy 1, policy_version 75522 (0.0009) +[2023-10-09 07:14:45,981][60144] Updated weights for policy 1, policy_version 75532 (0.0009) +[2023-10-09 07:14:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 153812992. Throughput: 0: 1717.5, 1: 1741.2. Samples: 38468156. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-09 07:14:46,053][59242] Avg episode reward: [(0, '33.540'), (1, '35.580')] +[2023-10-09 07:14:46,343][60144] Updated weights for policy 1, policy_version 75542 (0.0008) +[2023-10-09 07:14:46,714][60144] Updated weights for policy 1, policy_version 75552 (0.0007) +[2023-10-09 07:14:47,333][60143] Updated weights for policy 0, policy_version 74692 (0.0008) +[2023-10-09 07:14:47,700][60143] Updated weights for policy 0, policy_version 74702 (0.0011) +[2023-10-09 07:14:48,072][60143] Updated weights for policy 0, policy_version 74712 (0.0011) +[2023-10-09 07:14:50,718][60144] Updated weights for policy 1, policy_version 75562 (0.0010) +[2023-10-09 07:14:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13773.7). Total num frames: 153878528. Throughput: 0: 1691.3, 1: 1724.8. Samples: 38477414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:14:51,053][59242] Avg episode reward: [(0, '31.520'), (1, '35.390')] +[2023-10-09 07:14:51,083][60144] Updated weights for policy 1, policy_version 75572 (0.0007) +[2023-10-09 07:14:51,448][60144] Updated weights for policy 1, policy_version 75582 (0.0007) +[2023-10-09 07:14:52,168][60143] Updated weights for policy 0, policy_version 74722 (0.0009) +[2023-10-09 07:14:52,540][60143] Updated weights for policy 0, policy_version 74732 (0.0008) +[2023-10-09 07:14:52,909][60143] Updated weights for policy 0, policy_version 74742 (0.0007) +[2023-10-09 07:14:53,275][60143] Updated weights for policy 0, policy_version 74752 (0.0007) +[2023-10-09 07:14:55,368][60144] Updated weights for policy 1, policy_version 75592 (0.0007) +[2023-10-09 07:14:55,732][60144] Updated weights for policy 1, policy_version 75602 (0.0008) +[2023-10-09 07:14:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 153944064. Throughput: 0: 1699.2, 1: 1729.5. Samples: 38498428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:14:56,053][59242] Avg episode reward: [(0, '31.390'), (1, '35.000')] +[2023-10-09 07:14:56,102][60144] Updated weights for policy 1, policy_version 75612 (0.0007) +[2023-10-09 07:14:57,257][60143] Updated weights for policy 0, policy_version 74762 (0.0008) +[2023-10-09 07:14:57,630][60143] Updated weights for policy 0, policy_version 74772 (0.0009) +[2023-10-09 07:14:57,999][60143] Updated weights for policy 0, policy_version 74782 (0.0009) +[2023-10-09 07:15:00,310][60144] Updated weights for policy 1, policy_version 75622 (0.0008) +[2023-10-09 07:15:00,678][60144] Updated weights for policy 1, policy_version 75632 (0.0009) +[2023-10-09 07:15:01,038][60144] Updated weights for policy 1, policy_version 75642 (0.0008) +[2023-10-09 07:15:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 154009600. 
Throughput: 0: 1724.6, 1: 1714.2. Samples: 38519280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:01,053][59242] Avg episode reward: [(0, '31.600'), (1, '34.470')] +[2023-10-09 07:15:02,051][60143] Updated weights for policy 0, policy_version 74792 (0.0009) +[2023-10-09 07:15:02,433][60143] Updated weights for policy 0, policy_version 74802 (0.0007) +[2023-10-09 07:15:02,804][60143] Updated weights for policy 0, policy_version 74812 (0.0010) +[2023-10-09 07:15:05,098][60144] Updated weights for policy 1, policy_version 75652 (0.0008) +[2023-10-09 07:15:05,497][60144] Updated weights for policy 1, policy_version 75662 (0.0008) +[2023-10-09 07:15:05,879][60144] Updated weights for policy 1, policy_version 75672 (0.0008) +[2023-10-09 07:15:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 154075136. Throughput: 0: 1685.9, 1: 1730.7. Samples: 38529098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:06,053][59242] Avg episode reward: [(0, '31.390'), (1, '34.190')] +[2023-10-09 07:15:06,776][60143] Updated weights for policy 0, policy_version 74822 (0.0009) +[2023-10-09 07:15:07,153][60143] Updated weights for policy 0, policy_version 74832 (0.0011) +[2023-10-09 07:15:07,518][60143] Updated weights for policy 0, policy_version 74842 (0.0011) +[2023-10-09 07:15:09,724][60144] Updated weights for policy 1, policy_version 75682 (0.0009) +[2023-10-09 07:15:10,094][60144] Updated weights for policy 1, policy_version 75692 (0.0007) +[2023-10-09 07:15:10,471][60144] Updated weights for policy 1, policy_version 75702 (0.0008) +[2023-10-09 07:15:10,834][60144] Updated weights for policy 1, policy_version 75712 (0.0010) +[2023-10-09 07:15:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 154173440. Throughput: 0: 1714.8, 1: 1717.1. Samples: 38549840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:11,052][59242] Avg episode reward: [(0, '31.830'), (1, '35.250')] +[2023-10-09 07:15:11,708][60143] Updated weights for policy 0, policy_version 74852 (0.0008) +[2023-10-09 07:15:12,096][60143] Updated weights for policy 0, policy_version 74862 (0.0009) +[2023-10-09 07:15:12,468][60143] Updated weights for policy 0, policy_version 74872 (0.0008) +[2023-10-09 07:15:14,782][60144] Updated weights for policy 1, policy_version 75722 (0.0008) +[2023-10-09 07:15:15,150][60144] Updated weights for policy 1, policy_version 75732 (0.0008) +[2023-10-09 07:15:15,519][60144] Updated weights for policy 1, policy_version 75742 (0.0011) +[2023-10-09 07:15:16,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 154238976. Throughput: 0: 1711.8, 1: 1691.2. Samples: 38569644. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:16,052][59242] Avg episode reward: [(0, '31.290'), (1, '36.150')] +[2023-10-09 07:15:16,353][60143] Updated weights for policy 0, policy_version 74882 (0.0007) +[2023-10-09 07:15:16,722][60143] Updated weights for policy 0, policy_version 74892 (0.0007) +[2023-10-09 07:15:17,097][60143] Updated weights for policy 0, policy_version 74902 (0.0010) +[2023-10-09 07:15:17,463][60143] Updated weights for policy 0, policy_version 74912 (0.0009) +[2023-10-09 07:15:19,373][60144] Updated weights for policy 1, policy_version 75752 (0.0010) +[2023-10-09 07:15:19,735][60144] Updated weights for policy 1, policy_version 75762 (0.0008) +[2023-10-09 07:15:20,102][60144] Updated weights for policy 1, policy_version 75772 (0.0008) +[2023-10-09 07:15:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 154304512. Throughput: 0: 1699.0, 1: 1726.0. Samples: 38580502. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:21,053][59242] Avg episode reward: [(0, '32.480'), (1, '37.620')] +[2023-10-09 07:15:21,236][60143] Updated weights for policy 0, policy_version 74922 (0.0008) +[2023-10-09 07:15:21,605][60143] Updated weights for policy 0, policy_version 74932 (0.0008) +[2023-10-09 07:15:21,974][60143] Updated weights for policy 0, policy_version 74942 (0.0010) +[2023-10-09 07:15:23,983][60144] Updated weights for policy 1, policy_version 75782 (0.0009) +[2023-10-09 07:15:24,353][60144] Updated weights for policy 1, policy_version 75792 (0.0007) +[2023-10-09 07:15:24,731][60144] Updated weights for policy 1, policy_version 75802 (0.0010) +[2023-10-09 07:15:25,949][60143] Updated weights for policy 0, policy_version 74952 (0.0007) +[2023-10-09 07:15:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 154370048. Throughput: 0: 1714.3, 1: 1710.5. Samples: 38601002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:26,053][59242] Avg episode reward: [(0, '33.110'), (1, '36.160')] +[2023-10-09 07:15:26,320][60143] Updated weights for policy 0, policy_version 74962 (0.0008) +[2023-10-09 07:15:26,688][60143] Updated weights for policy 0, policy_version 74972 (0.0009) +[2023-10-09 07:15:28,744][60144] Updated weights for policy 1, policy_version 75812 (0.0008) +[2023-10-09 07:15:29,106][60144] Updated weights for policy 1, policy_version 75822 (0.0010) +[2023-10-09 07:15:29,467][60144] Updated weights for policy 1, policy_version 75832 (0.0010) +[2023-10-09 07:15:30,618][60143] Updated weights for policy 0, policy_version 74982 (0.0009) +[2023-10-09 07:15:30,983][60143] Updated weights for policy 0, policy_version 74992 (0.0009) +[2023-10-09 07:15:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 154435584. Throughput: 0: 1717.1, 1: 1694.6. Samples: 38621680. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:31,054][59242] Avg episode reward: [(0, '33.020'), (1, '36.260')] +[2023-10-09 07:15:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000075840_77660160.pth... +[2023-10-09 07:15:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000074208_75988992.pth +[2023-10-09 07:15:31,354][60143] Updated weights for policy 0, policy_version 75002 (0.0009) +[2023-10-09 07:15:31,572][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000075008_76808192.pth... +[2023-10-09 07:15:31,606][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000073408_75169792.pth +[2023-10-09 07:15:33,454][60144] Updated weights for policy 1, policy_version 75842 (0.0010) +[2023-10-09 07:15:33,830][60144] Updated weights for policy 1, policy_version 75852 (0.0009) +[2023-10-09 07:15:34,198][60144] Updated weights for policy 1, policy_version 75862 (0.0009) +[2023-10-09 07:15:34,562][60144] Updated weights for policy 1, policy_version 75872 (0.0008) +[2023-10-09 07:15:35,587][60143] Updated weights for policy 0, policy_version 75012 (0.0009) +[2023-10-09 07:15:35,951][60143] Updated weights for policy 0, policy_version 75022 (0.0008) +[2023-10-09 07:15:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 154501120. Throughput: 0: 1715.9, 1: 1725.3. Samples: 38632268. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:15:36,053][59242] Avg episode reward: [(0, '34.270'), (1, '35.560')] +[2023-10-09 07:15:36,318][60143] Updated weights for policy 0, policy_version 75032 (0.0008) +[2023-10-09 07:15:38,543][60144] Updated weights for policy 1, policy_version 75882 (0.0010) +[2023-10-09 07:15:38,913][60144] Updated weights for policy 1, policy_version 75892 (0.0007) +[2023-10-09 07:15:39,277][60144] Updated weights for policy 1, policy_version 75902 (0.0007) +[2023-10-09 07:15:40,308][60143] Updated weights for policy 0, policy_version 75042 (0.0008) +[2023-10-09 07:15:40,677][60143] Updated weights for policy 0, policy_version 75052 (0.0011) +[2023-10-09 07:15:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 154566656. Throughput: 0: 1720.7, 1: 1702.6. Samples: 38652476. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:15:41,053][59242] Avg episode reward: [(0, '35.270'), (1, '35.930')] +[2023-10-09 07:15:41,053][60143] Updated weights for policy 0, policy_version 75062 (0.0009) +[2023-10-09 07:15:41,426][60143] Updated weights for policy 0, policy_version 75072 (0.0007) +[2023-10-09 07:15:43,212][60144] Updated weights for policy 1, policy_version 75912 (0.0007) +[2023-10-09 07:15:43,574][60144] Updated weights for policy 1, policy_version 75922 (0.0010) +[2023-10-09 07:15:43,940][60144] Updated weights for policy 1, policy_version 75932 (0.0010) +[2023-10-09 07:15:45,327][60143] Updated weights for policy 0, policy_version 75082 (0.0009) +[2023-10-09 07:15:45,695][60143] Updated weights for policy 0, policy_version 75092 (0.0009) +[2023-10-09 07:15:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 154632192. Throughput: 0: 1698.9, 1: 1717.5. Samples: 38673020. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:15:46,053][59242] Avg episode reward: [(0, '35.280'), (1, '35.400')] +[2023-10-09 07:15:46,060][60143] Updated weights for policy 0, policy_version 75102 (0.0010) +[2023-10-09 07:15:47,916][60144] Updated weights for policy 1, policy_version 75942 (0.0007) +[2023-10-09 07:15:48,282][60144] Updated weights for policy 1, policy_version 75952 (0.0008) +[2023-10-09 07:15:48,642][60144] Updated weights for policy 1, policy_version 75962 (0.0008) +[2023-10-09 07:15:50,138][60143] Updated weights for policy 0, policy_version 75112 (0.0008) +[2023-10-09 07:15:50,511][60143] Updated weights for policy 0, policy_version 75122 (0.0007) +[2023-10-09 07:15:50,888][60143] Updated weights for policy 0, policy_version 75132 (0.0009) +[2023-10-09 07:15:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 154730496. Throughput: 0: 1714.3, 1: 1711.4. Samples: 38683256. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:15:51,053][59242] Avg episode reward: [(0, '33.800'), (1, '34.430')] +[2023-10-09 07:15:52,576][60144] Updated weights for policy 1, policy_version 75972 (0.0009) +[2023-10-09 07:15:52,969][60144] Updated weights for policy 1, policy_version 75982 (0.0011) +[2023-10-09 07:15:53,341][60144] Updated weights for policy 1, policy_version 75992 (0.0011) +[2023-10-09 07:15:54,730][60143] Updated weights for policy 0, policy_version 75142 (0.0009) +[2023-10-09 07:15:55,094][60143] Updated weights for policy 0, policy_version 75152 (0.0008) +[2023-10-09 07:15:55,456][60143] Updated weights for policy 0, policy_version 75162 (0.0010) +[2023-10-09 07:15:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 154796032. Throughput: 0: 1717.7, 1: 1707.5. Samples: 38703976. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:15:56,053][59242] Avg episode reward: [(0, '36.050'), (1, '33.930')] +[2023-10-09 07:15:57,204][60144] Updated weights for policy 1, policy_version 76002 (0.0009) +[2023-10-09 07:15:57,575][60144] Updated weights for policy 1, policy_version 76012 (0.0009) +[2023-10-09 07:15:57,931][60144] Updated weights for policy 1, policy_version 76022 (0.0008) +[2023-10-09 07:15:58,299][60144] Updated weights for policy 1, policy_version 76032 (0.0009) +[2023-10-09 07:15:59,616][60143] Updated weights for policy 0, policy_version 75172 (0.0007) +[2023-10-09 07:15:59,985][60143] Updated weights for policy 0, policy_version 75182 (0.0007) +[2023-10-09 07:16:00,355][60143] Updated weights for policy 0, policy_version 75192 (0.0008) +[2023-10-09 07:16:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 154861568. Throughput: 0: 1694.4, 1: 1740.5. Samples: 38724216. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:16:01,053][59242] Avg episode reward: [(0, '36.720'), (1, '34.290')] +[2023-10-09 07:16:02,255][60144] Updated weights for policy 1, policy_version 76042 (0.0010) +[2023-10-09 07:16:02,633][60144] Updated weights for policy 1, policy_version 76052 (0.0010) +[2023-10-09 07:16:02,998][60144] Updated weights for policy 1, policy_version 76062 (0.0009) +[2023-10-09 07:16:04,184][60143] Updated weights for policy 0, policy_version 75202 (0.0008) +[2023-10-09 07:16:04,539][60143] Updated weights for policy 0, policy_version 75212 (0.0007) +[2023-10-09 07:16:04,913][60143] Updated weights for policy 0, policy_version 75222 (0.0009) +[2023-10-09 07:16:05,274][60143] Updated weights for policy 0, policy_version 75232 (0.0011) +[2023-10-09 07:16:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 154927104. Throughput: 0: 1716.4, 1: 1709.5. Samples: 38734666. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:16:06,053][59242] Avg episode reward: [(0, '34.720'), (1, '33.110')] +[2023-10-09 07:16:06,873][60144] Updated weights for policy 1, policy_version 76072 (0.0008) +[2023-10-09 07:16:07,223][60144] Updated weights for policy 1, policy_version 76082 (0.0007) +[2023-10-09 07:16:07,593][60144] Updated weights for policy 1, policy_version 76092 (0.0007) +[2023-10-09 07:16:09,177][60143] Updated weights for policy 0, policy_version 75242 (0.0008) +[2023-10-09 07:16:09,539][60143] Updated weights for policy 0, policy_version 75252 (0.0008) +[2023-10-09 07:16:09,914][60143] Updated weights for policy 0, policy_version 75262 (0.0008) +[2023-10-09 07:16:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 154992640. Throughput: 0: 1699.6, 1: 1730.9. Samples: 38755376. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:16:11,053][59242] Avg episode reward: [(0, '34.150'), (1, '32.920')] +[2023-10-09 07:16:11,713][60144] Updated weights for policy 1, policy_version 76102 (0.0009) +[2023-10-09 07:16:12,075][60144] Updated weights for policy 1, policy_version 76112 (0.0010) +[2023-10-09 07:16:12,452][60144] Updated weights for policy 1, policy_version 76122 (0.0008) +[2023-10-09 07:16:13,883][60143] Updated weights for policy 0, policy_version 75272 (0.0010) +[2023-10-09 07:16:14,249][60143] Updated weights for policy 0, policy_version 75282 (0.0008) +[2023-10-09 07:16:14,621][60143] Updated weights for policy 0, policy_version 75292 (0.0007) +[2023-10-09 07:16:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 155058176. Throughput: 0: 1687.6, 1: 1743.1. Samples: 38776058. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:16:16,053][59242] Avg episode reward: [(0, '34.790'), (1, '32.360')] +[2023-10-09 07:16:16,369][60144] Updated weights for policy 1, policy_version 76132 (0.0008) +[2023-10-09 07:16:16,743][60144] Updated weights for policy 1, policy_version 76142 (0.0007) +[2023-10-09 07:16:17,105][60144] Updated weights for policy 1, policy_version 76152 (0.0008) +[2023-10-09 07:16:18,597][60143] Updated weights for policy 0, policy_version 75302 (0.0008) +[2023-10-09 07:16:18,967][60143] Updated weights for policy 0, policy_version 75312 (0.0009) +[2023-10-09 07:16:19,338][60143] Updated weights for policy 0, policy_version 75322 (0.0009) +[2023-10-09 07:16:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 155123712. Throughput: 0: 1712.1, 1: 1712.6. Samples: 38786378. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-09 07:16:21,053][59242] Avg episode reward: [(0, '34.790'), (1, '32.570')] +[2023-10-09 07:16:21,201][60144] Updated weights for policy 1, policy_version 76162 (0.0007) +[2023-10-09 07:16:21,575][60144] Updated weights for policy 1, policy_version 76172 (0.0007) +[2023-10-09 07:16:21,939][60144] Updated weights for policy 1, policy_version 76182 (0.0008) +[2023-10-09 07:16:22,305][60144] Updated weights for policy 1, policy_version 76192 (0.0009) +[2023-10-09 07:16:23,529][60143] Updated weights for policy 0, policy_version 75332 (0.0010) +[2023-10-09 07:16:23,904][60143] Updated weights for policy 0, policy_version 75342 (0.0008) +[2023-10-09 07:16:24,279][60143] Updated weights for policy 0, policy_version 75352 (0.0008) +[2023-10-09 07:16:26,050][60144] Updated weights for policy 1, policy_version 76202 (0.0007) +[2023-10-09 07:16:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 155189248. Throughput: 0: 1686.7, 1: 1736.1. Samples: 38806502. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:26,053][59242] Avg episode reward: [(0, '34.650'), (1, '33.200')] +[2023-10-09 07:16:26,407][60144] Updated weights for policy 1, policy_version 76212 (0.0008) +[2023-10-09 07:16:26,774][60144] Updated weights for policy 1, policy_version 76222 (0.0007) +[2023-10-09 07:16:28,444][60143] Updated weights for policy 0, policy_version 75362 (0.0009) +[2023-10-09 07:16:28,817][60143] Updated weights for policy 0, policy_version 75372 (0.0007) +[2023-10-09 07:16:29,184][60143] Updated weights for policy 0, policy_version 75382 (0.0007) +[2023-10-09 07:16:29,553][60143] Updated weights for policy 0, policy_version 75392 (0.0007) +[2023-10-09 07:16:30,593][60144] Updated weights for policy 1, policy_version 76232 (0.0009) +[2023-10-09 07:16:30,961][60144] Updated weights for policy 1, policy_version 76242 (0.0010) +[2023-10-09 07:16:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 155254784. Throughput: 0: 1699.3, 1: 1733.8. Samples: 38827512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:31,053][59242] Avg episode reward: [(0, '36.490'), (1, '34.380')] +[2023-10-09 07:16:31,334][60144] Updated weights for policy 1, policy_version 76252 (0.0008) +[2023-10-09 07:16:33,414][60143] Updated weights for policy 0, policy_version 75402 (0.0011) +[2023-10-09 07:16:33,779][60143] Updated weights for policy 0, policy_version 75412 (0.0010) +[2023-10-09 07:16:34,141][60143] Updated weights for policy 0, policy_version 75422 (0.0009) +[2023-10-09 07:16:35,271][60144] Updated weights for policy 1, policy_version 76262 (0.0007) +[2023-10-09 07:16:35,631][60144] Updated weights for policy 1, policy_version 76272 (0.0008) +[2023-10-09 07:16:35,995][60144] Updated weights for policy 1, policy_version 76282 (0.0009) +[2023-10-09 07:16:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 155320320. 
Throughput: 0: 1704.8, 1: 1733.1. Samples: 38837960. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:36,052][59242] Avg episode reward: [(0, '36.050'), (1, '33.290')] +[2023-10-09 07:16:38,115][60143] Updated weights for policy 0, policy_version 75432 (0.0010) +[2023-10-09 07:16:38,497][60143] Updated weights for policy 0, policy_version 75442 (0.0010) +[2023-10-09 07:16:38,868][60143] Updated weights for policy 0, policy_version 75452 (0.0010) +[2023-10-09 07:16:40,040][60144] Updated weights for policy 1, policy_version 76292 (0.0008) +[2023-10-09 07:16:40,451][60144] Updated weights for policy 1, policy_version 76302 (0.0011) +[2023-10-09 07:16:40,821][60144] Updated weights for policy 1, policy_version 76312 (0.0010) +[2023-10-09 07:16:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155385856. Throughput: 0: 1679.2, 1: 1752.6. Samples: 38858408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:41,053][59242] Avg episode reward: [(0, '35.550'), (1, '33.840')] +[2023-10-09 07:16:42,728][60143] Updated weights for policy 0, policy_version 75462 (0.0008) +[2023-10-09 07:16:43,098][60143] Updated weights for policy 0, policy_version 75472 (0.0008) +[2023-10-09 07:16:43,469][60143] Updated weights for policy 0, policy_version 75482 (0.0010) +[2023-10-09 07:16:44,696][60144] Updated weights for policy 1, policy_version 76322 (0.0010) +[2023-10-09 07:16:45,063][60144] Updated weights for policy 1, policy_version 76332 (0.0008) +[2023-10-09 07:16:45,426][60144] Updated weights for policy 1, policy_version 76342 (0.0008) +[2023-10-09 07:16:45,795][60144] Updated weights for policy 1, policy_version 76352 (0.0008) +[2023-10-09 07:16:46,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 155484160. Throughput: 0: 1708.9, 1: 1725.9. Samples: 38878782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:46,053][59242] Avg episode reward: [(0, '34.550'), (1, '32.230')] +[2023-10-09 07:16:47,456][60143] Updated weights for policy 0, policy_version 75492 (0.0010) +[2023-10-09 07:16:47,847][60143] Updated weights for policy 0, policy_version 75502 (0.0011) +[2023-10-09 07:16:48,220][60143] Updated weights for policy 0, policy_version 75512 (0.0010) +[2023-10-09 07:16:49,683][60144] Updated weights for policy 1, policy_version 76362 (0.0008) +[2023-10-09 07:16:50,048][60144] Updated weights for policy 1, policy_version 76372 (0.0007) +[2023-10-09 07:16:50,414][60144] Updated weights for policy 1, policy_version 76382 (0.0007) +[2023-10-09 07:16:51,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 155549696. Throughput: 0: 1686.8, 1: 1748.0. Samples: 38889232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:51,053][59242] Avg episode reward: [(0, '35.940'), (1, '32.310')] +[2023-10-09 07:16:52,328][60143] Updated weights for policy 0, policy_version 75522 (0.0010) +[2023-10-09 07:16:52,693][60143] Updated weights for policy 0, policy_version 75532 (0.0008) +[2023-10-09 07:16:53,055][60143] Updated weights for policy 0, policy_version 75542 (0.0008) +[2023-10-09 07:16:53,435][60143] Updated weights for policy 0, policy_version 75552 (0.0010) +[2023-10-09 07:16:54,207][60144] Updated weights for policy 1, policy_version 76392 (0.0011) +[2023-10-09 07:16:54,581][60144] Updated weights for policy 1, policy_version 76402 (0.0010) +[2023-10-09 07:16:54,953][60144] Updated weights for policy 1, policy_version 76412 (0.0007) +[2023-10-09 07:16:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 155615232. Throughput: 0: 1698.7, 1: 1728.7. Samples: 38909610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:16:56,053][59242] Avg episode reward: [(0, '35.670'), (1, '32.130')] +[2023-10-09 07:16:57,400][60143] Updated weights for policy 0, policy_version 75562 (0.0007) +[2023-10-09 07:16:57,774][60143] Updated weights for policy 0, policy_version 75572 (0.0008) +[2023-10-09 07:16:58,142][60143] Updated weights for policy 0, policy_version 75582 (0.0007) +[2023-10-09 07:16:58,770][60144] Updated weights for policy 1, policy_version 76422 (0.0009) +[2023-10-09 07:16:59,134][60144] Updated weights for policy 1, policy_version 76432 (0.0008) +[2023-10-09 07:16:59,503][60144] Updated weights for policy 1, policy_version 76442 (0.0009) +[2023-10-09 07:17:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155680768. Throughput: 0: 1714.0, 1: 1718.0. Samples: 38930502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:01,053][59242] Avg episode reward: [(0, '35.160'), (1, '33.700')] +[2023-10-09 07:17:02,227][60143] Updated weights for policy 0, policy_version 75592 (0.0008) +[2023-10-09 07:17:02,593][60143] Updated weights for policy 0, policy_version 75602 (0.0010) +[2023-10-09 07:17:02,967][60143] Updated weights for policy 0, policy_version 75612 (0.0008) +[2023-10-09 07:17:03,316][60144] Updated weights for policy 1, policy_version 76452 (0.0009) +[2023-10-09 07:17:03,684][60144] Updated weights for policy 1, policy_version 76462 (0.0009) +[2023-10-09 07:17:04,058][60144] Updated weights for policy 1, policy_version 76472 (0.0009) +[2023-10-09 07:17:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155746304. Throughput: 0: 1683.0, 1: 1746.0. Samples: 38940686. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:06,053][59242] Avg episode reward: [(0, '35.960'), (1, '33.540')] +[2023-10-09 07:17:07,190][60143] Updated weights for policy 0, policy_version 75622 (0.0008) +[2023-10-09 07:17:07,561][60143] Updated weights for policy 0, policy_version 75632 (0.0009) +[2023-10-09 07:17:07,946][60143] Updated weights for policy 0, policy_version 75642 (0.0009) +[2023-10-09 07:17:08,048][60144] Updated weights for policy 1, policy_version 76482 (0.0008) +[2023-10-09 07:17:08,407][60144] Updated weights for policy 1, policy_version 76492 (0.0009) +[2023-10-09 07:17:08,778][60144] Updated weights for policy 1, policy_version 76502 (0.0007) +[2023-10-09 07:17:09,137][60144] Updated weights for policy 1, policy_version 76512 (0.0011) +[2023-10-09 07:17:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155811840. Throughput: 0: 1704.6, 1: 1725.9. Samples: 38960874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:11,052][59242] Avg episode reward: [(0, '35.560'), (1, '33.180')] +[2023-10-09 07:17:11,797][60143] Updated weights for policy 0, policy_version 75652 (0.0009) +[2023-10-09 07:17:12,163][60143] Updated weights for policy 0, policy_version 75662 (0.0008) +[2023-10-09 07:17:12,529][60143] Updated weights for policy 0, policy_version 75672 (0.0007) +[2023-10-09 07:17:13,096][60144] Updated weights for policy 1, policy_version 76522 (0.0007) +[2023-10-09 07:17:13,469][60144] Updated weights for policy 1, policy_version 76532 (0.0009) +[2023-10-09 07:17:13,832][60144] Updated weights for policy 1, policy_version 76542 (0.0010) +[2023-10-09 07:17:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155877376. Throughput: 0: 1714.9, 1: 1727.5. Samples: 38982420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:16,053][59242] Avg episode reward: [(0, '33.820'), (1, '31.920')] +[2023-10-09 07:17:16,495][60143] Updated weights for policy 0, policy_version 75682 (0.0009) +[2023-10-09 07:17:16,871][60143] Updated weights for policy 0, policy_version 75692 (0.0010) +[2023-10-09 07:17:17,246][60143] Updated weights for policy 0, policy_version 75702 (0.0009) +[2023-10-09 07:17:17,604][60143] Updated weights for policy 0, policy_version 75712 (0.0008) +[2023-10-09 07:17:17,810][60144] Updated weights for policy 1, policy_version 76552 (0.0008) +[2023-10-09 07:17:18,171][60144] Updated weights for policy 1, policy_version 76562 (0.0008) +[2023-10-09 07:17:18,548][60144] Updated weights for policy 1, policy_version 76572 (0.0009) +[2023-10-09 07:17:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 155942912. Throughput: 0: 1697.6, 1: 1724.4. Samples: 38991952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:21,052][59242] Avg episode reward: [(0, '34.170'), (1, '31.830')] +[2023-10-09 07:17:21,658][60143] Updated weights for policy 0, policy_version 75722 (0.0009) +[2023-10-09 07:17:22,033][60143] Updated weights for policy 0, policy_version 75732 (0.0008) +[2023-10-09 07:17:22,311][60144] Updated weights for policy 1, policy_version 76582 (0.0008) +[2023-10-09 07:17:22,399][60143] Updated weights for policy 0, policy_version 75742 (0.0007) +[2023-10-09 07:17:22,679][60144] Updated weights for policy 1, policy_version 76592 (0.0009) +[2023-10-09 07:17:23,052][60144] Updated weights for policy 1, policy_version 76602 (0.0007) +[2023-10-09 07:17:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 156008448. Throughput: 0: 1718.2, 1: 1719.0. Samples: 39013084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:26,053][59242] Avg episode reward: [(0, '33.860'), (1, '31.210')] +[2023-10-09 07:17:26,210][60143] Updated weights for policy 0, policy_version 75752 (0.0008) +[2023-10-09 07:17:26,577][60143] Updated weights for policy 0, policy_version 75762 (0.0009) +[2023-10-09 07:17:26,953][60143] Updated weights for policy 0, policy_version 75772 (0.0010) +[2023-10-09 07:17:27,168][60144] Updated weights for policy 1, policy_version 76612 (0.0009) +[2023-10-09 07:17:27,534][60144] Updated weights for policy 1, policy_version 76622 (0.0008) +[2023-10-09 07:17:27,900][60144] Updated weights for policy 1, policy_version 76632 (0.0008) +[2023-10-09 07:17:30,878][60143] Updated weights for policy 0, policy_version 75782 (0.0009) +[2023-10-09 07:17:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 156073984. Throughput: 0: 1715.7, 1: 1738.5. Samples: 39034222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:31,053][59242] Avg episode reward: [(0, '34.660'), (1, '30.000')] +[2023-10-09 07:17:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000076640_78479360.pth... +[2023-10-09 07:17:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000075040_76840960.pth +[2023-10-09 07:17:31,248][60143] Updated weights for policy 0, policy_version 75792 (0.0007) +[2023-10-09 07:17:31,620][60143] Updated weights for policy 0, policy_version 75802 (0.0009) +[2023-10-09 07:17:31,832][60144] Updated weights for policy 1, policy_version 76642 (0.0007) +[2023-10-09 07:17:31,837][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000075808_77627392.pth... 
+[2023-10-09 07:17:31,865][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000074208_75988992.pth +[2023-10-09 07:17:32,201][60144] Updated weights for policy 1, policy_version 76652 (0.0007) +[2023-10-09 07:17:32,551][60144] Updated weights for policy 1, policy_version 76662 (0.0008) +[2023-10-09 07:17:32,916][60144] Updated weights for policy 1, policy_version 76672 (0.0007) +[2023-10-09 07:17:35,671][60143] Updated weights for policy 0, policy_version 75812 (0.0010) +[2023-10-09 07:17:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 156139520. Throughput: 0: 1715.0, 1: 1716.8. Samples: 39043660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:36,053][60143] Updated weights for policy 0, policy_version 75822 (0.0010) +[2023-10-09 07:17:36,053][59242] Avg episode reward: [(0, '34.140'), (1, '31.750')] +[2023-10-09 07:17:36,424][60143] Updated weights for policy 0, policy_version 75832 (0.0008) +[2023-10-09 07:17:36,740][60144] Updated weights for policy 1, policy_version 76682 (0.0009) +[2023-10-09 07:17:37,107][60144] Updated weights for policy 1, policy_version 76692 (0.0008) +[2023-10-09 07:17:37,466][60144] Updated weights for policy 1, policy_version 76702 (0.0008) +[2023-10-09 07:17:40,314][60143] Updated weights for policy 0, policy_version 75842 (0.0007) +[2023-10-09 07:17:40,676][60143] Updated weights for policy 0, policy_version 75852 (0.0009) +[2023-10-09 07:17:41,044][60143] Updated weights for policy 0, policy_version 75862 (0.0009) +[2023-10-09 07:17:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 156205056. Throughput: 0: 1713.6, 1: 1732.1. Samples: 39064668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:41,052][59242] Avg episode reward: [(0, '34.540'), (1, '32.590')] +[2023-10-09 07:17:41,409][60143] Updated weights for policy 0, policy_version 75872 (0.0009) +[2023-10-09 07:17:41,522][60144] Updated weights for policy 1, policy_version 76712 (0.0009) +[2023-10-09 07:17:41,896][60144] Updated weights for policy 1, policy_version 76722 (0.0008) +[2023-10-09 07:17:42,269][60144] Updated weights for policy 1, policy_version 76732 (0.0008) +[2023-10-09 07:17:45,451][60143] Updated weights for policy 0, policy_version 75882 (0.0010) +[2023-10-09 07:17:45,818][60143] Updated weights for policy 0, policy_version 75892 (0.0010) +[2023-10-09 07:17:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 156270592. Throughput: 0: 1699.2, 1: 1743.4. Samples: 39085420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:46,053][59242] Avg episode reward: [(0, '34.250'), (1, '34.370')] +[2023-10-09 07:17:46,164][60144] Updated weights for policy 1, policy_version 76742 (0.0009) +[2023-10-09 07:17:46,194][60143] Updated weights for policy 0, policy_version 75902 (0.0009) +[2023-10-09 07:17:46,536][60144] Updated weights for policy 1, policy_version 76752 (0.0010) +[2023-10-09 07:17:46,907][60144] Updated weights for policy 1, policy_version 76762 (0.0010) +[2023-10-09 07:17:50,285][60143] Updated weights for policy 0, policy_version 75912 (0.0008) +[2023-10-09 07:17:50,642][60143] Updated weights for policy 0, policy_version 75922 (0.0008) +[2023-10-09 07:17:50,811][60144] Updated weights for policy 1, policy_version 76772 (0.0008) +[2023-10-09 07:17:51,011][60143] Updated weights for policy 0, policy_version 75932 (0.0008) +[2023-10-09 07:17:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 156336128. Throughput: 0: 1716.8, 1: 1715.8. Samples: 39095154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:51,052][59242] Avg episode reward: [(0, '33.670'), (1, '34.290')] +[2023-10-09 07:17:51,180][60144] Updated weights for policy 1, policy_version 76782 (0.0009) +[2023-10-09 07:17:51,538][60144] Updated weights for policy 1, policy_version 76792 (0.0008) +[2023-10-09 07:17:55,036][60143] Updated weights for policy 0, policy_version 75942 (0.0010) +[2023-10-09 07:17:55,396][60143] Updated weights for policy 0, policy_version 75952 (0.0009) +[2023-10-09 07:17:55,405][60144] Updated weights for policy 1, policy_version 76802 (0.0007) +[2023-10-09 07:17:55,763][60143] Updated weights for policy 0, policy_version 75962 (0.0007) +[2023-10-09 07:17:55,773][60144] Updated weights for policy 1, policy_version 76812 (0.0008) +[2023-10-09 07:17:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 156434432. Throughput: 0: 1716.3, 1: 1745.0. Samples: 39116632. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:17:56,053][59242] Avg episode reward: [(0, '34.650'), (1, '33.250')] +[2023-10-09 07:17:56,138][60144] Updated weights for policy 1, policy_version 76822 (0.0007) +[2023-10-09 07:17:56,505][60144] Updated weights for policy 1, policy_version 76832 (0.0009) +[2023-10-09 07:17:59,763][60143] Updated weights for policy 0, policy_version 75972 (0.0008) +[2023-10-09 07:18:00,133][60143] Updated weights for policy 0, policy_version 75982 (0.0007) +[2023-10-09 07:18:00,424][60144] Updated weights for policy 1, policy_version 76842 (0.0008) +[2023-10-09 07:18:00,494][60143] Updated weights for policy 0, policy_version 75992 (0.0008) +[2023-10-09 07:18:00,786][60144] Updated weights for policy 1, policy_version 76852 (0.0008) +[2023-10-09 07:18:01,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 156499968. Throughput: 0: 1694.8, 1: 1727.8. Samples: 39136434. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:01,053][59242] Avg episode reward: [(0, '34.790'), (1, '33.110')] +[2023-10-09 07:18:01,151][60144] Updated weights for policy 1, policy_version 76862 (0.0007) +[2023-10-09 07:18:04,202][60143] Updated weights for policy 0, policy_version 76002 (0.0009) +[2023-10-09 07:18:04,579][60143] Updated weights for policy 0, policy_version 76012 (0.0009) +[2023-10-09 07:18:04,945][60143] Updated weights for policy 0, policy_version 76022 (0.0008) +[2023-10-09 07:18:05,096][60144] Updated weights for policy 1, policy_version 76872 (0.0008) +[2023-10-09 07:18:05,307][60143] Updated weights for policy 0, policy_version 76032 (0.0008) +[2023-10-09 07:18:05,464][60144] Updated weights for policy 1, policy_version 76882 (0.0008) +[2023-10-09 07:18:05,838][60144] Updated weights for policy 1, policy_version 76892 (0.0009) +[2023-10-09 07:18:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 156598272. Throughput: 0: 1720.3, 1: 1735.0. Samples: 39147438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:06,053][59242] Avg episode reward: [(0, '32.840'), (1, '32.730')] +[2023-10-09 07:18:09,305][60143] Updated weights for policy 0, policy_version 76042 (0.0008) +[2023-10-09 07:18:09,676][60143] Updated weights for policy 0, policy_version 76052 (0.0011) +[2023-10-09 07:18:09,834][60144] Updated weights for policy 1, policy_version 76902 (0.0009) +[2023-10-09 07:18:10,042][60143] Updated weights for policy 0, policy_version 76062 (0.0008) +[2023-10-09 07:18:10,192][60144] Updated weights for policy 1, policy_version 76912 (0.0010) +[2023-10-09 07:18:10,553][60144] Updated weights for policy 1, policy_version 76922 (0.0009) +[2023-10-09 07:18:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 156663808. Throughput: 0: 1706.9, 1: 1738.9. Samples: 39168140. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:11,052][59242] Avg episode reward: [(0, '33.000'), (1, '33.030')] +[2023-10-09 07:18:14,031][60143] Updated weights for policy 0, policy_version 76072 (0.0009) +[2023-10-09 07:18:14,393][60143] Updated weights for policy 0, policy_version 76082 (0.0008) +[2023-10-09 07:18:14,537][60144] Updated weights for policy 1, policy_version 76932 (0.0007) +[2023-10-09 07:18:14,765][60143] Updated weights for policy 0, policy_version 76092 (0.0008) +[2023-10-09 07:18:14,906][60144] Updated weights for policy 1, policy_version 76942 (0.0008) +[2023-10-09 07:18:15,270][60144] Updated weights for policy 1, policy_version 76952 (0.0009) +[2023-10-09 07:18:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 156729344. Throughput: 0: 1686.7, 1: 1711.1. Samples: 39187122. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:16,053][59242] Avg episode reward: [(0, '32.970'), (1, '34.490')] +[2023-10-09 07:18:18,814][60143] Updated weights for policy 0, policy_version 76102 (0.0007) +[2023-10-09 07:18:19,186][60143] Updated weights for policy 0, policy_version 76112 (0.0009) +[2023-10-09 07:18:19,373][60144] Updated weights for policy 1, policy_version 76962 (0.0008) +[2023-10-09 07:18:19,548][60143] Updated weights for policy 0, policy_version 76122 (0.0007) +[2023-10-09 07:18:19,739][60144] Updated weights for policy 1, policy_version 76972 (0.0007) +[2023-10-09 07:18:20,106][60144] Updated weights for policy 1, policy_version 76982 (0.0009) +[2023-10-09 07:18:20,465][60144] Updated weights for policy 1, policy_version 76992 (0.0010) +[2023-10-09 07:18:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 156794880. Throughput: 0: 1715.8, 1: 1739.6. Samples: 39199156. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:21,053][59242] Avg episode reward: [(0, '35.380'), (1, '33.070')] +[2023-10-09 07:18:23,540][60143] Updated weights for policy 0, policy_version 76132 (0.0008) +[2023-10-09 07:18:23,920][60143] Updated weights for policy 0, policy_version 76142 (0.0008) +[2023-10-09 07:18:24,288][60143] Updated weights for policy 0, policy_version 76152 (0.0009) +[2023-10-09 07:18:24,510][60144] Updated weights for policy 1, policy_version 77002 (0.0008) +[2023-10-09 07:18:24,876][60144] Updated weights for policy 1, policy_version 77012 (0.0010) +[2023-10-09 07:18:25,245][60144] Updated weights for policy 1, policy_version 77022 (0.0011) +[2023-10-09 07:18:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 156860416. Throughput: 0: 1695.6, 1: 1728.7. Samples: 39218758. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:26,053][59242] Avg episode reward: [(0, '34.730'), (1, '32.280')] +[2023-10-09 07:18:28,153][60143] Updated weights for policy 0, policy_version 76162 (0.0008) +[2023-10-09 07:18:28,526][60143] Updated weights for policy 0, policy_version 76172 (0.0009) +[2023-10-09 07:18:28,894][60143] Updated weights for policy 0, policy_version 76182 (0.0009) +[2023-10-09 07:18:29,150][60144] Updated weights for policy 1, policy_version 77032 (0.0009) +[2023-10-09 07:18:29,259][60143] Updated weights for policy 0, policy_version 76192 (0.0007) +[2023-10-09 07:18:29,509][60144] Updated weights for policy 1, policy_version 77042 (0.0010) +[2023-10-09 07:18:29,871][60144] Updated weights for policy 1, policy_version 77052 (0.0011) +[2023-10-09 07:18:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 156925952. Throughput: 0: 1707.6, 1: 1709.7. Samples: 39239198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:31,053][59242] Avg episode reward: [(0, '34.550'), (1, '31.220')] +[2023-10-09 07:18:33,370][60143] Updated weights for policy 0, policy_version 76202 (0.0007) +[2023-10-09 07:18:33,708][60144] Updated weights for policy 1, policy_version 77062 (0.0009) +[2023-10-09 07:18:33,739][60143] Updated weights for policy 0, policy_version 76212 (0.0007) +[2023-10-09 07:18:34,079][60144] Updated weights for policy 1, policy_version 77072 (0.0007) +[2023-10-09 07:18:34,111][60143] Updated weights for policy 0, policy_version 76222 (0.0008) +[2023-10-09 07:18:34,442][60144] Updated weights for policy 1, policy_version 77082 (0.0007) +[2023-10-09 07:18:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 156991488. Throughput: 0: 1708.3, 1: 1740.7. Samples: 39250364. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:36,053][59242] Avg episode reward: [(0, '34.560'), (1, '31.720')] +[2023-10-09 07:18:38,059][60143] Updated weights for policy 0, policy_version 76232 (0.0010) +[2023-10-09 07:18:38,427][60143] Updated weights for policy 0, policy_version 76242 (0.0009) +[2023-10-09 07:18:38,474][60144] Updated weights for policy 1, policy_version 77092 (0.0008) +[2023-10-09 07:18:38,791][60143] Updated weights for policy 0, policy_version 76252 (0.0009) +[2023-10-09 07:18:38,833][60144] Updated weights for policy 1, policy_version 77102 (0.0007) +[2023-10-09 07:18:39,206][60144] Updated weights for policy 1, policy_version 77112 (0.0008) +[2023-10-09 07:18:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 157057024. Throughput: 0: 1693.8, 1: 1698.8. Samples: 39269300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:41,053][59242] Avg episode reward: [(0, '35.330'), (1, '32.900')] +[2023-10-09 07:18:42,933][60143] Updated weights for policy 0, policy_version 76262 (0.0007) +[2023-10-09 07:18:43,171][60144] Updated weights for policy 1, policy_version 77122 (0.0008) +[2023-10-09 07:18:43,307][60143] Updated weights for policy 0, policy_version 76272 (0.0009) +[2023-10-09 07:18:43,527][60144] Updated weights for policy 1, policy_version 77132 (0.0007) +[2023-10-09 07:18:43,682][60143] Updated weights for policy 0, policy_version 76282 (0.0007) +[2023-10-09 07:18:43,894][60144] Updated weights for policy 1, policy_version 77142 (0.0008) +[2023-10-09 07:18:44,265][60144] Updated weights for policy 1, policy_version 77152 (0.0009) +[2023-10-09 07:18:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 157122560. Throughput: 0: 1710.5, 1: 1714.3. Samples: 39290552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:46,053][59242] Avg episode reward: [(0, '35.290'), (1, '32.810')] +[2023-10-09 07:18:47,572][60143] Updated weights for policy 0, policy_version 76292 (0.0009) +[2023-10-09 07:18:47,945][60143] Updated weights for policy 0, policy_version 76302 (0.0008) +[2023-10-09 07:18:48,288][60144] Updated weights for policy 1, policy_version 77162 (0.0007) +[2023-10-09 07:18:48,316][60143] Updated weights for policy 0, policy_version 76312 (0.0007) +[2023-10-09 07:18:48,652][60144] Updated weights for policy 1, policy_version 77172 (0.0008) +[2023-10-09 07:18:49,018][60144] Updated weights for policy 1, policy_version 77182 (0.0008) +[2023-10-09 07:18:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 157188096. Throughput: 0: 1689.3, 1: 1713.6. Samples: 39300566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:51,053][59242] Avg episode reward: [(0, '34.170'), (1, '32.770')] +[2023-10-09 07:18:52,373][60143] Updated weights for policy 0, policy_version 76322 (0.0009) +[2023-10-09 07:18:52,746][60143] Updated weights for policy 0, policy_version 76332 (0.0009) +[2023-10-09 07:18:53,067][60144] Updated weights for policy 1, policy_version 77192 (0.0009) +[2023-10-09 07:18:53,121][60143] Updated weights for policy 0, policy_version 76342 (0.0008) +[2023-10-09 07:18:53,435][60144] Updated weights for policy 1, policy_version 77202 (0.0008) +[2023-10-09 07:18:53,484][60143] Updated weights for policy 0, policy_version 76352 (0.0008) +[2023-10-09 07:18:53,791][60144] Updated weights for policy 1, policy_version 77212 (0.0007) +[2023-10-09 07:18:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 157253632. Throughput: 0: 1701.2, 1: 1696.1. Samples: 39321020. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:18:56,053][59242] Avg episode reward: [(0, '34.330'), (1, '34.200')] +[2023-10-09 07:18:57,492][60143] Updated weights for policy 0, policy_version 76362 (0.0008) +[2023-10-09 07:18:57,791][60144] Updated weights for policy 1, policy_version 77222 (0.0008) +[2023-10-09 07:18:57,864][60143] Updated weights for policy 0, policy_version 76372 (0.0007) +[2023-10-09 07:18:58,153][60144] Updated weights for policy 1, policy_version 77232 (0.0007) +[2023-10-09 07:18:58,226][60143] Updated weights for policy 0, policy_version 76382 (0.0009) +[2023-10-09 07:18:58,516][60144] Updated weights for policy 1, policy_version 77242 (0.0009) +[2023-10-09 07:19:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 157319168. Throughput: 0: 1724.1, 1: 1724.5. Samples: 39342310. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:01,052][59242] Avg episode reward: [(0, '35.770'), (1, '34.730')] +[2023-10-09 07:19:02,352][60143] Updated weights for policy 0, policy_version 76392 (0.0008) +[2023-10-09 07:19:02,542][60144] Updated weights for policy 1, policy_version 77252 (0.0008) +[2023-10-09 07:19:02,732][60143] Updated weights for policy 0, policy_version 76402 (0.0008) +[2023-10-09 07:19:02,944][60144] Updated weights for policy 1, policy_version 77262 (0.0007) +[2023-10-09 07:19:03,108][60143] Updated weights for policy 0, policy_version 76412 (0.0009) +[2023-10-09 07:19:03,318][60144] Updated weights for policy 1, policy_version 77272 (0.0009) +[2023-10-09 07:19:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157384704. Throughput: 0: 1691.7, 1: 1692.4. Samples: 39351440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:06,053][59242] Avg episode reward: [(0, '34.230'), (1, '34.800')] +[2023-10-09 07:19:06,916][60143] Updated weights for policy 0, policy_version 76422 (0.0008) +[2023-10-09 07:19:07,116][60144] Updated weights for policy 1, policy_version 77282 (0.0008) +[2023-10-09 07:19:07,281][60143] Updated weights for policy 0, policy_version 76432 (0.0007) +[2023-10-09 07:19:07,478][60144] Updated weights for policy 1, policy_version 77292 (0.0007) +[2023-10-09 07:19:07,652][60143] Updated weights for policy 0, policy_version 76442 (0.0008) +[2023-10-09 07:19:07,857][60144] Updated weights for policy 1, policy_version 77302 (0.0007) +[2023-10-09 07:19:08,216][60144] Updated weights for policy 1, policy_version 77312 (0.0011) +[2023-10-09 07:19:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157450240. Throughput: 0: 1715.2, 1: 1702.3. Samples: 39372544. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:11,053][59242] Avg episode reward: [(0, '33.950'), (1, '33.910')] +[2023-10-09 07:19:11,705][60143] Updated weights for policy 0, policy_version 76452 (0.0009) +[2023-10-09 07:19:12,098][60143] Updated weights for policy 0, policy_version 76462 (0.0011) +[2023-10-09 07:19:12,212][60144] Updated weights for policy 1, policy_version 77322 (0.0007) +[2023-10-09 07:19:12,464][60143] Updated weights for policy 0, policy_version 76472 (0.0007) +[2023-10-09 07:19:12,576][60144] Updated weights for policy 1, policy_version 77332 (0.0007) +[2023-10-09 07:19:12,947][60144] Updated weights for policy 1, policy_version 77342 (0.0007) +[2023-10-09 07:19:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157515776. Throughput: 0: 1710.6, 1: 1720.9. Samples: 39393618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:16,053][59242] Avg episode reward: [(0, '35.400'), (1, '33.170')] +[2023-10-09 07:19:16,256][60143] Updated weights for policy 0, policy_version 76482 (0.0008) +[2023-10-09 07:19:16,630][60143] Updated weights for policy 0, policy_version 76492 (0.0009) +[2023-10-09 07:19:16,884][60144] Updated weights for policy 1, policy_version 77352 (0.0007) +[2023-10-09 07:19:17,005][60143] Updated weights for policy 0, policy_version 76502 (0.0009) +[2023-10-09 07:19:17,247][60144] Updated weights for policy 1, policy_version 77362 (0.0007) +[2023-10-09 07:19:17,379][60143] Updated weights for policy 0, policy_version 76512 (0.0008) +[2023-10-09 07:19:17,609][60144] Updated weights for policy 1, policy_version 77372 (0.0009) +[2023-10-09 07:19:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157581312. Throughput: 0: 1698.9, 1: 1689.8. Samples: 39402852. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:21,053][59242] Avg episode reward: [(0, '35.660'), (1, '31.540')] +[2023-10-09 07:19:21,436][60143] Updated weights for policy 0, policy_version 76522 (0.0007) +[2023-10-09 07:19:21,491][60144] Updated weights for policy 1, policy_version 77382 (0.0008) +[2023-10-09 07:19:21,796][60143] Updated weights for policy 0, policy_version 76532 (0.0007) +[2023-10-09 07:19:21,850][60144] Updated weights for policy 1, policy_version 77392 (0.0007) +[2023-10-09 07:19:22,172][60143] Updated weights for policy 0, policy_version 76542 (0.0008) +[2023-10-09 07:19:22,217][60144] Updated weights for policy 1, policy_version 77402 (0.0007) +[2023-10-09 07:19:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157646848. Throughput: 0: 1710.2, 1: 1725.1. Samples: 39423888. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:26,053][59242] Avg episode reward: [(0, '35.350'), (1, '31.540')] +[2023-10-09 07:19:26,244][60143] Updated weights for policy 0, policy_version 76552 (0.0009) +[2023-10-09 07:19:26,277][60144] Updated weights for policy 1, policy_version 77412 (0.0007) +[2023-10-09 07:19:26,614][60143] Updated weights for policy 0, policy_version 76562 (0.0007) +[2023-10-09 07:19:26,643][60144] Updated weights for policy 1, policy_version 77422 (0.0009) +[2023-10-09 07:19:26,976][60143] Updated weights for policy 0, policy_version 76572 (0.0007) +[2023-10-09 07:19:27,000][60144] Updated weights for policy 1, policy_version 77432 (0.0007) +[2023-10-09 07:19:31,000][60144] Updated weights for policy 1, policy_version 77442 (0.0007) +[2023-10-09 07:19:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157712384. Throughput: 0: 1708.5, 1: 1721.2. Samples: 39444890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:31,053][59242] Avg episode reward: [(0, '35.220'), (1, '32.880')] +[2023-10-09 07:19:31,134][60143] Updated weights for policy 0, policy_version 76582 (0.0008) +[2023-10-09 07:19:31,360][60144] Updated weights for policy 1, policy_version 77452 (0.0008) +[2023-10-09 07:19:31,510][60143] Updated weights for policy 0, policy_version 76592 (0.0008) +[2023-10-09 07:19:31,726][60144] Updated weights for policy 1, policy_version 77462 (0.0007) +[2023-10-09 07:19:31,872][60143] Updated weights for policy 0, policy_version 76602 (0.0010) +[2023-10-09 07:19:32,086][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000076608_78446592.pth... +[2023-10-09 07:19:32,089][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000077472_79331328.pth... +[2023-10-09 07:19:32,095][60144] Updated weights for policy 1, policy_version 77472 (0.0007) +[2023-10-09 07:19:32,126][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000075008_76808192.pth +[2023-10-09 07:19:32,128][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000075840_77660160.pth +[2023-10-09 07:19:35,991][60143] Updated weights for policy 0, policy_version 76612 (0.0007) +[2023-10-09 07:19:36,024][60144] Updated weights for policy 1, policy_version 77482 (0.0008) +[2023-10-09 07:19:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157777920. Throughput: 0: 1700.6, 1: 1711.2. Samples: 39454096. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:36,053][59242] Avg episode reward: [(0, '34.660'), (1, '33.600')] +[2023-10-09 07:19:36,362][60143] Updated weights for policy 0, policy_version 76622 (0.0007) +[2023-10-09 07:19:36,389][60144] Updated weights for policy 1, policy_version 77492 (0.0008) +[2023-10-09 07:19:36,727][60143] Updated weights for policy 0, policy_version 76632 (0.0008) +[2023-10-09 07:19:36,755][60144] Updated weights for policy 1, policy_version 77502 (0.0007) +[2023-10-09 07:19:40,724][60143] Updated weights for policy 0, policy_version 76642 (0.0010) +[2023-10-09 07:19:40,800][60144] Updated weights for policy 1, policy_version 77512 (0.0007) +[2023-10-09 07:19:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157843456. Throughput: 0: 1704.4, 1: 1722.7. Samples: 39475238. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:41,053][59242] Avg episode reward: [(0, '35.590'), (1, '34.330')] +[2023-10-09 07:19:41,101][60143] Updated weights for policy 0, policy_version 76652 (0.0007) +[2023-10-09 07:19:41,162][60144] Updated weights for policy 1, policy_version 77522 (0.0010) +[2023-10-09 07:19:41,465][60143] Updated weights for policy 0, policy_version 76662 (0.0008) +[2023-10-09 07:19:41,517][60144] Updated weights for policy 1, policy_version 77532 (0.0009) +[2023-10-09 07:19:41,823][60143] Updated weights for policy 0, policy_version 76672 (0.0009) +[2023-10-09 07:19:45,619][60144] Updated weights for policy 1, policy_version 77542 (0.0008) +[2023-10-09 07:19:45,663][60143] Updated weights for policy 0, policy_version 76682 (0.0009) +[2023-10-09 07:19:45,978][60144] Updated weights for policy 1, policy_version 77552 (0.0008) +[2023-10-09 07:19:46,030][60143] Updated weights for policy 0, policy_version 76692 (0.0007) +[2023-10-09 07:19:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157908992. 
Throughput: 0: 1697.4, 1: 1722.7. Samples: 39496216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:46,053][59242] Avg episode reward: [(0, '33.700'), (1, '34.780')] +[2023-10-09 07:19:46,353][60144] Updated weights for policy 1, policy_version 77562 (0.0008) +[2023-10-09 07:19:46,395][60143] Updated weights for policy 0, policy_version 76702 (0.0008) +[2023-10-09 07:19:50,221][60143] Updated weights for policy 0, policy_version 76712 (0.0007) +[2023-10-09 07:19:50,417][60144] Updated weights for policy 1, policy_version 77572 (0.0009) +[2023-10-09 07:19:50,591][60143] Updated weights for policy 0, policy_version 76722 (0.0007) +[2023-10-09 07:19:50,805][60144] Updated weights for policy 1, policy_version 77582 (0.0008) +[2023-10-09 07:19:50,952][60143] Updated weights for policy 0, policy_version 76732 (0.0008) +[2023-10-09 07:19:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 157974528. Throughput: 0: 1710.0, 1: 1725.6. Samples: 39506044. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:51,052][59242] Avg episode reward: [(0, '32.300'), (1, '34.060')] +[2023-10-09 07:19:51,169][60144] Updated weights for policy 1, policy_version 77592 (0.0008) +[2023-10-09 07:19:55,020][60144] Updated weights for policy 1, policy_version 77602 (0.0011) +[2023-10-09 07:19:55,036][60143] Updated weights for policy 0, policy_version 76742 (0.0009) +[2023-10-09 07:19:55,384][60144] Updated weights for policy 1, policy_version 77612 (0.0008) +[2023-10-09 07:19:55,408][60143] Updated weights for policy 0, policy_version 76752 (0.0007) +[2023-10-09 07:19:55,747][60144] Updated weights for policy 1, policy_version 77622 (0.0007) +[2023-10-09 07:19:55,780][60143] Updated weights for policy 0, policy_version 76762 (0.0007) +[2023-10-09 07:19:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 158072832. Throughput: 0: 1712.7, 1: 1726.8. Samples: 39527318. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:19:56,053][59242] Avg episode reward: [(0, '31.930'), (1, '34.770')] +[2023-10-09 07:19:56,124][60144] Updated weights for policy 1, policy_version 77632 (0.0008) +[2023-10-09 07:19:59,777][60143] Updated weights for policy 0, policy_version 76772 (0.0007) +[2023-10-09 07:19:59,972][60144] Updated weights for policy 1, policy_version 77642 (0.0007) +[2023-10-09 07:20:00,159][60143] Updated weights for policy 0, policy_version 76782 (0.0008) +[2023-10-09 07:20:00,334][60144] Updated weights for policy 1, policy_version 77652 (0.0007) +[2023-10-09 07:20:00,529][60143] Updated weights for policy 0, policy_version 76792 (0.0009) +[2023-10-09 07:20:00,694][60144] Updated weights for policy 1, policy_version 77662 (0.0007) +[2023-10-09 07:20:01,052][59242] Fps is (10 sec: 19660.6, 60 sec: 14199.4, 300 sec: 13884.8). Total num frames: 158171136. Throughput: 0: 1692.0, 1: 1706.4. Samples: 39546546. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:01,053][59242] Avg episode reward: [(0, '32.530'), (1, '35.350')] +[2023-10-09 07:20:04,515][60144] Updated weights for policy 1, policy_version 77672 (0.0007) +[2023-10-09 07:20:04,543][60143] Updated weights for policy 0, policy_version 76802 (0.0007) +[2023-10-09 07:20:04,884][60144] Updated weights for policy 1, policy_version 77682 (0.0008) +[2023-10-09 07:20:04,905][60143] Updated weights for policy 0, policy_version 76812 (0.0007) +[2023-10-09 07:20:05,252][60144] Updated weights for policy 1, policy_version 77692 (0.0007) +[2023-10-09 07:20:05,275][60143] Updated weights for policy 0, policy_version 76822 (0.0008) +[2023-10-09 07:20:05,647][60143] Updated weights for policy 0, policy_version 76832 (0.0009) +[2023-10-09 07:20:06,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 158236672. Throughput: 0: 1712.0, 1: 1733.9. Samples: 39557918. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:06,053][59242] Avg episode reward: [(0, '32.060'), (1, '34.800')] +[2023-10-09 07:20:09,363][60144] Updated weights for policy 1, policy_version 77702 (0.0008) +[2023-10-09 07:20:09,587][60143] Updated weights for policy 0, policy_version 76842 (0.0009) +[2023-10-09 07:20:09,735][60144] Updated weights for policy 1, policy_version 77712 (0.0008) +[2023-10-09 07:20:09,956][60143] Updated weights for policy 0, policy_version 76852 (0.0008) +[2023-10-09 07:20:10,096][60144] Updated weights for policy 1, policy_version 77722 (0.0007) +[2023-10-09 07:20:10,327][60143] Updated weights for policy 0, policy_version 76862 (0.0009) +[2023-10-09 07:20:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 158302208. Throughput: 0: 1714.1, 1: 1719.7. Samples: 39578408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:11,053][59242] Avg episode reward: [(0, '32.310'), (1, '34.180')] +[2023-10-09 07:20:14,024][60144] Updated weights for policy 1, policy_version 77732 (0.0007) +[2023-10-09 07:20:14,211][60143] Updated weights for policy 0, policy_version 76872 (0.0007) +[2023-10-09 07:20:14,396][60144] Updated weights for policy 1, policy_version 77742 (0.0010) +[2023-10-09 07:20:14,575][60143] Updated weights for policy 0, policy_version 76882 (0.0007) +[2023-10-09 07:20:14,755][60144] Updated weights for policy 1, policy_version 77752 (0.0009) +[2023-10-09 07:20:14,941][60143] Updated weights for policy 0, policy_version 76892 (0.0008) +[2023-10-09 07:20:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 158367744. Throughput: 0: 1696.2, 1: 1703.6. Samples: 39597878. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:16,053][59242] Avg episode reward: [(0, '32.890'), (1, '33.820')] +[2023-10-09 07:20:18,787][60144] Updated weights for policy 1, policy_version 77762 (0.0009) +[2023-10-09 07:20:18,911][60143] Updated weights for policy 0, policy_version 76902 (0.0008) +[2023-10-09 07:20:19,151][60144] Updated weights for policy 1, policy_version 77772 (0.0008) +[2023-10-09 07:20:19,275][60143] Updated weights for policy 0, policy_version 76912 (0.0007) +[2023-10-09 07:20:19,518][60144] Updated weights for policy 1, policy_version 77782 (0.0009) +[2023-10-09 07:20:19,652][60143] Updated weights for policy 0, policy_version 76922 (0.0008) +[2023-10-09 07:20:19,877][60144] Updated weights for policy 1, policy_version 77792 (0.0007) +[2023-10-09 07:20:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 158433280. Throughput: 0: 1728.3, 1: 1730.6. Samples: 39609744. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:21,053][59242] Avg episode reward: [(0, '30.940'), (1, '33.560')] +[2023-10-09 07:20:23,655][60143] Updated weights for policy 0, policy_version 76932 (0.0008) +[2023-10-09 07:20:23,876][60144] Updated weights for policy 1, policy_version 77802 (0.0007) +[2023-10-09 07:20:24,018][60143] Updated weights for policy 0, policy_version 76942 (0.0008) +[2023-10-09 07:20:24,236][60144] Updated weights for policy 1, policy_version 77812 (0.0007) +[2023-10-09 07:20:24,389][60143] Updated weights for policy 0, policy_version 76952 (0.0008) +[2023-10-09 07:20:24,604][60144] Updated weights for policy 1, policy_version 77822 (0.0007) +[2023-10-09 07:20:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 158498816. Throughput: 0: 1702.2, 1: 1706.8. Samples: 39628644. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:26,053][59242] Avg episode reward: [(0, '30.750'), (1, '35.080')] +[2023-10-09 07:20:28,226][60143] Updated weights for policy 0, policy_version 76962 (0.0008) +[2023-10-09 07:20:28,606][60143] Updated weights for policy 0, policy_version 76972 (0.0010) +[2023-10-09 07:20:28,644][60144] Updated weights for policy 1, policy_version 77832 (0.0007) +[2023-10-09 07:20:28,966][60143] Updated weights for policy 0, policy_version 76982 (0.0008) +[2023-10-09 07:20:29,001][60144] Updated weights for policy 1, policy_version 77842 (0.0008) +[2023-10-09 07:20:29,332][60143] Updated weights for policy 0, policy_version 76992 (0.0008) +[2023-10-09 07:20:29,359][60144] Updated weights for policy 1, policy_version 77852 (0.0008) +[2023-10-09 07:20:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 158564352. Throughput: 0: 1704.7, 1: 1704.4. Samples: 39649628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:31,053][59242] Avg episode reward: [(0, '31.430'), (1, '35.590')] +[2023-10-09 07:20:33,193][60144] Updated weights for policy 1, policy_version 77862 (0.0009) +[2023-10-09 07:20:33,333][60143] Updated weights for policy 0, policy_version 77002 (0.0009) +[2023-10-09 07:20:33,556][60144] Updated weights for policy 1, policy_version 77872 (0.0009) +[2023-10-09 07:20:33,711][60143] Updated weights for policy 0, policy_version 77012 (0.0007) +[2023-10-09 07:20:33,927][60144] Updated weights for policy 1, policy_version 77882 (0.0007) +[2023-10-09 07:20:34,076][60143] Updated weights for policy 0, policy_version 77022 (0.0008) +[2023-10-09 07:20:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 158629888. Throughput: 0: 1709.6, 1: 1719.7. Samples: 39660362. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:36,053][59242] Avg episode reward: [(0, '31.850'), (1, '35.360')] +[2023-10-09 07:20:37,821][60144] Updated weights for policy 1, policy_version 77892 (0.0008) +[2023-10-09 07:20:38,089][60143] Updated weights for policy 0, policy_version 77032 (0.0009) +[2023-10-09 07:20:38,175][60144] Updated weights for policy 1, policy_version 77902 (0.0008) +[2023-10-09 07:20:38,465][60143] Updated weights for policy 0, policy_version 77042 (0.0008) +[2023-10-09 07:20:38,538][60144] Updated weights for policy 1, policy_version 77912 (0.0009) +[2023-10-09 07:20:38,847][60143] Updated weights for policy 0, policy_version 77052 (0.0009) +[2023-10-09 07:20:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 158695424. Throughput: 0: 1688.2, 1: 1706.2. Samples: 39680068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:41,052][59242] Avg episode reward: [(0, '31.100'), (1, '35.320')] +[2023-10-09 07:20:42,687][60144] Updated weights for policy 1, policy_version 77922 (0.0008) +[2023-10-09 07:20:42,866][60143] Updated weights for policy 0, policy_version 77062 (0.0010) +[2023-10-09 07:20:43,100][60144] Updated weights for policy 1, policy_version 77932 (0.0007) +[2023-10-09 07:20:43,242][60143] Updated weights for policy 0, policy_version 77072 (0.0009) +[2023-10-09 07:20:43,475][60144] Updated weights for policy 1, policy_version 77942 (0.0010) +[2023-10-09 07:20:43,613][60143] Updated weights for policy 0, policy_version 77082 (0.0007) +[2023-10-09 07:20:43,833][60144] Updated weights for policy 1, policy_version 77952 (0.0009) +[2023-10-09 07:20:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 158760960. Throughput: 0: 1715.6, 1: 1719.6. Samples: 39701130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:46,053][59242] Avg episode reward: [(0, '31.420'), (1, '34.410')] +[2023-10-09 07:20:47,640][60143] Updated weights for policy 0, policy_version 77092 (0.0009) +[2023-10-09 07:20:47,871][60144] Updated weights for policy 1, policy_version 77962 (0.0008) +[2023-10-09 07:20:48,029][60143] Updated weights for policy 0, policy_version 77102 (0.0009) +[2023-10-09 07:20:48,235][60144] Updated weights for policy 1, policy_version 77972 (0.0008) +[2023-10-09 07:20:48,395][60143] Updated weights for policy 0, policy_version 77112 (0.0009) +[2023-10-09 07:20:48,595][60144] Updated weights for policy 1, policy_version 77982 (0.0009) +[2023-10-09 07:20:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 158826496. Throughput: 0: 1697.6, 1: 1695.2. Samples: 39710592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:51,052][59242] Avg episode reward: [(0, '31.690'), (1, '34.410')] +[2023-10-09 07:20:52,452][60143] Updated weights for policy 0, policy_version 77122 (0.0008) +[2023-10-09 07:20:52,624][60144] Updated weights for policy 1, policy_version 77992 (0.0009) +[2023-10-09 07:20:52,817][60143] Updated weights for policy 0, policy_version 77132 (0.0009) +[2023-10-09 07:20:52,990][60144] Updated weights for policy 1, policy_version 78002 (0.0007) +[2023-10-09 07:20:53,188][60143] Updated weights for policy 0, policy_version 77142 (0.0010) +[2023-10-09 07:20:53,354][60144] Updated weights for policy 1, policy_version 78012 (0.0007) +[2023-10-09 07:20:53,555][60143] Updated weights for policy 0, policy_version 77152 (0.0008) +[2023-10-09 07:20:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 158892032. Throughput: 0: 1695.7, 1: 1703.5. Samples: 39731372. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:20:56,053][59242] Avg episode reward: [(0, '32.040'), (1, '36.440')] +[2023-10-09 07:20:57,241][60144] Updated weights for policy 1, policy_version 78022 (0.0007) +[2023-10-09 07:20:57,602][60144] Updated weights for policy 1, policy_version 78032 (0.0007) +[2023-10-09 07:20:57,732][60143] Updated weights for policy 0, policy_version 77162 (0.0009) +[2023-10-09 07:20:57,968][60144] Updated weights for policy 1, policy_version 78042 (0.0008) +[2023-10-09 07:20:58,106][60143] Updated weights for policy 0, policy_version 77172 (0.0008) +[2023-10-09 07:20:58,483][60143] Updated weights for policy 0, policy_version 77182 (0.0010) +[2023-10-09 07:21:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 158957568. Throughput: 0: 1711.2, 1: 1728.3. Samples: 39752656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:21:01,053][59242] Avg episode reward: [(0, '31.180'), (1, '35.530')] +[2023-10-09 07:21:01,771][60144] Updated weights for policy 1, policy_version 78052 (0.0007) +[2023-10-09 07:21:02,147][60144] Updated weights for policy 1, policy_version 78062 (0.0009) +[2023-10-09 07:21:02,520][60144] Updated weights for policy 1, policy_version 78072 (0.0007) +[2023-10-09 07:21:02,550][60143] Updated weights for policy 0, policy_version 77192 (0.0009) +[2023-10-09 07:21:02,922][60143] Updated weights for policy 0, policy_version 77202 (0.0008) +[2023-10-09 07:21:03,284][60143] Updated weights for policy 0, policy_version 77212 (0.0007) +[2023-10-09 07:21:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159023104. Throughput: 0: 1680.6, 1: 1701.4. Samples: 39761932. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:21:06,053][59242] Avg episode reward: [(0, '31.800'), (1, '35.080')] +[2023-10-09 07:21:06,348][60144] Updated weights for policy 1, policy_version 78082 (0.0007) +[2023-10-09 07:21:06,710][60144] Updated weights for policy 1, policy_version 78092 (0.0009) +[2023-10-09 07:21:07,090][60144] Updated weights for policy 1, policy_version 78102 (0.0009) +[2023-10-09 07:21:07,210][60143] Updated weights for policy 0, policy_version 77222 (0.0009) +[2023-10-09 07:21:07,462][60144] Updated weights for policy 1, policy_version 78112 (0.0007) +[2023-10-09 07:21:07,578][60143] Updated weights for policy 0, policy_version 77232 (0.0010) +[2023-10-09 07:21:07,957][60143] Updated weights for policy 0, policy_version 77242 (0.0011) +[2023-10-09 07:21:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159088640. Throughput: 0: 1703.7, 1: 1729.4. Samples: 39783132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:21:11,053][59242] Avg episode reward: [(0, '32.870'), (1, '34.950')] +[2023-10-09 07:21:11,561][60144] Updated weights for policy 1, policy_version 78122 (0.0008) +[2023-10-09 07:21:11,848][60143] Updated weights for policy 0, policy_version 77252 (0.0011) +[2023-10-09 07:21:11,930][60144] Updated weights for policy 1, policy_version 78132 (0.0007) +[2023-10-09 07:21:12,222][60143] Updated weights for policy 0, policy_version 77262 (0.0009) +[2023-10-09 07:21:12,294][60144] Updated weights for policy 1, policy_version 78142 (0.0010) +[2023-10-09 07:21:12,592][60143] Updated weights for policy 0, policy_version 77272 (0.0008) +[2023-10-09 07:21:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159154176. Throughput: 0: 1703.7, 1: 1733.0. Samples: 39804280. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:21:16,053][59242] Avg episode reward: [(0, '34.230'), (1, '33.530')] +[2023-10-09 07:21:16,486][60144] Updated weights for policy 1, policy_version 78152 (0.0009) +[2023-10-09 07:21:16,667][60143] Updated weights for policy 0, policy_version 77282 (0.0008) +[2023-10-09 07:21:16,852][60144] Updated weights for policy 1, policy_version 78162 (0.0008) +[2023-10-09 07:21:17,041][60143] Updated weights for policy 0, policy_version 77292 (0.0009) +[2023-10-09 07:21:17,214][60144] Updated weights for policy 1, policy_version 78172 (0.0007) +[2023-10-09 07:21:17,421][60143] Updated weights for policy 0, policy_version 77302 (0.0007) +[2023-10-09 07:21:17,792][60143] Updated weights for policy 0, policy_version 77312 (0.0008) +[2023-10-09 07:21:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159219712. Throughput: 0: 1685.4, 1: 1715.9. Samples: 39813420. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:21,053][59242] Avg episode reward: [(0, '34.170'), (1, '34.440')] +[2023-10-09 07:21:21,090][60144] Updated weights for policy 1, policy_version 78182 (0.0007) +[2023-10-09 07:21:21,449][60144] Updated weights for policy 1, policy_version 78192 (0.0007) +[2023-10-09 07:21:21,819][60144] Updated weights for policy 1, policy_version 78202 (0.0008) +[2023-10-09 07:21:21,839][60143] Updated weights for policy 0, policy_version 77322 (0.0008) +[2023-10-09 07:21:22,214][60143] Updated weights for policy 0, policy_version 77332 (0.0008) +[2023-10-09 07:21:22,585][60143] Updated weights for policy 0, policy_version 77342 (0.0007) +[2023-10-09 07:21:25,617][60144] Updated weights for policy 1, policy_version 78212 (0.0007) +[2023-10-09 07:21:25,986][60144] Updated weights for policy 1, policy_version 78222 (0.0007) +[2023-10-09 07:21:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159285248. 
Throughput: 0: 1700.7, 1: 1732.2. Samples: 39834548. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:26,052][59242] Avg episode reward: [(0, '34.070'), (1, '34.760')] +[2023-10-09 07:21:26,344][60144] Updated weights for policy 1, policy_version 78232 (0.0009) +[2023-10-09 07:21:26,588][60143] Updated weights for policy 0, policy_version 77352 (0.0008) +[2023-10-09 07:21:26,957][60143] Updated weights for policy 0, policy_version 77362 (0.0007) +[2023-10-09 07:21:27,333][60143] Updated weights for policy 0, policy_version 77372 (0.0008) +[2023-10-09 07:21:30,138][60144] Updated weights for policy 1, policy_version 78242 (0.0008) +[2023-10-09 07:21:30,547][60144] Updated weights for policy 1, policy_version 78252 (0.0007) +[2023-10-09 07:21:30,915][60144] Updated weights for policy 1, policy_version 78262 (0.0009) +[2023-10-09 07:21:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159350784. Throughput: 0: 1700.7, 1: 1729.1. Samples: 39855472. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:31,053][59242] Avg episode reward: [(0, '33.250'), (1, '33.990')] +[2023-10-09 07:21:31,277][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000078272_80150528.pth... +[2023-10-09 07:21:31,282][60144] Updated weights for policy 1, policy_version 78272 (0.0009) +[2023-10-09 07:21:31,306][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000076640_78479360.pth +[2023-10-09 07:21:31,325][60143] Updated weights for policy 0, policy_version 77382 (0.0008) +[2023-10-09 07:21:31,685][60143] Updated weights for policy 0, policy_version 77392 (0.0008) +[2023-10-09 07:21:32,049][60143] Updated weights for policy 0, policy_version 77402 (0.0009) +[2023-10-09 07:21:32,273][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000077408_79265792.pth... 
+[2023-10-09 07:21:32,302][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000075808_77627392.pth +[2023-10-09 07:21:35,319][60144] Updated weights for policy 1, policy_version 78282 (0.0008) +[2023-10-09 07:21:35,689][60144] Updated weights for policy 1, policy_version 78292 (0.0009) +[2023-10-09 07:21:36,049][60144] Updated weights for policy 1, policy_version 78302 (0.0009) +[2023-10-09 07:21:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 159416320. Throughput: 0: 1696.4, 1: 1740.0. Samples: 39865230. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:36,052][59242] Avg episode reward: [(0, '35.100'), (1, '33.450')] +[2023-10-09 07:21:36,191][60143] Updated weights for policy 0, policy_version 77412 (0.0009) +[2023-10-09 07:21:36,586][60143] Updated weights for policy 0, policy_version 77422 (0.0011) +[2023-10-09 07:21:36,954][60143] Updated weights for policy 0, policy_version 77432 (0.0011) +[2023-10-09 07:21:39,888][60144] Updated weights for policy 1, policy_version 78312 (0.0009) +[2023-10-09 07:21:40,251][60144] Updated weights for policy 1, policy_version 78322 (0.0010) +[2023-10-09 07:21:40,617][60144] Updated weights for policy 1, policy_version 78332 (0.0010) +[2023-10-09 07:21:40,821][60143] Updated weights for policy 0, policy_version 77442 (0.0007) +[2023-10-09 07:21:41,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 159514624. Throughput: 0: 1699.1, 1: 1744.9. Samples: 39886350. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:41,053][59242] Avg episode reward: [(0, '34.720'), (1, '33.940')] +[2023-10-09 07:21:41,187][60143] Updated weights for policy 0, policy_version 77452 (0.0008) +[2023-10-09 07:21:41,554][60143] Updated weights for policy 0, policy_version 77462 (0.0007) +[2023-10-09 07:21:41,925][60143] Updated weights for policy 0, policy_version 77472 (0.0010) +[2023-10-09 07:21:44,548][60144] Updated weights for policy 1, policy_version 78342 (0.0008) +[2023-10-09 07:21:44,921][60144] Updated weights for policy 1, policy_version 78352 (0.0009) +[2023-10-09 07:21:45,284][60144] Updated weights for policy 1, policy_version 78362 (0.0008) +[2023-10-09 07:21:45,836][60143] Updated weights for policy 0, policy_version 77482 (0.0009) +[2023-10-09 07:21:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 159580160. Throughput: 0: 1700.0, 1: 1710.8. Samples: 39906142. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:46,053][59242] Avg episode reward: [(0, '34.800'), (1, '33.790')] +[2023-10-09 07:21:46,200][60143] Updated weights for policy 0, policy_version 77492 (0.0007) +[2023-10-09 07:21:46,568][60143] Updated weights for policy 0, policy_version 77502 (0.0007) +[2023-10-09 07:21:49,226][60144] Updated weights for policy 1, policy_version 78372 (0.0008) +[2023-10-09 07:21:49,594][60144] Updated weights for policy 1, policy_version 78382 (0.0007) +[2023-10-09 07:21:49,958][60144] Updated weights for policy 1, policy_version 78392 (0.0009) +[2023-10-09 07:21:50,658][60143] Updated weights for policy 0, policy_version 77512 (0.0009) +[2023-10-09 07:21:51,021][60143] Updated weights for policy 0, policy_version 77522 (0.0010) +[2023-10-09 07:21:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 159645696. Throughput: 0: 1700.8, 1: 1737.9. Samples: 39916672. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:51,053][59242] Avg episode reward: [(0, '33.950'), (1, '34.600')] +[2023-10-09 07:21:51,390][60143] Updated weights for policy 0, policy_version 77532 (0.0010) +[2023-10-09 07:21:54,036][60144] Updated weights for policy 1, policy_version 78402 (0.0009) +[2023-10-09 07:21:54,407][60144] Updated weights for policy 1, policy_version 78412 (0.0010) +[2023-10-09 07:21:54,786][60144] Updated weights for policy 1, policy_version 78422 (0.0011) +[2023-10-09 07:21:55,144][60144] Updated weights for policy 1, policy_version 78432 (0.0010) +[2023-10-09 07:21:55,284][60143] Updated weights for policy 0, policy_version 77542 (0.0008) +[2023-10-09 07:21:55,655][60143] Updated weights for policy 0, policy_version 77552 (0.0008) +[2023-10-09 07:21:56,015][60143] Updated weights for policy 0, policy_version 77562 (0.0011) +[2023-10-09 07:21:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 159711232. Throughput: 0: 1702.5, 1: 1717.2. Samples: 39937018. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:21:56,053][59242] Avg episode reward: [(0, '33.730'), (1, '35.050')] +[2023-10-09 07:21:59,236][60144] Updated weights for policy 1, policy_version 78442 (0.0009) +[2023-10-09 07:21:59,610][60144] Updated weights for policy 1, policy_version 78452 (0.0008) +[2023-10-09 07:21:59,980][60144] Updated weights for policy 1, policy_version 78462 (0.0008) +[2023-10-09 07:22:00,048][60143] Updated weights for policy 0, policy_version 77572 (0.0008) +[2023-10-09 07:22:00,422][60143] Updated weights for policy 0, policy_version 77582 (0.0008) +[2023-10-09 07:22:00,795][60143] Updated weights for policy 0, policy_version 77592 (0.0009) +[2023-10-09 07:22:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 159776768. Throughput: 0: 1688.1, 1: 1700.3. Samples: 39956758. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:22:01,053][59242] Avg episode reward: [(0, '33.900'), (1, '33.500')] +[2023-10-09 07:22:03,674][60144] Updated weights for policy 1, policy_version 78472 (0.0007) +[2023-10-09 07:22:04,031][60144] Updated weights for policy 1, policy_version 78482 (0.0010) +[2023-10-09 07:22:04,397][60144] Updated weights for policy 1, policy_version 78492 (0.0009) +[2023-10-09 07:22:04,787][60143] Updated weights for policy 0, policy_version 77602 (0.0009) +[2023-10-09 07:22:05,145][60143] Updated weights for policy 0, policy_version 77612 (0.0011) +[2023-10-09 07:22:05,508][60143] Updated weights for policy 0, policy_version 77622 (0.0008) +[2023-10-09 07:22:05,879][60143] Updated weights for policy 0, policy_version 77632 (0.0008) +[2023-10-09 07:22:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 159875072. Throughput: 0: 1707.5, 1: 1727.3. Samples: 39967984. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-09 07:22:06,053][59242] Avg episode reward: [(0, '32.770'), (1, '33.570')] +[2023-10-09 07:22:08,383][60144] Updated weights for policy 1, policy_version 78502 (0.0008) +[2023-10-09 07:22:08,754][60144] Updated weights for policy 1, policy_version 78512 (0.0008) +[2023-10-09 07:22:09,118][60144] Updated weights for policy 1, policy_version 78522 (0.0008) +[2023-10-09 07:22:09,795][60143] Updated weights for policy 0, policy_version 77642 (0.0011) +[2023-10-09 07:22:10,160][60143] Updated weights for policy 0, policy_version 77652 (0.0010) +[2023-10-09 07:22:10,527][60143] Updated weights for policy 0, policy_version 77662 (0.0008) +[2023-10-09 07:22:11,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 159940608. Throughput: 0: 1715.5, 1: 1700.7. Samples: 39988282. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:11,053][59242] Avg episode reward: [(0, '33.110'), (1, '33.780')] +[2023-10-09 07:22:13,121][60144] Updated weights for policy 1, policy_version 78532 (0.0007) +[2023-10-09 07:22:13,489][60144] Updated weights for policy 1, policy_version 78542 (0.0008) +[2023-10-09 07:22:13,864][60144] Updated weights for policy 1, policy_version 78552 (0.0008) +[2023-10-09 07:22:14,540][60143] Updated weights for policy 0, policy_version 77672 (0.0009) +[2023-10-09 07:22:14,906][60143] Updated weights for policy 0, policy_version 77682 (0.0011) +[2023-10-09 07:22:15,277][60143] Updated weights for policy 0, policy_version 77692 (0.0009) +[2023-10-09 07:22:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 160006144. Throughput: 0: 1684.5, 1: 1706.9. Samples: 40008086. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:16,053][59242] Avg episode reward: [(0, '32.470'), (1, '33.650')] +[2023-10-09 07:22:17,974][60144] Updated weights for policy 1, policy_version 78562 (0.0007) +[2023-10-09 07:22:18,393][60144] Updated weights for policy 1, policy_version 78572 (0.0009) +[2023-10-09 07:22:18,757][60144] Updated weights for policy 1, policy_version 78582 (0.0007) +[2023-10-09 07:22:19,123][60144] Updated weights for policy 1, policy_version 78592 (0.0008) +[2023-10-09 07:22:19,215][60143] Updated weights for policy 0, policy_version 77702 (0.0010) +[2023-10-09 07:22:19,582][60143] Updated weights for policy 0, policy_version 77712 (0.0010) +[2023-10-09 07:22:19,958][60143] Updated weights for policy 0, policy_version 77722 (0.0009) +[2023-10-09 07:22:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 160071680. Throughput: 0: 1717.3, 1: 1705.0. Samples: 40019234. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:21,053][59242] Avg episode reward: [(0, '33.390'), (1, '34.340')] +[2023-10-09 07:22:23,138][60144] Updated weights for policy 1, policy_version 78602 (0.0007) +[2023-10-09 07:22:23,501][60144] Updated weights for policy 1, policy_version 78612 (0.0010) +[2023-10-09 07:22:23,867][60144] Updated weights for policy 1, policy_version 78622 (0.0009) +[2023-10-09 07:22:23,941][60143] Updated weights for policy 0, policy_version 77732 (0.0009) +[2023-10-09 07:22:24,329][60143] Updated weights for policy 0, policy_version 77742 (0.0010) +[2023-10-09 07:22:24,693][60143] Updated weights for policy 0, policy_version 77752 (0.0010) +[2023-10-09 07:22:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 160137216. Throughput: 0: 1702.0, 1: 1689.9. Samples: 40038984. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:26,053][59242] Avg episode reward: [(0, '34.270'), (1, '34.690')] +[2023-10-09 07:22:27,780][60144] Updated weights for policy 1, policy_version 78632 (0.0008) +[2023-10-09 07:22:28,147][60144] Updated weights for policy 1, policy_version 78642 (0.0007) +[2023-10-09 07:22:28,519][60144] Updated weights for policy 1, policy_version 78652 (0.0009) +[2023-10-09 07:22:28,543][60143] Updated weights for policy 0, policy_version 77762 (0.0008) +[2023-10-09 07:22:28,911][60143] Updated weights for policy 0, policy_version 77772 (0.0007) +[2023-10-09 07:22:29,291][60143] Updated weights for policy 0, policy_version 77782 (0.0008) +[2023-10-09 07:22:29,658][60143] Updated weights for policy 0, policy_version 77792 (0.0007) +[2023-10-09 07:22:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 160202752. Throughput: 0: 1697.7, 1: 1718.2. Samples: 40059860. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:31,053][59242] Avg episode reward: [(0, '34.460'), (1, '33.090')] +[2023-10-09 07:22:32,501][60144] Updated weights for policy 1, policy_version 78662 (0.0010) +[2023-10-09 07:22:32,865][60144] Updated weights for policy 1, policy_version 78672 (0.0011) +[2023-10-09 07:22:33,234][60144] Updated weights for policy 1, policy_version 78682 (0.0010) +[2023-10-09 07:22:33,638][60143] Updated weights for policy 0, policy_version 77802 (0.0008) +[2023-10-09 07:22:34,004][60143] Updated weights for policy 0, policy_version 77812 (0.0010) +[2023-10-09 07:22:34,386][60143] Updated weights for policy 0, policy_version 77822 (0.0009) +[2023-10-09 07:22:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 160268288. Throughput: 0: 1724.5, 1: 1686.8. Samples: 40070182. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:36,053][59242] Avg episode reward: [(0, '34.020'), (1, '34.310')] +[2023-10-09 07:22:37,193][60144] Updated weights for policy 1, policy_version 78692 (0.0009) +[2023-10-09 07:22:37,560][60144] Updated weights for policy 1, policy_version 78702 (0.0009) +[2023-10-09 07:22:37,932][60144] Updated weights for policy 1, policy_version 78712 (0.0007) +[2023-10-09 07:22:38,518][60143] Updated weights for policy 0, policy_version 77832 (0.0009) +[2023-10-09 07:22:38,881][60143] Updated weights for policy 0, policy_version 77842 (0.0010) +[2023-10-09 07:22:39,245][60143] Updated weights for policy 0, policy_version 77852 (0.0009) +[2023-10-09 07:22:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 160333824. Throughput: 0: 1697.8, 1: 1713.9. Samples: 40090544. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:41,053][59242] Avg episode reward: [(0, '33.130'), (1, '34.410')] +[2023-10-09 07:22:41,935][60144] Updated weights for policy 1, policy_version 78722 (0.0008) +[2023-10-09 07:22:42,296][60144] Updated weights for policy 1, policy_version 78732 (0.0008) +[2023-10-09 07:22:42,653][60144] Updated weights for policy 1, policy_version 78742 (0.0009) +[2023-10-09 07:22:43,015][60144] Updated weights for policy 1, policy_version 78752 (0.0010) +[2023-10-09 07:22:43,306][60143] Updated weights for policy 0, policy_version 77862 (0.0010) +[2023-10-09 07:22:43,677][60143] Updated weights for policy 0, policy_version 77872 (0.0011) +[2023-10-09 07:22:44,045][60143] Updated weights for policy 0, policy_version 77882 (0.0008) +[2023-10-09 07:22:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 160399360. Throughput: 0: 1715.2, 1: 1733.9. Samples: 40111966. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:46,053][59242] Avg episode reward: [(0, '33.510'), (1, '34.080')] +[2023-10-09 07:22:46,845][60144] Updated weights for policy 1, policy_version 78762 (0.0010) +[2023-10-09 07:22:47,212][60144] Updated weights for policy 1, policy_version 78772 (0.0010) +[2023-10-09 07:22:47,573][60144] Updated weights for policy 1, policy_version 78782 (0.0009) +[2023-10-09 07:22:47,931][60143] Updated weights for policy 0, policy_version 77892 (0.0009) +[2023-10-09 07:22:48,289][60143] Updated weights for policy 0, policy_version 77902 (0.0009) +[2023-10-09 07:22:48,664][60143] Updated weights for policy 0, policy_version 77912 (0.0007) +[2023-10-09 07:22:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 160464896. Throughput: 0: 1711.2, 1: 1705.0. Samples: 40121712. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:51,052][59242] Avg episode reward: [(0, '34.940'), (1, '34.230')] +[2023-10-09 07:22:51,671][60144] Updated weights for policy 1, policy_version 78792 (0.0011) +[2023-10-09 07:22:52,034][60144] Updated weights for policy 1, policy_version 78802 (0.0010) +[2023-10-09 07:22:52,399][60144] Updated weights for policy 1, policy_version 78812 (0.0007) +[2023-10-09 07:22:52,581][60143] Updated weights for policy 0, policy_version 77922 (0.0008) +[2023-10-09 07:22:52,958][60143] Updated weights for policy 0, policy_version 77932 (0.0008) +[2023-10-09 07:22:53,319][60143] Updated weights for policy 0, policy_version 77942 (0.0008) +[2023-10-09 07:22:53,690][60143] Updated weights for policy 0, policy_version 77952 (0.0008) +[2023-10-09 07:22:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 160530432. Throughput: 0: 1697.1, 1: 1731.4. Samples: 40142564. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:22:56,052][59242] Avg episode reward: [(0, '34.330'), (1, '33.830')] +[2023-10-09 07:22:56,373][60144] Updated weights for policy 1, policy_version 78822 (0.0007) +[2023-10-09 07:22:56,731][60144] Updated weights for policy 1, policy_version 78832 (0.0008) +[2023-10-09 07:22:57,098][60144] Updated weights for policy 1, policy_version 78842 (0.0007) +[2023-10-09 07:22:57,564][60143] Updated weights for policy 0, policy_version 77962 (0.0007) +[2023-10-09 07:22:57,946][60143] Updated weights for policy 0, policy_version 77972 (0.0008) +[2023-10-09 07:22:58,310][60143] Updated weights for policy 0, policy_version 77982 (0.0009) +[2023-10-09 07:23:00,878][60144] Updated weights for policy 1, policy_version 78852 (0.0008) +[2023-10-09 07:23:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 160595968. Throughput: 0: 1722.8, 1: 1735.9. Samples: 40163730. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:23:01,053][59242] Avg episode reward: [(0, '34.130'), (1, '34.960')] +[2023-10-09 07:23:01,250][60144] Updated weights for policy 1, policy_version 78862 (0.0008) +[2023-10-09 07:23:01,611][60144] Updated weights for policy 1, policy_version 78872 (0.0008) +[2023-10-09 07:23:02,575][60143] Updated weights for policy 0, policy_version 77992 (0.0009) +[2023-10-09 07:23:02,942][60143] Updated weights for policy 0, policy_version 78002 (0.0009) +[2023-10-09 07:23:03,326][60143] Updated weights for policy 0, policy_version 78012 (0.0008) +[2023-10-09 07:23:05,577][60144] Updated weights for policy 1, policy_version 78882 (0.0010) +[2023-10-09 07:23:05,983][60144] Updated weights for policy 1, policy_version 78892 (0.0009) +[2023-10-09 07:23:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 160661504. Throughput: 0: 1690.6, 1: 1728.1. Samples: 40173074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:06,053][59242] Avg episode reward: [(0, '35.010'), (1, '34.000')] +[2023-10-09 07:23:06,349][60144] Updated weights for policy 1, policy_version 78902 (0.0008) +[2023-10-09 07:23:06,720][60144] Updated weights for policy 1, policy_version 78912 (0.0008) +[2023-10-09 07:23:07,375][60143] Updated weights for policy 0, policy_version 78022 (0.0011) +[2023-10-09 07:23:07,750][60143] Updated weights for policy 0, policy_version 78032 (0.0009) +[2023-10-09 07:23:08,117][60143] Updated weights for policy 0, policy_version 78042 (0.0009) +[2023-10-09 07:23:10,834][60144] Updated weights for policy 1, policy_version 78922 (0.0008) +[2023-10-09 07:23:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 160727040. Throughput: 0: 1708.7, 1: 1735.1. Samples: 40193954. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:11,052][59242] Avg episode reward: [(0, '36.680'), (1, '34.140')] +[2023-10-09 07:23:11,205][60144] Updated weights for policy 1, policy_version 78932 (0.0008) +[2023-10-09 07:23:11,572][60144] Updated weights for policy 1, policy_version 78942 (0.0007) +[2023-10-09 07:23:12,141][60143] Updated weights for policy 0, policy_version 78052 (0.0010) +[2023-10-09 07:23:12,525][60143] Updated weights for policy 0, policy_version 78062 (0.0008) +[2023-10-09 07:23:12,904][60143] Updated weights for policy 0, policy_version 78072 (0.0008) +[2023-10-09 07:23:15,419][60144] Updated weights for policy 1, policy_version 78952 (0.0008) +[2023-10-09 07:23:15,793][60144] Updated weights for policy 1, policy_version 78962 (0.0009) +[2023-10-09 07:23:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 160792576. Throughput: 0: 1710.2, 1: 1725.1. Samples: 40214448. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:16,053][59242] Avg episode reward: [(0, '35.510'), (1, '33.870')] +[2023-10-09 07:23:16,156][60144] Updated weights for policy 1, policy_version 78972 (0.0008) +[2023-10-09 07:23:16,870][60143] Updated weights for policy 0, policy_version 78082 (0.0009) +[2023-10-09 07:23:17,235][60143] Updated weights for policy 0, policy_version 78092 (0.0007) +[2023-10-09 07:23:17,606][60143] Updated weights for policy 0, policy_version 78102 (0.0008) +[2023-10-09 07:23:17,967][60143] Updated weights for policy 0, policy_version 78112 (0.0008) +[2023-10-09 07:23:20,016][60144] Updated weights for policy 1, policy_version 78982 (0.0009) +[2023-10-09 07:23:20,381][60144] Updated weights for policy 1, policy_version 78992 (0.0009) +[2023-10-09 07:23:20,751][60144] Updated weights for policy 1, policy_version 79002 (0.0009) +[2023-10-09 07:23:21,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 160890880. 
Throughput: 0: 1684.0, 1: 1740.4. Samples: 40224278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:21,052][59242] Avg episode reward: [(0, '36.380'), (1, '35.430')] +[2023-10-09 07:23:22,116][60143] Updated weights for policy 0, policy_version 78122 (0.0007) +[2023-10-09 07:23:22,486][60143] Updated weights for policy 0, policy_version 78132 (0.0008) +[2023-10-09 07:23:22,858][60143] Updated weights for policy 0, policy_version 78142 (0.0007) +[2023-10-09 07:23:24,722][60144] Updated weights for policy 1, policy_version 79012 (0.0009) +[2023-10-09 07:23:25,084][60144] Updated weights for policy 1, policy_version 79022 (0.0008) +[2023-10-09 07:23:25,457][60144] Updated weights for policy 1, policy_version 79032 (0.0008) +[2023-10-09 07:23:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 160956416. Throughput: 0: 1707.6, 1: 1730.2. Samples: 40245244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:26,053][59242] Avg episode reward: [(0, '34.950'), (1, '36.380')] +[2023-10-09 07:23:26,853][60143] Updated weights for policy 0, policy_version 78152 (0.0009) +[2023-10-09 07:23:27,219][60143] Updated weights for policy 0, policy_version 78162 (0.0009) +[2023-10-09 07:23:27,586][60143] Updated weights for policy 0, policy_version 78172 (0.0010) +[2023-10-09 07:23:29,331][60144] Updated weights for policy 1, policy_version 79042 (0.0009) +[2023-10-09 07:23:29,699][60144] Updated weights for policy 1, policy_version 79052 (0.0010) +[2023-10-09 07:23:30,058][60144] Updated weights for policy 1, policy_version 79062 (0.0010) +[2023-10-09 07:23:30,423][60144] Updated weights for policy 1, policy_version 79072 (0.0009) +[2023-10-09 07:23:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 161021952. Throughput: 0: 1704.0, 1: 1700.8. Samples: 40265184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:31,053][59242] Avg episode reward: [(0, '34.760'), (1, '35.850')] +[2023-10-09 07:23:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000079072_80969728.pth... +[2023-10-09 07:23:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000078176_80052224.pth... +[2023-10-09 07:23:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000077472_79331328.pth +[2023-10-09 07:23:31,104][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000076608_78446592.pth +[2023-10-09 07:23:31,579][60143] Updated weights for policy 0, policy_version 78182 (0.0011) +[2023-10-09 07:23:31,937][60143] Updated weights for policy 0, policy_version 78192 (0.0010) +[2023-10-09 07:23:32,311][60143] Updated weights for policy 0, policy_version 78202 (0.0007) +[2023-10-09 07:23:34,493][60144] Updated weights for policy 1, policy_version 79082 (0.0010) +[2023-10-09 07:23:34,864][60144] Updated weights for policy 1, policy_version 79092 (0.0009) +[2023-10-09 07:23:35,236][60144] Updated weights for policy 1, policy_version 79102 (0.0007) +[2023-10-09 07:23:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 161087488. Throughput: 0: 1692.4, 1: 1735.6. Samples: 40275974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:36,052][59242] Avg episode reward: [(0, '36.110'), (1, '35.550')] +[2023-10-09 07:23:36,234][60143] Updated weights for policy 0, policy_version 78212 (0.0009) +[2023-10-09 07:23:36,596][60143] Updated weights for policy 0, policy_version 78222 (0.0010) +[2023-10-09 07:23:36,975][60143] Updated weights for policy 0, policy_version 78232 (0.0009) +[2023-10-09 07:23:39,173][60144] Updated weights for policy 1, policy_version 79112 (0.0007) +[2023-10-09 07:23:39,547][60144] Updated weights for policy 1, policy_version 79122 (0.0007) +[2023-10-09 07:23:39,911][60144] Updated weights for policy 1, policy_version 79132 (0.0009) +[2023-10-09 07:23:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 161153024. Throughput: 0: 1704.1, 1: 1717.1. Samples: 40296518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:41,052][59242] Avg episode reward: [(0, '39.720'), (1, '35.160')] +[2023-10-09 07:23:41,100][60143] Updated weights for policy 0, policy_version 78242 (0.0007) +[2023-10-09 07:23:41,459][60143] Updated weights for policy 0, policy_version 78252 (0.0010) +[2023-10-09 07:23:41,827][60143] Updated weights for policy 0, policy_version 78262 (0.0010) +[2023-10-09 07:23:42,196][59934] Saving new best policy, reward=39.720! +[2023-10-09 07:23:42,199][60143] Updated weights for policy 0, policy_version 78272 (0.0011) +[2023-10-09 07:23:43,903][60144] Updated weights for policy 1, policy_version 79142 (0.0008) +[2023-10-09 07:23:44,270][60144] Updated weights for policy 1, policy_version 79152 (0.0008) +[2023-10-09 07:23:44,629][60144] Updated weights for policy 1, policy_version 79162 (0.0008) +[2023-10-09 07:23:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 161218560. Throughput: 0: 1706.0, 1: 1705.9. Samples: 40317264. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:46,053][59242] Avg episode reward: [(0, '40.310'), (1, '36.120')] +[2023-10-09 07:23:46,261][60143] Updated weights for policy 0, policy_version 78282 (0.0008) +[2023-10-09 07:23:46,631][60143] Updated weights for policy 0, policy_version 78292 (0.0007) +[2023-10-09 07:23:47,006][60143] Updated weights for policy 0, policy_version 78302 (0.0007) +[2023-10-09 07:23:47,071][59934] Saving new best policy, reward=40.310! +[2023-10-09 07:23:48,517][60144] Updated weights for policy 1, policy_version 79172 (0.0007) +[2023-10-09 07:23:48,882][60144] Updated weights for policy 1, policy_version 79182 (0.0008) +[2023-10-09 07:23:49,250][60144] Updated weights for policy 1, policy_version 79192 (0.0008) +[2023-10-09 07:23:50,953][60143] Updated weights for policy 0, policy_version 78312 (0.0008) +[2023-10-09 07:23:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 161284096. Throughput: 0: 1704.0, 1: 1726.7. Samples: 40327456. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:51,053][59242] Avg episode reward: [(0, '39.670'), (1, '35.500')] +[2023-10-09 07:23:51,308][60143] Updated weights for policy 0, policy_version 78322 (0.0008) +[2023-10-09 07:23:51,680][60143] Updated weights for policy 0, policy_version 78332 (0.0007) +[2023-10-09 07:23:53,171][60144] Updated weights for policy 1, policy_version 79202 (0.0010) +[2023-10-09 07:23:53,586][60144] Updated weights for policy 1, policy_version 79212 (0.0008) +[2023-10-09 07:23:53,950][60144] Updated weights for policy 1, policy_version 79222 (0.0010) +[2023-10-09 07:23:54,316][60144] Updated weights for policy 1, policy_version 79232 (0.0010) +[2023-10-09 07:23:55,584][60143] Updated weights for policy 0, policy_version 78342 (0.0008) +[2023-10-09 07:23:55,945][60143] Updated weights for policy 0, policy_version 78352 (0.0008) +[2023-10-09 07:23:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 161349632. Throughput: 0: 1710.3, 1: 1714.4. Samples: 40348062. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:23:56,053][59242] Avg episode reward: [(0, '38.870'), (1, '33.630')] +[2023-10-09 07:23:56,326][60143] Updated weights for policy 0, policy_version 78362 (0.0009) +[2023-10-09 07:23:58,183][60144] Updated weights for policy 1, policy_version 79242 (0.0009) +[2023-10-09 07:23:58,564][60144] Updated weights for policy 1, policy_version 79252 (0.0009) +[2023-10-09 07:23:58,935][60144] Updated weights for policy 1, policy_version 79262 (0.0008) +[2023-10-09 07:24:00,403][60143] Updated weights for policy 0, policy_version 78372 (0.0010) +[2023-10-09 07:24:00,784][60143] Updated weights for policy 0, policy_version 78382 (0.0008) +[2023-10-09 07:24:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 161415168. Throughput: 0: 1708.8, 1: 1729.2. Samples: 40369158. 
Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:01,053][59242] Avg episode reward: [(0, '39.380'), (1, '34.240')] +[2023-10-09 07:24:01,151][60143] Updated weights for policy 0, policy_version 78392 (0.0009) +[2023-10-09 07:24:02,747][60144] Updated weights for policy 1, policy_version 79272 (0.0008) +[2023-10-09 07:24:03,126][60144] Updated weights for policy 1, policy_version 79282 (0.0007) +[2023-10-09 07:24:03,499][60144] Updated weights for policy 1, policy_version 79292 (0.0008) +[2023-10-09 07:24:04,913][60143] Updated weights for policy 0, policy_version 78402 (0.0008) +[2023-10-09 07:24:05,277][60143] Updated weights for policy 0, policy_version 78412 (0.0009) +[2023-10-09 07:24:05,645][60143] Updated weights for policy 0, policy_version 78422 (0.0010) +[2023-10-09 07:24:06,009][60143] Updated weights for policy 0, policy_version 78432 (0.0011) +[2023-10-09 07:24:06,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 161513472. Throughput: 0: 1718.5, 1: 1720.9. Samples: 40379054. Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:06,052][59242] Avg episode reward: [(0, '39.710'), (1, '34.860')] +[2023-10-09 07:24:07,486][60144] Updated weights for policy 1, policy_version 79302 (0.0009) +[2023-10-09 07:24:07,843][60144] Updated weights for policy 1, policy_version 79312 (0.0007) +[2023-10-09 07:24:08,213][60144] Updated weights for policy 1, policy_version 79322 (0.0009) +[2023-10-09 07:24:09,974][60143] Updated weights for policy 0, policy_version 78442 (0.0008) +[2023-10-09 07:24:10,334][60143] Updated weights for policy 0, policy_version 78452 (0.0010) +[2023-10-09 07:24:10,706][60143] Updated weights for policy 0, policy_version 78462 (0.0011) +[2023-10-09 07:24:11,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 161579008. Throughput: 0: 1723.3, 1: 1720.0. Samples: 40400190. 
Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:11,052][59242] Avg episode reward: [(0, '37.810'), (1, '34.810')] +[2023-10-09 07:24:12,175][60144] Updated weights for policy 1, policy_version 79332 (0.0008) +[2023-10-09 07:24:12,553][60144] Updated weights for policy 1, policy_version 79342 (0.0010) +[2023-10-09 07:24:12,914][60144] Updated weights for policy 1, policy_version 79352 (0.0009) +[2023-10-09 07:24:14,716][60143] Updated weights for policy 0, policy_version 78472 (0.0009) +[2023-10-09 07:24:15,080][60143] Updated weights for policy 0, policy_version 78482 (0.0009) +[2023-10-09 07:24:15,455][60143] Updated weights for policy 0, policy_version 78492 (0.0010) +[2023-10-09 07:24:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 161644544. Throughput: 0: 1697.0, 1: 1747.3. Samples: 40420178. Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:16,052][59242] Avg episode reward: [(0, '37.860'), (1, '34.230')] +[2023-10-09 07:24:16,697][60144] Updated weights for policy 1, policy_version 79362 (0.0009) +[2023-10-09 07:24:17,065][60144] Updated weights for policy 1, policy_version 79372 (0.0010) +[2023-10-09 07:24:17,430][60144] Updated weights for policy 1, policy_version 79382 (0.0009) +[2023-10-09 07:24:17,797][60144] Updated weights for policy 1, policy_version 79392 (0.0009) +[2023-10-09 07:24:19,511][60143] Updated weights for policy 0, policy_version 78502 (0.0007) +[2023-10-09 07:24:19,878][60143] Updated weights for policy 0, policy_version 78512 (0.0010) +[2023-10-09 07:24:20,248][60143] Updated weights for policy 0, policy_version 78522 (0.0010) +[2023-10-09 07:24:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 161710080. Throughput: 0: 1723.1, 1: 1711.1. Samples: 40430512. 
Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:21,052][59242] Avg episode reward: [(0, '37.540'), (1, '33.990')] +[2023-10-09 07:24:21,805][60144] Updated weights for policy 1, policy_version 79402 (0.0009) +[2023-10-09 07:24:22,176][60144] Updated weights for policy 1, policy_version 79412 (0.0009) +[2023-10-09 07:24:22,549][60144] Updated weights for policy 1, policy_version 79422 (0.0009) +[2023-10-09 07:24:24,221][60143] Updated weights for policy 0, policy_version 78532 (0.0009) +[2023-10-09 07:24:24,577][60143] Updated weights for policy 0, policy_version 78542 (0.0008) +[2023-10-09 07:24:24,947][60143] Updated weights for policy 0, policy_version 78552 (0.0008) +[2023-10-09 07:24:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 161775616. Throughput: 0: 1711.6, 1: 1733.1. Samples: 40451528. Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:26,053][59242] Avg episode reward: [(0, '39.200'), (1, '33.820')] +[2023-10-09 07:24:26,451][60144] Updated weights for policy 1, policy_version 79432 (0.0007) +[2023-10-09 07:24:26,825][60144] Updated weights for policy 1, policy_version 79442 (0.0008) +[2023-10-09 07:24:27,188][60144] Updated weights for policy 1, policy_version 79452 (0.0008) +[2023-10-09 07:24:28,884][60143] Updated weights for policy 0, policy_version 78562 (0.0008) +[2023-10-09 07:24:29,264][60143] Updated weights for policy 0, policy_version 78572 (0.0009) +[2023-10-09 07:24:29,628][60143] Updated weights for policy 0, policy_version 78582 (0.0008) +[2023-10-09 07:24:29,997][60143] Updated weights for policy 0, policy_version 78592 (0.0008) +[2023-10-09 07:24:30,996][60144] Updated weights for policy 1, policy_version 79462 (0.0007) +[2023-10-09 07:24:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 161841152. Throughput: 0: 1690.9, 1: 1748.0. Samples: 40472014. 
Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:31,053][59242] Avg episode reward: [(0, '38.730'), (1, '33.550')] +[2023-10-09 07:24:31,362][60144] Updated weights for policy 1, policy_version 79472 (0.0007) +[2023-10-09 07:24:31,722][60144] Updated weights for policy 1, policy_version 79482 (0.0008) +[2023-10-09 07:24:33,995][60143] Updated weights for policy 0, policy_version 78602 (0.0007) +[2023-10-09 07:24:34,356][60143] Updated weights for policy 0, policy_version 78612 (0.0007) +[2023-10-09 07:24:34,724][60143] Updated weights for policy 0, policy_version 78622 (0.0007) +[2023-10-09 07:24:35,656][60144] Updated weights for policy 1, policy_version 79492 (0.0009) +[2023-10-09 07:24:36,026][60144] Updated weights for policy 1, policy_version 79502 (0.0007) +[2023-10-09 07:24:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 161906688. Throughput: 0: 1721.6, 1: 1724.6. Samples: 40482534. Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:36,053][59242] Avg episode reward: [(0, '36.490'), (1, '33.500')] +[2023-10-09 07:24:36,388][60144] Updated weights for policy 1, policy_version 79512 (0.0009) +[2023-10-09 07:24:38,683][60143] Updated weights for policy 0, policy_version 78632 (0.0008) +[2023-10-09 07:24:39,043][60143] Updated weights for policy 0, policy_version 78642 (0.0007) +[2023-10-09 07:24:39,414][60143] Updated weights for policy 0, policy_version 78652 (0.0007) +[2023-10-09 07:24:40,295][60144] Updated weights for policy 1, policy_version 79522 (0.0010) +[2023-10-09 07:24:40,709][60144] Updated weights for policy 1, policy_version 79532 (0.0010) +[2023-10-09 07:24:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 161972224. Throughput: 0: 1692.4, 1: 1747.8. Samples: 40502872. 
Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:41,053][59242] Avg episode reward: [(0, '38.380'), (1, '33.140')] +[2023-10-09 07:24:41,067][60144] Updated weights for policy 1, policy_version 79542 (0.0009) +[2023-10-09 07:24:41,429][60144] Updated weights for policy 1, policy_version 79552 (0.0010) +[2023-10-09 07:24:43,394][60143] Updated weights for policy 0, policy_version 78662 (0.0008) +[2023-10-09 07:24:43,764][60143] Updated weights for policy 0, policy_version 78672 (0.0008) +[2023-10-09 07:24:44,120][60143] Updated weights for policy 0, policy_version 78682 (0.0007) +[2023-10-09 07:24:45,265][60144] Updated weights for policy 1, policy_version 79562 (0.0007) +[2023-10-09 07:24:45,628][60144] Updated weights for policy 1, policy_version 79572 (0.0009) +[2023-10-09 07:24:46,003][60144] Updated weights for policy 1, policy_version 79582 (0.0007) +[2023-10-09 07:24:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 162037760. Throughput: 0: 1693.1, 1: 1727.5. Samples: 40523084. Policy #0 lag: (min: 6.0, avg: 17.3, max: 38.0) +[2023-10-09 07:24:46,053][59242] Avg episode reward: [(0, '39.040'), (1, '33.880')] +[2023-10-09 07:24:48,263][60143] Updated weights for policy 0, policy_version 78692 (0.0008) +[2023-10-09 07:24:48,660][60143] Updated weights for policy 0, policy_version 78702 (0.0012) +[2023-10-09 07:24:49,021][60143] Updated weights for policy 0, policy_version 78712 (0.0012) +[2023-10-09 07:24:49,980][60144] Updated weights for policy 1, policy_version 79592 (0.0008) +[2023-10-09 07:24:50,348][60144] Updated weights for policy 1, policy_version 79602 (0.0010) +[2023-10-09 07:24:50,712][60144] Updated weights for policy 1, policy_version 79612 (0.0010) +[2023-10-09 07:24:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 162136064. Throughput: 0: 1703.6, 1: 1737.5. Samples: 40533902. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:24:51,053][59242] Avg episode reward: [(0, '36.300'), (1, '33.380')] +[2023-10-09 07:24:53,081][60143] Updated weights for policy 0, policy_version 78722 (0.0010) +[2023-10-09 07:24:53,453][60143] Updated weights for policy 0, policy_version 78732 (0.0011) +[2023-10-09 07:24:53,821][60143] Updated weights for policy 0, policy_version 78742 (0.0010) +[2023-10-09 07:24:54,191][60143] Updated weights for policy 0, policy_version 78752 (0.0010) +[2023-10-09 07:24:54,758][60144] Updated weights for policy 1, policy_version 79622 (0.0010) +[2023-10-09 07:24:55,132][60144] Updated weights for policy 1, policy_version 79632 (0.0011) +[2023-10-09 07:24:55,494][60144] Updated weights for policy 1, policy_version 79642 (0.0010) +[2023-10-09 07:24:56,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 162201600. Throughput: 0: 1680.8, 1: 1739.3. Samples: 40554094. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:24:56,053][59242] Avg episode reward: [(0, '35.560'), (1, '31.430')] +[2023-10-09 07:24:58,054][60143] Updated weights for policy 0, policy_version 78762 (0.0009) +[2023-10-09 07:24:58,416][60143] Updated weights for policy 0, policy_version 78772 (0.0010) +[2023-10-09 07:24:58,784][60143] Updated weights for policy 0, policy_version 78782 (0.0009) +[2023-10-09 07:24:59,519][60144] Updated weights for policy 1, policy_version 79652 (0.0008) +[2023-10-09 07:24:59,892][60144] Updated weights for policy 1, policy_version 79662 (0.0007) +[2023-10-09 07:25:00,254][60144] Updated weights for policy 1, policy_version 79672 (0.0007) +[2023-10-09 07:25:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 162267136. Throughput: 0: 1711.6, 1: 1712.2. Samples: 40574252. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:01,053][59242] Avg episode reward: [(0, '37.170'), (1, '30.740')] +[2023-10-09 07:25:02,866][60143] Updated weights for policy 0, policy_version 78792 (0.0009) +[2023-10-09 07:25:03,233][60143] Updated weights for policy 0, policy_version 78802 (0.0009) +[2023-10-09 07:25:03,619][60143] Updated weights for policy 0, policy_version 78812 (0.0009) +[2023-10-09 07:25:04,041][60144] Updated weights for policy 1, policy_version 79682 (0.0008) +[2023-10-09 07:25:04,417][60144] Updated weights for policy 1, policy_version 79692 (0.0010) +[2023-10-09 07:25:04,778][60144] Updated weights for policy 1, policy_version 79702 (0.0007) +[2023-10-09 07:25:05,138][60144] Updated weights for policy 1, policy_version 79712 (0.0009) +[2023-10-09 07:25:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162332672. Throughput: 0: 1690.8, 1: 1748.8. Samples: 40585294. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:06,052][59242] Avg episode reward: [(0, '35.270'), (1, '31.300')] +[2023-10-09 07:25:07,653][60143] Updated weights for policy 0, policy_version 78822 (0.0012) +[2023-10-09 07:25:08,026][60143] Updated weights for policy 0, policy_version 78832 (0.0010) +[2023-10-09 07:25:08,398][60143] Updated weights for policy 0, policy_version 78842 (0.0010) +[2023-10-09 07:25:09,138][60144] Updated weights for policy 1, policy_version 79722 (0.0007) +[2023-10-09 07:25:09,497][60144] Updated weights for policy 1, policy_version 79732 (0.0008) +[2023-10-09 07:25:09,864][60144] Updated weights for policy 1, policy_version 79742 (0.0007) +[2023-10-09 07:25:11,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13662.6). Total num frames: 162398208. Throughput: 0: 1691.4, 1: 1725.0. Samples: 40605268. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:11,053][59242] Avg episode reward: [(0, '35.390'), (1, '32.880')] +[2023-10-09 07:25:12,346][60143] Updated weights for policy 0, policy_version 78852 (0.0007) +[2023-10-09 07:25:12,709][60143] Updated weights for policy 0, policy_version 78862 (0.0008) +[2023-10-09 07:25:13,082][60143] Updated weights for policy 0, policy_version 78872 (0.0009) +[2023-10-09 07:25:13,921][60144] Updated weights for policy 1, policy_version 79752 (0.0009) +[2023-10-09 07:25:14,291][60144] Updated weights for policy 1, policy_version 79762 (0.0007) +[2023-10-09 07:25:14,668][60144] Updated weights for policy 1, policy_version 79772 (0.0007) +[2023-10-09 07:25:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162463744. Throughput: 0: 1708.8, 1: 1707.7. Samples: 40625756. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:16,053][59242] Avg episode reward: [(0, '34.040'), (1, '32.200')] +[2023-10-09 07:25:17,085][60143] Updated weights for policy 0, policy_version 78882 (0.0007) +[2023-10-09 07:25:17,462][60143] Updated weights for policy 0, policy_version 78892 (0.0007) +[2023-10-09 07:25:17,829][60143] Updated weights for policy 0, policy_version 78902 (0.0009) +[2023-10-09 07:25:18,202][60143] Updated weights for policy 0, policy_version 78912 (0.0009) +[2023-10-09 07:25:18,636][60144] Updated weights for policy 1, policy_version 79782 (0.0009) +[2023-10-09 07:25:19,005][60144] Updated weights for policy 1, policy_version 79792 (0.0010) +[2023-10-09 07:25:19,367][60144] Updated weights for policy 1, policy_version 79802 (0.0011) +[2023-10-09 07:25:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162529280. Throughput: 0: 1679.9, 1: 1733.8. Samples: 40636152. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:21,053][59242] Avg episode reward: [(0, '34.490'), (1, '33.150')] +[2023-10-09 07:25:22,272][60143] Updated weights for policy 0, policy_version 78922 (0.0007) +[2023-10-09 07:25:22,645][60143] Updated weights for policy 0, policy_version 78932 (0.0007) +[2023-10-09 07:25:23,019][60143] Updated weights for policy 0, policy_version 78942 (0.0008) +[2023-10-09 07:25:23,431][60144] Updated weights for policy 1, policy_version 79812 (0.0011) +[2023-10-09 07:25:23,797][60144] Updated weights for policy 1, policy_version 79822 (0.0010) +[2023-10-09 07:25:24,173][60144] Updated weights for policy 1, policy_version 79832 (0.0010) +[2023-10-09 07:25:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162594816. Throughput: 0: 1704.1, 1: 1700.9. Samples: 40656098. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:26,053][59242] Avg episode reward: [(0, '35.220'), (1, '33.720')] +[2023-10-09 07:25:26,827][60143] Updated weights for policy 0, policy_version 78952 (0.0008) +[2023-10-09 07:25:27,181][60143] Updated weights for policy 0, policy_version 78962 (0.0007) +[2023-10-09 07:25:27,541][60143] Updated weights for policy 0, policy_version 78972 (0.0007) +[2023-10-09 07:25:28,216][60144] Updated weights for policy 1, policy_version 79842 (0.0008) +[2023-10-09 07:25:28,617][60144] Updated weights for policy 1, policy_version 79852 (0.0009) +[2023-10-09 07:25:28,980][60144] Updated weights for policy 1, policy_version 79862 (0.0011) +[2023-10-09 07:25:29,349][60144] Updated weights for policy 1, policy_version 79872 (0.0011) +[2023-10-09 07:25:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162660352. Throughput: 0: 1711.9, 1: 1718.5. Samples: 40677450. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:31,053][59242] Avg episode reward: [(0, '35.920'), (1, '35.290')] +[2023-10-09 07:25:31,063][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000078976_80871424.pth... +[2023-10-09 07:25:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000079872_81788928.pth... +[2023-10-09 07:25:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000078272_80150528.pth +[2023-10-09 07:25:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000077408_79265792.pth +[2023-10-09 07:25:31,583][60143] Updated weights for policy 0, policy_version 78982 (0.0008) +[2023-10-09 07:25:31,952][60143] Updated weights for policy 0, policy_version 78992 (0.0010) +[2023-10-09 07:25:32,328][60143] Updated weights for policy 0, policy_version 79002 (0.0012) +[2023-10-09 07:25:33,171][60144] Updated weights for policy 1, policy_version 79882 (0.0009) +[2023-10-09 07:25:33,538][60144] Updated weights for policy 1, policy_version 79892 (0.0008) +[2023-10-09 07:25:33,907][60144] Updated weights for policy 1, policy_version 79902 (0.0009) +[2023-10-09 07:25:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162725888. Throughput: 0: 1690.8, 1: 1715.5. Samples: 40687182. 
Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:36,053][59242] Avg episode reward: [(0, '37.540'), (1, '34.220')] +[2023-10-09 07:25:36,452][60143] Updated weights for policy 0, policy_version 79012 (0.0007) +[2023-10-09 07:25:36,839][60143] Updated weights for policy 0, policy_version 79022 (0.0011) +[2023-10-09 07:25:37,205][60143] Updated weights for policy 0, policy_version 79032 (0.0008) +[2023-10-09 07:25:37,618][60144] Updated weights for policy 1, policy_version 79912 (0.0008) +[2023-10-09 07:25:37,987][60144] Updated weights for policy 1, policy_version 79922 (0.0007) +[2023-10-09 07:25:38,351][60144] Updated weights for policy 1, policy_version 79932 (0.0010) +[2023-10-09 07:25:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 162791424. Throughput: 0: 1709.5, 1: 1707.1. Samples: 40707842. Policy #0 lag: (min: 19.0, avg: 26.5, max: 51.0) +[2023-10-09 07:25:41,053][59242] Avg episode reward: [(0, '37.190'), (1, '32.870')] +[2023-10-09 07:25:41,147][60143] Updated weights for policy 0, policy_version 79042 (0.0007) +[2023-10-09 07:25:41,510][60143] Updated weights for policy 0, policy_version 79052 (0.0007) +[2023-10-09 07:25:41,881][60143] Updated weights for policy 0, policy_version 79062 (0.0009) +[2023-10-09 07:25:42,250][60143] Updated weights for policy 0, policy_version 79072 (0.0008) +[2023-10-09 07:25:42,431][60144] Updated weights for policy 1, policy_version 79942 (0.0009) +[2023-10-09 07:25:42,792][60144] Updated weights for policy 1, policy_version 79952 (0.0009) +[2023-10-09 07:25:43,169][60144] Updated weights for policy 1, policy_version 79962 (0.0010) +[2023-10-09 07:25:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 162856960. Throughput: 0: 1703.3, 1: 1734.7. Samples: 40728962. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:25:46,053][59242] Avg episode reward: [(0, '36.980'), (1, '33.740')] +[2023-10-09 07:25:46,402][60143] Updated weights for policy 0, policy_version 79082 (0.0008) +[2023-10-09 07:25:46,784][60143] Updated weights for policy 0, policy_version 79092 (0.0007) +[2023-10-09 07:25:47,149][60143] Updated weights for policy 0, policy_version 79102 (0.0009) +[2023-10-09 07:25:47,160][60144] Updated weights for policy 1, policy_version 79972 (0.0009) +[2023-10-09 07:25:47,529][60144] Updated weights for policy 1, policy_version 79982 (0.0010) +[2023-10-09 07:25:47,896][60144] Updated weights for policy 1, policy_version 79992 (0.0008) +[2023-10-09 07:25:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 162922496. Throughput: 0: 1697.3, 1: 1702.6. Samples: 40738288. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:25:51,053][59242] Avg episode reward: [(0, '36.950'), (1, '33.880')] +[2023-10-09 07:25:51,190][60143] Updated weights for policy 0, policy_version 79112 (0.0009) +[2023-10-09 07:25:51,571][60143] Updated weights for policy 0, policy_version 79122 (0.0009) +[2023-10-09 07:25:51,775][60144] Updated weights for policy 1, policy_version 80002 (0.0008) +[2023-10-09 07:25:51,941][60143] Updated weights for policy 0, policy_version 79132 (0.0008) +[2023-10-09 07:25:52,146][60144] Updated weights for policy 1, policy_version 80012 (0.0007) +[2023-10-09 07:25:52,521][60144] Updated weights for policy 1, policy_version 80022 (0.0008) +[2023-10-09 07:25:52,888][60144] Updated weights for policy 1, policy_version 80032 (0.0009) +[2023-10-09 07:25:55,871][60143] Updated weights for policy 0, policy_version 79142 (0.0009) +[2023-10-09 07:25:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 162988032. Throughput: 0: 1704.9, 1: 1721.1. Samples: 40759438. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:25:56,053][59242] Avg episode reward: [(0, '36.920'), (1, '33.680')] +[2023-10-09 07:25:56,247][60143] Updated weights for policy 0, policy_version 79152 (0.0010) +[2023-10-09 07:25:56,614][60143] Updated weights for policy 0, policy_version 79162 (0.0007) +[2023-10-09 07:25:56,926][60144] Updated weights for policy 1, policy_version 80042 (0.0009) +[2023-10-09 07:25:57,299][60144] Updated weights for policy 1, policy_version 80052 (0.0009) +[2023-10-09 07:25:57,666][60144] Updated weights for policy 1, policy_version 80062 (0.0008) +[2023-10-09 07:26:00,624][60143] Updated weights for policy 0, policy_version 79172 (0.0009) +[2023-10-09 07:26:00,997][60143] Updated weights for policy 0, policy_version 79182 (0.0008) +[2023-10-09 07:26:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 163053568. Throughput: 0: 1704.3, 1: 1734.7. Samples: 40780508. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:01,053][59242] Avg episode reward: [(0, '34.890'), (1, '32.120')] +[2023-10-09 07:26:01,366][60143] Updated weights for policy 0, policy_version 79192 (0.0008) +[2023-10-09 07:26:01,474][60144] Updated weights for policy 1, policy_version 80072 (0.0008) +[2023-10-09 07:26:01,833][60144] Updated weights for policy 1, policy_version 80082 (0.0009) +[2023-10-09 07:26:02,212][60144] Updated weights for policy 1, policy_version 80092 (0.0008) +[2023-10-09 07:26:05,246][60143] Updated weights for policy 0, policy_version 79202 (0.0007) +[2023-10-09 07:26:05,612][60143] Updated weights for policy 0, policy_version 79212 (0.0007) +[2023-10-09 07:26:05,979][60143] Updated weights for policy 0, policy_version 79222 (0.0007) +[2023-10-09 07:26:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 163119104. Throughput: 0: 1707.1, 1: 1712.4. Samples: 40790028. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:06,053][59242] Avg episode reward: [(0, '33.980'), (1, '32.040')] +[2023-10-09 07:26:06,173][60144] Updated weights for policy 1, policy_version 80102 (0.0009) +[2023-10-09 07:26:06,345][60143] Updated weights for policy 0, policy_version 79232 (0.0008) +[2023-10-09 07:26:06,537][60144] Updated weights for policy 1, policy_version 80112 (0.0009) +[2023-10-09 07:26:06,898][60144] Updated weights for policy 1, policy_version 80122 (0.0011) +[2023-10-09 07:26:10,484][60143] Updated weights for policy 0, policy_version 79242 (0.0010) +[2023-10-09 07:26:10,845][60143] Updated weights for policy 0, policy_version 79252 (0.0008) +[2023-10-09 07:26:10,884][60144] Updated weights for policy 1, policy_version 80132 (0.0008) +[2023-10-09 07:26:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 163184640. Throughput: 0: 1705.4, 1: 1739.9. Samples: 40811138. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:11,052][59242] Avg episode reward: [(0, '31.830'), (1, '32.150')] +[2023-10-09 07:26:11,212][60143] Updated weights for policy 0, policy_version 79262 (0.0008) +[2023-10-09 07:26:11,246][60144] Updated weights for policy 1, policy_version 80142 (0.0008) +[2023-10-09 07:26:11,610][60144] Updated weights for policy 1, policy_version 80152 (0.0009) +[2023-10-09 07:26:15,177][60143] Updated weights for policy 0, policy_version 79272 (0.0007) +[2023-10-09 07:26:15,522][60144] Updated weights for policy 1, policy_version 80162 (0.0008) +[2023-10-09 07:26:15,545][60143] Updated weights for policy 0, policy_version 79282 (0.0009) +[2023-10-09 07:26:15,920][60143] Updated weights for policy 0, policy_version 79292 (0.0008) +[2023-10-09 07:26:15,951][60144] Updated weights for policy 1, policy_version 80172 (0.0009) +[2023-10-09 07:26:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 163250176. 
Throughput: 0: 1692.1, 1: 1732.6. Samples: 40831562. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:16,053][59242] Avg episode reward: [(0, '32.440'), (1, '31.190')] +[2023-10-09 07:26:16,306][60144] Updated weights for policy 1, policy_version 80182 (0.0010) +[2023-10-09 07:26:16,670][60144] Updated weights for policy 1, policy_version 80192 (0.0009) +[2023-10-09 07:26:20,136][60143] Updated weights for policy 0, policy_version 79302 (0.0007) +[2023-10-09 07:26:20,508][60143] Updated weights for policy 0, policy_version 79312 (0.0008) +[2023-10-09 07:26:20,553][60144] Updated weights for policy 1, policy_version 80202 (0.0008) +[2023-10-09 07:26:20,870][60143] Updated weights for policy 0, policy_version 79322 (0.0008) +[2023-10-09 07:26:20,908][60144] Updated weights for policy 1, policy_version 80212 (0.0007) +[2023-10-09 07:26:21,052][59242] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 163315712. Throughput: 0: 1709.4, 1: 1724.9. Samples: 40841726. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:21,054][59242] Avg episode reward: [(0, '32.180'), (1, '31.480')] +[2023-10-09 07:26:21,280][60144] Updated weights for policy 1, policy_version 80222 (0.0008) +[2023-10-09 07:26:24,988][60143] Updated weights for policy 0, policy_version 79332 (0.0008) +[2023-10-09 07:26:25,372][60143] Updated weights for policy 0, policy_version 79342 (0.0009) +[2023-10-09 07:26:25,402][60144] Updated weights for policy 1, policy_version 80232 (0.0008) +[2023-10-09 07:26:25,743][60143] Updated weights for policy 0, policy_version 79352 (0.0008) +[2023-10-09 07:26:25,762][60144] Updated weights for policy 1, policy_version 80242 (0.0008) +[2023-10-09 07:26:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 163414016. Throughput: 0: 1710.5, 1: 1734.1. Samples: 40862846. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:26,053][59242] Avg episode reward: [(0, '32.230'), (1, '32.120')] +[2023-10-09 07:26:26,138][60144] Updated weights for policy 1, policy_version 80252 (0.0008) +[2023-10-09 07:26:29,977][60143] Updated weights for policy 0, policy_version 79362 (0.0009) +[2023-10-09 07:26:30,344][60143] Updated weights for policy 0, policy_version 79372 (0.0010) +[2023-10-09 07:26:30,639][60144] Updated weights for policy 1, policy_version 80262 (0.0009) +[2023-10-09 07:26:30,708][60143] Updated weights for policy 0, policy_version 79382 (0.0008) +[2023-10-09 07:26:30,999][60144] Updated weights for policy 1, policy_version 80272 (0.0008) +[2023-10-09 07:26:31,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 163446784. Throughput: 0: 1677.8, 1: 1707.7. Samples: 40881308. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-09 07:26:31,053][59242] Avg episode reward: [(0, '32.810'), (1, '31.180')] +[2023-10-09 07:26:31,080][60143] Updated weights for policy 0, policy_version 79392 (0.0008) +[2023-10-09 07:26:31,362][60144] Updated weights for policy 1, policy_version 80282 (0.0009) +[2023-10-09 07:26:35,130][60143] Updated weights for policy 0, policy_version 79402 (0.0009) +[2023-10-09 07:26:35,495][60143] Updated weights for policy 0, policy_version 79412 (0.0010) +[2023-10-09 07:26:35,646][60144] Updated weights for policy 1, policy_version 80292 (0.0008) +[2023-10-09 07:26:35,867][60143] Updated weights for policy 0, policy_version 79422 (0.0009) +[2023-10-09 07:26:36,012][60144] Updated weights for policy 1, policy_version 80302 (0.0010) +[2023-10-09 07:26:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 163545088. Throughput: 0: 1687.8, 1: 1705.3. Samples: 40890976. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:26:36,053][59242] Avg episode reward: [(0, '32.950'), (1, '32.080')] +[2023-10-09 07:26:36,375][60144] Updated weights for policy 1, policy_version 80312 (0.0009) +[2023-10-09 07:26:40,165][60143] Updated weights for policy 0, policy_version 79432 (0.0011) +[2023-10-09 07:26:40,533][60143] Updated weights for policy 0, policy_version 79442 (0.0011) +[2023-10-09 07:26:40,561][60144] Updated weights for policy 1, policy_version 80322 (0.0008) +[2023-10-09 07:26:40,906][60143] Updated weights for policy 0, policy_version 79452 (0.0009) +[2023-10-09 07:26:40,933][60144] Updated weights for policy 1, policy_version 80332 (0.0009) +[2023-10-09 07:26:41,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 163610624. Throughput: 0: 1671.8, 1: 1686.6. Samples: 40910566. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:26:41,053][59242] Avg episode reward: [(0, '32.670'), (1, '31.120')] +[2023-10-09 07:26:41,297][60144] Updated weights for policy 1, policy_version 80342 (0.0008) +[2023-10-09 07:26:41,664][60144] Updated weights for policy 1, policy_version 80352 (0.0008) +[2023-10-09 07:26:45,351][60143] Updated weights for policy 0, policy_version 79462 (0.0010) +[2023-10-09 07:26:45,719][60143] Updated weights for policy 0, policy_version 79472 (0.0010) +[2023-10-09 07:26:46,052][59242] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 163643392. Throughput: 0: 1643.0, 1: 1660.3. Samples: 40929156. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:26:46,053][59242] Avg episode reward: [(0, '34.810'), (1, '30.960')] +[2023-10-09 07:26:46,093][60143] Updated weights for policy 0, policy_version 79482 (0.0008) +[2023-10-09 07:26:46,183][60144] Updated weights for policy 1, policy_version 80362 (0.0007) +[2023-10-09 07:26:46,550][60144] Updated weights for policy 1, policy_version 80372 (0.0008) +[2023-10-09 07:26:46,915][60144] Updated weights for policy 1, policy_version 80382 (0.0008) +[2023-10-09 07:26:50,255][60143] Updated weights for policy 0, policy_version 79492 (0.0008) +[2023-10-09 07:26:50,618][60143] Updated weights for policy 0, policy_version 79502 (0.0011) +[2023-10-09 07:26:50,991][60143] Updated weights for policy 0, policy_version 79512 (0.0010) +[2023-10-09 07:26:51,052][59242] Fps is (10 sec: 9830.4, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 163708928. Throughput: 0: 1642.8, 1: 1649.4. Samples: 40938178. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:26:51,052][59242] Avg episode reward: [(0, '35.410'), (1, '31.550')] +[2023-10-09 07:26:51,233][60144] Updated weights for policy 1, policy_version 80392 (0.0009) +[2023-10-09 07:26:51,602][60144] Updated weights for policy 1, policy_version 80402 (0.0008) +[2023-10-09 07:26:51,959][60144] Updated weights for policy 1, policy_version 80412 (0.0010) +[2023-10-09 07:26:55,309][60143] Updated weights for policy 0, policy_version 79522 (0.0010) +[2023-10-09 07:26:55,671][60143] Updated weights for policy 0, policy_version 79532 (0.0010) +[2023-10-09 07:26:56,041][60143] Updated weights for policy 0, policy_version 79542 (0.0008) +[2023-10-09 07:26:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 163774464. Throughput: 0: 1631.7, 1: 1637.2. Samples: 40958242. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:26:56,052][59242] Avg episode reward: [(0, '33.620'), (1, '31.990')] +[2023-10-09 07:26:56,069][60144] Updated weights for policy 1, policy_version 80422 (0.0008) +[2023-10-09 07:26:56,408][60143] Updated weights for policy 0, policy_version 79552 (0.0008) +[2023-10-09 07:26:56,439][60144] Updated weights for policy 1, policy_version 80432 (0.0008) +[2023-10-09 07:26:56,805][60144] Updated weights for policy 1, policy_version 80442 (0.0008) +[2023-10-09 07:27:00,615][60143] Updated weights for policy 0, policy_version 79562 (0.0010) +[2023-10-09 07:27:00,986][60144] Updated weights for policy 1, policy_version 80452 (0.0007) +[2023-10-09 07:27:00,991][60143] Updated weights for policy 0, policy_version 79572 (0.0009) +[2023-10-09 07:27:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163840000. Throughput: 0: 1623.4, 1: 1629.6. Samples: 40977946. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:27:01,053][59242] Avg episode reward: [(0, '33.500'), (1, '32.560')] +[2023-10-09 07:27:01,350][60143] Updated weights for policy 0, policy_version 79582 (0.0008) +[2023-10-09 07:27:01,379][60144] Updated weights for policy 1, policy_version 80462 (0.0010) +[2023-10-09 07:27:01,750][60144] Updated weights for policy 1, policy_version 80472 (0.0008) +[2023-10-09 07:27:05,420][60143] Updated weights for policy 0, policy_version 79592 (0.0008) +[2023-10-09 07:27:05,724][60144] Updated weights for policy 1, policy_version 80482 (0.0007) +[2023-10-09 07:27:05,779][60143] Updated weights for policy 0, policy_version 79602 (0.0009) +[2023-10-09 07:27:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163905536. Throughput: 0: 1610.1, 1: 1620.9. Samples: 40987118. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:27:06,052][59242] Avg episode reward: [(0, '32.450'), (1, '31.270')] +[2023-10-09 07:27:06,091][60144] Updated weights for policy 1, policy_version 80492 (0.0007) +[2023-10-09 07:27:06,142][60143] Updated weights for policy 0, policy_version 79612 (0.0007) +[2023-10-09 07:27:06,458][60144] Updated weights for policy 1, policy_version 80502 (0.0009) +[2023-10-09 07:27:06,810][60144] Updated weights for policy 1, policy_version 80512 (0.0007) +[2023-10-09 07:27:10,108][60143] Updated weights for policy 0, policy_version 79622 (0.0008) +[2023-10-09 07:27:10,488][60143] Updated weights for policy 0, policy_version 79632 (0.0010) +[2023-10-09 07:27:10,856][60143] Updated weights for policy 0, policy_version 79642 (0.0009) +[2023-10-09 07:27:10,953][60144] Updated weights for policy 1, policy_version 80522 (0.0009) +[2023-10-09 07:27:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163971072. Throughput: 0: 1613.4, 1: 1622.8. Samples: 41008476. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:27:11,053][59242] Avg episode reward: [(0, '30.720'), (1, '30.670')] +[2023-10-09 07:27:11,332][60144] Updated weights for policy 1, policy_version 80532 (0.0010) +[2023-10-09 07:27:11,691][60144] Updated weights for policy 1, policy_version 80542 (0.0009) +[2023-10-09 07:27:14,890][60143] Updated weights for policy 0, policy_version 79652 (0.0009) +[2023-10-09 07:27:15,254][60143] Updated weights for policy 0, policy_version 79662 (0.0007) +[2023-10-09 07:27:15,434][60144] Updated weights for policy 1, policy_version 80552 (0.0008) +[2023-10-09 07:27:15,629][60143] Updated weights for policy 0, policy_version 79672 (0.0008) +[2023-10-09 07:27:15,806][60144] Updated weights for policy 1, policy_version 80562 (0.0008) +[2023-10-09 07:27:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164069376. 
Throughput: 0: 1629.5, 1: 1642.4. Samples: 41028544. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) +[2023-10-09 07:27:16,053][59242] Avg episode reward: [(0, '30.810'), (1, '32.120')] +[2023-10-09 07:27:16,173][60144] Updated weights for policy 1, policy_version 80572 (0.0008) +[2023-10-09 07:27:19,613][60143] Updated weights for policy 0, policy_version 79682 (0.0007) +[2023-10-09 07:27:19,989][60143] Updated weights for policy 0, policy_version 79692 (0.0009) +[2023-10-09 07:27:20,024][60144] Updated weights for policy 1, policy_version 80582 (0.0007) +[2023-10-09 07:27:20,362][60143] Updated weights for policy 0, policy_version 79702 (0.0008) +[2023-10-09 07:27:20,398][60144] Updated weights for policy 1, policy_version 80592 (0.0007) +[2023-10-09 07:27:20,731][60143] Updated weights for policy 0, policy_version 79712 (0.0009) +[2023-10-09 07:27:20,759][60144] Updated weights for policy 1, policy_version 80602 (0.0009) +[2023-10-09 07:27:21,052][59242] Fps is (10 sec: 19660.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 164167680. Throughput: 0: 1638.7, 1: 1655.0. Samples: 41039190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:21,053][59242] Avg episode reward: [(0, '31.300'), (1, '32.910')] +[2023-10-09 07:27:24,710][60144] Updated weights for policy 1, policy_version 80612 (0.0009) +[2023-10-09 07:27:24,754][60143] Updated weights for policy 0, policy_version 79722 (0.0008) +[2023-10-09 07:27:25,086][60144] Updated weights for policy 1, policy_version 80622 (0.0008) +[2023-10-09 07:27:25,125][60143] Updated weights for policy 0, policy_version 79732 (0.0009) +[2023-10-09 07:27:25,450][60144] Updated weights for policy 1, policy_version 80632 (0.0010) +[2023-10-09 07:27:25,492][60143] Updated weights for policy 0, policy_version 79742 (0.0009) +[2023-10-09 07:27:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 164233216. Throughput: 0: 1648.0, 1: 1675.3. Samples: 41060114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:26,052][59242] Avg episode reward: [(0, '31.290'), (1, '33.680')] +[2023-10-09 07:27:29,456][60144] Updated weights for policy 1, policy_version 80642 (0.0009) +[2023-10-09 07:27:29,641][60143] Updated weights for policy 0, policy_version 79752 (0.0007) +[2023-10-09 07:27:29,816][60144] Updated weights for policy 1, policy_version 80652 (0.0008) +[2023-10-09 07:27:30,012][60143] Updated weights for policy 0, policy_version 79762 (0.0008) +[2023-10-09 07:27:30,178][60144] Updated weights for policy 1, policy_version 80662 (0.0008) +[2023-10-09 07:27:30,380][60143] Updated weights for policy 0, policy_version 79772 (0.0008) +[2023-10-09 07:27:30,541][60144] Updated weights for policy 1, policy_version 80672 (0.0010) +[2023-10-09 07:27:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 164298752. Throughput: 0: 1650.6, 1: 1671.6. Samples: 41078656. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:31,053][59242] Avg episode reward: [(0, '33.200'), (1, '33.450')] +[2023-10-09 07:27:31,065][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000080672_82608128.pth... +[2023-10-09 07:27:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000079776_81690624.pth... 
+[2023-10-09 07:27:31,095][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000079072_80969728.pth +[2023-10-09 07:27:31,095][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000078176_80052224.pth +[2023-10-09 07:27:31,099][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000080672_82608128.pth +[2023-10-09 07:27:31,099][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000079776_81690624.pth +[2023-10-09 07:27:34,457][60143] Updated weights for policy 0, policy_version 79782 (0.0007) +[2023-10-09 07:27:34,528][60144] Updated weights for policy 1, policy_version 80682 (0.0007) +[2023-10-09 07:27:34,821][60143] Updated weights for policy 0, policy_version 79792 (0.0007) +[2023-10-09 07:27:34,897][60144] Updated weights for policy 1, policy_version 80692 (0.0008) +[2023-10-09 07:27:35,183][60143] Updated weights for policy 0, policy_version 79802 (0.0007) +[2023-10-09 07:27:35,259][60144] Updated weights for policy 1, policy_version 80702 (0.0009) +[2023-10-09 07:27:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 164364288. Throughput: 0: 1676.0, 1: 1704.5. Samples: 41090302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:36,053][59242] Avg episode reward: [(0, '32.380'), (1, '33.200')] +[2023-10-09 07:27:39,305][60144] Updated weights for policy 1, policy_version 80712 (0.0009) +[2023-10-09 07:27:39,334][60143] Updated weights for policy 0, policy_version 79812 (0.0007) +[2023-10-09 07:27:39,668][60144] Updated weights for policy 1, policy_version 80722 (0.0007) +[2023-10-09 07:27:39,702][60143] Updated weights for policy 0, policy_version 79822 (0.0007) +[2023-10-09 07:27:40,027][60144] Updated weights for policy 1, policy_version 80732 (0.0007) +[2023-10-09 07:27:40,060][60143] Updated weights for policy 0, policy_version 79832 (0.0008) +[2023-10-09 07:27:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 164429824. Throughput: 0: 1679.8, 1: 1704.0. Samples: 41110514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:41,053][59242] Avg episode reward: [(0, '33.370'), (1, '34.040')] +[2023-10-09 07:27:43,962][60143] Updated weights for policy 0, policy_version 79842 (0.0008) +[2023-10-09 07:27:44,079][60144] Updated weights for policy 1, policy_version 80742 (0.0008) +[2023-10-09 07:27:44,320][60143] Updated weights for policy 0, policy_version 79852 (0.0009) +[2023-10-09 07:27:44,452][60144] Updated weights for policy 1, policy_version 80752 (0.0009) +[2023-10-09 07:27:44,689][60143] Updated weights for policy 0, policy_version 79862 (0.0007) +[2023-10-09 07:27:44,817][60144] Updated weights for policy 1, policy_version 80762 (0.0009) +[2023-10-09 07:27:45,062][60143] Updated weights for policy 0, policy_version 79872 (0.0008) +[2023-10-09 07:27:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 164495360. Throughput: 0: 1677.5, 1: 1700.0. Samples: 41129934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:46,053][59242] Avg episode reward: [(0, '33.890'), (1, '34.440')] +[2023-10-09 07:27:48,915][60144] Updated weights for policy 1, policy_version 80772 (0.0008) +[2023-10-09 07:27:49,068][60143] Updated weights for policy 0, policy_version 79882 (0.0009) +[2023-10-09 07:27:49,311][60144] Updated weights for policy 1, policy_version 80782 (0.0010) +[2023-10-09 07:27:49,429][60143] Updated weights for policy 0, policy_version 79892 (0.0008) +[2023-10-09 07:27:49,678][60144] Updated weights for policy 1, policy_version 80792 (0.0008) +[2023-10-09 07:27:49,796][60143] Updated weights for policy 0, policy_version 79902 (0.0007) +[2023-10-09 07:27:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 164560896. Throughput: 0: 1705.5, 1: 1731.7. Samples: 41141794. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:51,053][59242] Avg episode reward: [(0, '35.330'), (1, '34.640')] +[2023-10-09 07:27:53,481][60144] Updated weights for policy 1, policy_version 80802 (0.0008) +[2023-10-09 07:27:53,800][60143] Updated weights for policy 0, policy_version 79912 (0.0008) +[2023-10-09 07:27:53,854][60144] Updated weights for policy 1, policy_version 80812 (0.0008) +[2023-10-09 07:27:54,168][60143] Updated weights for policy 0, policy_version 79922 (0.0007) +[2023-10-09 07:27:54,218][60144] Updated weights for policy 1, policy_version 80822 (0.0010) +[2023-10-09 07:27:54,536][60143] Updated weights for policy 0, policy_version 79932 (0.0008) +[2023-10-09 07:27:54,579][60144] Updated weights for policy 1, policy_version 80832 (0.0009) +[2023-10-09 07:27:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 164626432. Throughput: 0: 1679.2, 1: 1703.1. Samples: 41160682. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:27:56,053][59242] Avg episode reward: [(0, '35.160'), (1, '34.190')] +[2023-10-09 07:27:58,435][60144] Updated weights for policy 1, policy_version 80842 (0.0007) +[2023-10-09 07:27:58,675][60143] Updated weights for policy 0, policy_version 79942 (0.0009) +[2023-10-09 07:27:58,798][60144] Updated weights for policy 1, policy_version 80852 (0.0007) +[2023-10-09 07:27:59,052][60143] Updated weights for policy 0, policy_version 79952 (0.0008) +[2023-10-09 07:27:59,171][60144] Updated weights for policy 1, policy_version 80862 (0.0008) +[2023-10-09 07:27:59,425][60143] Updated weights for policy 0, policy_version 79962 (0.0008) +[2023-10-09 07:28:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 164691968. Throughput: 0: 1688.3, 1: 1707.7. Samples: 41181362. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:28:01,053][59242] Avg episode reward: [(0, '35.090'), (1, '36.190')] +[2023-10-09 07:28:03,021][60144] Updated weights for policy 1, policy_version 80872 (0.0008) +[2023-10-09 07:28:03,267][60143] Updated weights for policy 0, policy_version 79972 (0.0009) +[2023-10-09 07:28:03,395][60144] Updated weights for policy 1, policy_version 80882 (0.0007) +[2023-10-09 07:28:03,637][60143] Updated weights for policy 0, policy_version 79982 (0.0008) +[2023-10-09 07:28:03,762][60144] Updated weights for policy 1, policy_version 80892 (0.0008) +[2023-10-09 07:28:04,008][60143] Updated weights for policy 0, policy_version 79992 (0.0007) +[2023-10-09 07:28:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 164757504. Throughput: 0: 1691.9, 1: 1707.2. Samples: 41192148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:28:06,053][59242] Avg episode reward: [(0, '34.280'), (1, '35.780')] +[2023-10-09 07:28:07,787][60144] Updated weights for policy 1, policy_version 80902 (0.0008) +[2023-10-09 07:28:08,045][60143] Updated weights for policy 0, policy_version 80002 (0.0009) +[2023-10-09 07:28:08,142][60144] Updated weights for policy 1, policy_version 80912 (0.0007) +[2023-10-09 07:28:08,411][60143] Updated weights for policy 0, policy_version 80012 (0.0008) +[2023-10-09 07:28:08,513][60144] Updated weights for policy 1, policy_version 80922 (0.0008) +[2023-10-09 07:28:08,779][60143] Updated weights for policy 0, policy_version 80022 (0.0008) +[2023-10-09 07:28:09,150][60143] Updated weights for policy 0, policy_version 80032 (0.0009) +[2023-10-09 07:28:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 164823040. Throughput: 0: 1681.0, 1: 1693.6. Samples: 41211968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:28:11,052][59242] Avg episode reward: [(0, '34.100'), (1, '34.390')] +[2023-10-09 07:28:12,375][60144] Updated weights for policy 1, policy_version 80932 (0.0007) +[2023-10-09 07:28:12,747][60144] Updated weights for policy 1, policy_version 80942 (0.0008) +[2023-10-09 07:28:13,110][60144] Updated weights for policy 1, policy_version 80952 (0.0009) +[2023-10-09 07:28:13,225][60143] Updated weights for policy 0, policy_version 80042 (0.0007) +[2023-10-09 07:28:13,594][60143] Updated weights for policy 0, policy_version 80052 (0.0008) +[2023-10-09 07:28:13,966][60143] Updated weights for policy 0, policy_version 80062 (0.0009) +[2023-10-09 07:28:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164888576. Throughput: 0: 1710.9, 1: 1721.7. Samples: 41233126. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:16,052][59242] Avg episode reward: [(0, '31.690'), (1, '33.590')] +[2023-10-09 07:28:17,111][60144] Updated weights for policy 1, policy_version 80962 (0.0007) +[2023-10-09 07:28:17,479][60144] Updated weights for policy 1, policy_version 80972 (0.0007) +[2023-10-09 07:28:17,851][60144] Updated weights for policy 1, policy_version 80982 (0.0007) +[2023-10-09 07:28:17,996][60143] Updated weights for policy 0, policy_version 80072 (0.0007) +[2023-10-09 07:28:18,211][60144] Updated weights for policy 1, policy_version 80992 (0.0007) +[2023-10-09 07:28:18,362][60143] Updated weights for policy 0, policy_version 80082 (0.0007) +[2023-10-09 07:28:18,741][60143] Updated weights for policy 0, policy_version 80092 (0.0008) +[2023-10-09 07:28:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 164954112. Throughput: 0: 1695.2, 1: 1697.5. Samples: 41242974. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:21,053][59242] Avg episode reward: [(0, '32.150'), (1, '33.230')] +[2023-10-09 07:28:22,213][60144] Updated weights for policy 1, policy_version 81002 (0.0007) +[2023-10-09 07:28:22,575][60144] Updated weights for policy 1, policy_version 81012 (0.0009) +[2023-10-09 07:28:22,694][60143] Updated weights for policy 0, policy_version 80102 (0.0008) +[2023-10-09 07:28:22,942][60144] Updated weights for policy 1, policy_version 81022 (0.0008) +[2023-10-09 07:28:23,053][60143] Updated weights for policy 0, policy_version 80112 (0.0010) +[2023-10-09 07:28:23,432][60143] Updated weights for policy 0, policy_version 80122 (0.0009) +[2023-10-09 07:28:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 165019648. Throughput: 0: 1691.5, 1: 1713.8. Samples: 41263754. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:26,053][59242] Avg episode reward: [(0, '32.340'), (1, '32.710')] +[2023-10-09 07:28:26,810][60144] Updated weights for policy 1, policy_version 81032 (0.0007) +[2023-10-09 07:28:27,175][60144] Updated weights for policy 1, policy_version 81042 (0.0007) +[2023-10-09 07:28:27,509][60143] Updated weights for policy 0, policy_version 80132 (0.0008) +[2023-10-09 07:28:27,551][60144] Updated weights for policy 1, policy_version 81052 (0.0008) +[2023-10-09 07:28:27,884][60143] Updated weights for policy 0, policy_version 80142 (0.0008) +[2023-10-09 07:28:28,255][60143] Updated weights for policy 0, policy_version 80152 (0.0009) +[2023-10-09 07:28:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 165085184. Throughput: 0: 1708.0, 1: 1731.1. Samples: 41284696. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:31,053][59242] Avg episode reward: [(0, '32.660'), (1, '31.560')] +[2023-10-09 07:28:31,567][60144] Updated weights for policy 1, policy_version 81062 (0.0008) +[2023-10-09 07:28:31,933][60144] Updated weights for policy 1, policy_version 81072 (0.0009) +[2023-10-09 07:28:32,303][60143] Updated weights for policy 0, policy_version 80162 (0.0008) +[2023-10-09 07:28:32,304][60144] Updated weights for policy 1, policy_version 81082 (0.0008) +[2023-10-09 07:28:32,672][60143] Updated weights for policy 0, policy_version 80172 (0.0007) +[2023-10-09 07:28:33,040][60143] Updated weights for policy 0, policy_version 80182 (0.0007) +[2023-10-09 07:28:33,408][60143] Updated weights for policy 0, policy_version 80192 (0.0009) +[2023-10-09 07:28:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 165150720. Throughput: 0: 1677.0, 1: 1704.5. Samples: 41293962. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:36,053][59242] Avg episode reward: [(0, '31.210'), (1, '33.020')] +[2023-10-09 07:28:36,336][60144] Updated weights for policy 1, policy_version 81092 (0.0009) +[2023-10-09 07:28:36,720][60144] Updated weights for policy 1, policy_version 81102 (0.0009) +[2023-10-09 07:28:37,087][60144] Updated weights for policy 1, policy_version 81112 (0.0007) +[2023-10-09 07:28:37,489][60143] Updated weights for policy 0, policy_version 80202 (0.0009) +[2023-10-09 07:28:37,857][60143] Updated weights for policy 0, policy_version 80212 (0.0008) +[2023-10-09 07:28:38,231][60143] Updated weights for policy 0, policy_version 80222 (0.0010) +[2023-10-09 07:28:40,861][60144] Updated weights for policy 1, policy_version 81122 (0.0007) +[2023-10-09 07:28:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 165216256. Throughput: 0: 1698.0, 1: 1736.2. Samples: 41315222. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:41,053][59242] Avg episode reward: [(0, '32.880'), (1, '31.740')] +[2023-10-09 07:28:41,227][60144] Updated weights for policy 1, policy_version 81132 (0.0008) +[2023-10-09 07:28:41,591][60144] Updated weights for policy 1, policy_version 81142 (0.0007) +[2023-10-09 07:28:41,962][60144] Updated weights for policy 1, policy_version 81152 (0.0009) +[2023-10-09 07:28:42,154][60143] Updated weights for policy 0, policy_version 80232 (0.0009) +[2023-10-09 07:28:42,518][60143] Updated weights for policy 0, policy_version 80242 (0.0009) +[2023-10-09 07:28:42,886][60143] Updated weights for policy 0, policy_version 80252 (0.0008) +[2023-10-09 07:28:45,846][60144] Updated weights for policy 1, policy_version 81162 (0.0009) +[2023-10-09 07:28:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 165281792. Throughput: 0: 1708.7, 1: 1735.2. Samples: 41336338. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:46,052][59242] Avg episode reward: [(0, '33.770'), (1, '31.870')] +[2023-10-09 07:28:46,216][60144] Updated weights for policy 1, policy_version 81172 (0.0008) +[2023-10-09 07:28:46,586][60144] Updated weights for policy 1, policy_version 81182 (0.0009) +[2023-10-09 07:28:47,003][60143] Updated weights for policy 0, policy_version 80262 (0.0008) +[2023-10-09 07:28:47,394][60143] Updated weights for policy 0, policy_version 80272 (0.0008) +[2023-10-09 07:28:47,776][60143] Updated weights for policy 0, policy_version 80282 (0.0011) +[2023-10-09 07:28:50,491][60144] Updated weights for policy 1, policy_version 81192 (0.0010) +[2023-10-09 07:28:50,862][60144] Updated weights for policy 1, policy_version 81202 (0.0010) +[2023-10-09 07:28:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 165347328. Throughput: 0: 1681.6, 1: 1726.5. Samples: 41345512. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:51,053][59242] Avg episode reward: [(0, '35.880'), (1, '32.300')] +[2023-10-09 07:28:51,224][60144] Updated weights for policy 1, policy_version 81212 (0.0008) +[2023-10-09 07:28:51,904][60143] Updated weights for policy 0, policy_version 80292 (0.0011) +[2023-10-09 07:28:52,277][60143] Updated weights for policy 0, policy_version 80302 (0.0009) +[2023-10-09 07:28:52,634][60143] Updated weights for policy 0, policy_version 80312 (0.0008) +[2023-10-09 07:28:55,191][60144] Updated weights for policy 1, policy_version 81222 (0.0009) +[2023-10-09 07:28:55,547][60144] Updated weights for policy 1, policy_version 81232 (0.0010) +[2023-10-09 07:28:55,917][60144] Updated weights for policy 1, policy_version 81242 (0.0009) +[2023-10-09 07:28:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 165412864. Throughput: 0: 1703.0, 1: 1743.1. Samples: 41367040. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:28:56,053][59242] Avg episode reward: [(0, '35.050'), (1, '31.490')] +[2023-10-09 07:28:56,523][60143] Updated weights for policy 0, policy_version 80322 (0.0008) +[2023-10-09 07:28:56,891][60143] Updated weights for policy 0, policy_version 80332 (0.0008) +[2023-10-09 07:28:57,266][60143] Updated weights for policy 0, policy_version 80342 (0.0007) +[2023-10-09 07:28:57,636][60143] Updated weights for policy 0, policy_version 80352 (0.0010) +[2023-10-09 07:29:00,000][60144] Updated weights for policy 1, policy_version 81252 (0.0009) +[2023-10-09 07:29:00,366][60144] Updated weights for policy 1, policy_version 81262 (0.0009) +[2023-10-09 07:29:00,735][60144] Updated weights for policy 1, policy_version 81272 (0.0010) +[2023-10-09 07:29:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165511168. Throughput: 0: 1699.1, 1: 1727.1. Samples: 41387308. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:29:01,053][59242] Avg episode reward: [(0, '34.410'), (1, '31.750')] +[2023-10-09 07:29:01,687][60143] Updated weights for policy 0, policy_version 80362 (0.0008) +[2023-10-09 07:29:02,061][60143] Updated weights for policy 0, policy_version 80372 (0.0008) +[2023-10-09 07:29:02,428][60143] Updated weights for policy 0, policy_version 80382 (0.0007) +[2023-10-09 07:29:04,602][60144] Updated weights for policy 1, policy_version 81282 (0.0008) +[2023-10-09 07:29:04,971][60144] Updated weights for policy 1, policy_version 81292 (0.0009) +[2023-10-09 07:29:05,332][60144] Updated weights for policy 1, policy_version 81302 (0.0010) +[2023-10-09 07:29:05,692][60144] Updated weights for policy 1, policy_version 81312 (0.0009) +[2023-10-09 07:29:06,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165576704. Throughput: 0: 1686.6, 1: 1745.7. Samples: 41397424. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-09 07:29:06,053][59242] Avg episode reward: [(0, '33.650'), (1, '31.900')] +[2023-10-09 07:29:06,420][60143] Updated weights for policy 0, policy_version 80392 (0.0010) +[2023-10-09 07:29:06,788][60143] Updated weights for policy 0, policy_version 80402 (0.0007) +[2023-10-09 07:29:07,157][60143] Updated weights for policy 0, policy_version 80412 (0.0008) +[2023-10-09 07:29:09,867][60144] Updated weights for policy 1, policy_version 81322 (0.0007) +[2023-10-09 07:29:10,222][60144] Updated weights for policy 1, policy_version 81332 (0.0008) +[2023-10-09 07:29:10,588][60144] Updated weights for policy 1, policy_version 81342 (0.0007) +[2023-10-09 07:29:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165642240. Throughput: 0: 1696.0, 1: 1741.8. Samples: 41418454. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:11,053][59242] Avg episode reward: [(0, '35.270'), (1, '31.750')] +[2023-10-09 07:29:11,115][60143] Updated weights for policy 0, policy_version 80422 (0.0009) +[2023-10-09 07:29:11,478][60143] Updated weights for policy 0, policy_version 80432 (0.0009) +[2023-10-09 07:29:11,845][60143] Updated weights for policy 0, policy_version 80442 (0.0011) +[2023-10-09 07:29:14,406][60144] Updated weights for policy 1, policy_version 81352 (0.0007) +[2023-10-09 07:29:14,780][60144] Updated weights for policy 1, policy_version 81362 (0.0007) +[2023-10-09 07:29:15,151][60144] Updated weights for policy 1, policy_version 81372 (0.0008) +[2023-10-09 07:29:15,913][60143] Updated weights for policy 0, policy_version 80452 (0.0009) +[2023-10-09 07:29:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165707776. Throughput: 0: 1702.4, 1: 1717.8. Samples: 41438608. 
Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:16,052][59242] Avg episode reward: [(0, '34.720'), (1, '31.510')] +[2023-10-09 07:29:16,277][60143] Updated weights for policy 0, policy_version 80462 (0.0007) +[2023-10-09 07:29:16,643][60143] Updated weights for policy 0, policy_version 80472 (0.0007) +[2023-10-09 07:29:18,930][60144] Updated weights for policy 1, policy_version 81382 (0.0008) +[2023-10-09 07:29:19,298][60144] Updated weights for policy 1, policy_version 81392 (0.0007) +[2023-10-09 07:29:19,657][60144] Updated weights for policy 1, policy_version 81402 (0.0010) +[2023-10-09 07:29:20,522][60143] Updated weights for policy 0, policy_version 80482 (0.0010) +[2023-10-09 07:29:20,888][60143] Updated weights for policy 0, policy_version 80492 (0.0009) +[2023-10-09 07:29:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 165773312. Throughput: 0: 1702.4, 1: 1749.9. Samples: 41449314. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:21,052][59242] Avg episode reward: [(0, '33.950'), (1, '32.240')] +[2023-10-09 07:29:21,261][60143] Updated weights for policy 0, policy_version 80502 (0.0008) +[2023-10-09 07:29:21,623][60143] Updated weights for policy 0, policy_version 80512 (0.0008) +[2023-10-09 07:29:23,492][60144] Updated weights for policy 1, policy_version 81412 (0.0007) +[2023-10-09 07:29:23,891][60144] Updated weights for policy 1, policy_version 81422 (0.0008) +[2023-10-09 07:29:24,255][60144] Updated weights for policy 1, policy_version 81432 (0.0009) +[2023-10-09 07:29:25,622][60143] Updated weights for policy 0, policy_version 80522 (0.0008) +[2023-10-09 07:29:25,990][60143] Updated weights for policy 0, policy_version 80532 (0.0007) +[2023-10-09 07:29:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 165838848. Throughput: 0: 1707.0, 1: 1716.8. Samples: 41469292. 
Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:26,052][59242] Avg episode reward: [(0, '34.210'), (1, '30.970')] +[2023-10-09 07:29:26,363][60143] Updated weights for policy 0, policy_version 80542 (0.0008) +[2023-10-09 07:29:28,265][60144] Updated weights for policy 1, policy_version 81442 (0.0007) +[2023-10-09 07:29:28,627][60144] Updated weights for policy 1, policy_version 81452 (0.0007) +[2023-10-09 07:29:28,989][60144] Updated weights for policy 1, policy_version 81462 (0.0007) +[2023-10-09 07:29:29,355][60144] Updated weights for policy 1, policy_version 81472 (0.0009) +[2023-10-09 07:29:30,272][60143] Updated weights for policy 0, policy_version 80552 (0.0009) +[2023-10-09 07:29:30,643][60143] Updated weights for policy 0, policy_version 80562 (0.0008) +[2023-10-09 07:29:30,999][60143] Updated weights for policy 0, policy_version 80572 (0.0010) +[2023-10-09 07:29:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165904384. Throughput: 0: 1695.2, 1: 1716.4. Samples: 41489862. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:31,053][59242] Avg episode reward: [(0, '34.110'), (1, '30.390')] +[2023-10-09 07:29:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000081472_83427328.pth... +[2023-10-09 07:29:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000079872_81788928.pth +[2023-10-09 07:29:31,145][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000080576_82509824.pth... 
+[2023-10-09 07:29:31,175][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000078976_80871424.pth +[2023-10-09 07:29:33,266][60144] Updated weights for policy 1, policy_version 81482 (0.0010) +[2023-10-09 07:29:33,635][60144] Updated weights for policy 1, policy_version 81492 (0.0009) +[2023-10-09 07:29:34,001][60144] Updated weights for policy 1, policy_version 81502 (0.0009) +[2023-10-09 07:29:35,070][60143] Updated weights for policy 0, policy_version 80582 (0.0008) +[2023-10-09 07:29:35,455][60143] Updated weights for policy 0, policy_version 80592 (0.0008) +[2023-10-09 07:29:35,836][60143] Updated weights for policy 0, policy_version 80602 (0.0009) +[2023-10-09 07:29:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 166002688. Throughput: 0: 1716.1, 1: 1725.3. Samples: 41500374. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:36,052][59242] Avg episode reward: [(0, '35.370'), (1, '29.690')] +[2023-10-09 07:29:38,032][60144] Updated weights for policy 1, policy_version 81512 (0.0008) +[2023-10-09 07:29:38,399][60144] Updated weights for policy 1, policy_version 81522 (0.0007) +[2023-10-09 07:29:38,776][60144] Updated weights for policy 1, policy_version 81532 (0.0007) +[2023-10-09 07:29:40,071][60143] Updated weights for policy 0, policy_version 80612 (0.0008) +[2023-10-09 07:29:40,441][60143] Updated weights for policy 0, policy_version 80622 (0.0009) +[2023-10-09 07:29:40,812][60143] Updated weights for policy 0, policy_version 80632 (0.0011) +[2023-10-09 07:29:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 166035456. Throughput: 0: 1704.1, 1: 1712.2. Samples: 41520774. 
Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:41,053][59242] Avg episode reward: [(0, '36.010'), (1, '32.750')] +[2023-10-09 07:29:42,767][60144] Updated weights for policy 1, policy_version 81542 (0.0007) +[2023-10-09 07:29:43,128][60144] Updated weights for policy 1, policy_version 81552 (0.0008) +[2023-10-09 07:29:43,493][60144] Updated weights for policy 1, policy_version 81562 (0.0011) +[2023-10-09 07:29:44,786][60143] Updated weights for policy 0, policy_version 80642 (0.0008) +[2023-10-09 07:29:45,156][60143] Updated weights for policy 0, policy_version 80652 (0.0007) +[2023-10-09 07:29:45,528][60143] Updated weights for policy 0, policy_version 80662 (0.0010) +[2023-10-09 07:29:45,900][60143] Updated weights for policy 0, policy_version 80672 (0.0008) +[2023-10-09 07:29:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 166133760. Throughput: 0: 1690.4, 1: 1730.3. Samples: 41541236. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:46,053][59242] Avg episode reward: [(0, '35.390'), (1, '32.710')] +[2023-10-09 07:29:47,314][60144] Updated weights for policy 1, policy_version 81572 (0.0009) +[2023-10-09 07:29:47,682][60144] Updated weights for policy 1, policy_version 81582 (0.0008) +[2023-10-09 07:29:48,046][60144] Updated weights for policy 1, policy_version 81592 (0.0009) +[2023-10-09 07:29:49,928][60143] Updated weights for policy 0, policy_version 80682 (0.0008) +[2023-10-09 07:29:50,300][60143] Updated weights for policy 0, policy_version 80692 (0.0008) +[2023-10-09 07:29:50,673][60143] Updated weights for policy 0, policy_version 80702 (0.0009) +[2023-10-09 07:29:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 166199296. Throughput: 0: 1711.6, 1: 1710.9. Samples: 41551438. 
Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:51,053][59242] Avg episode reward: [(0, '35.010'), (1, '33.020')] +[2023-10-09 07:29:52,101][60144] Updated weights for policy 1, policy_version 81602 (0.0011) +[2023-10-09 07:29:52,466][60144] Updated weights for policy 1, policy_version 81612 (0.0008) +[2023-10-09 07:29:52,846][60144] Updated weights for policy 1, policy_version 81622 (0.0008) +[2023-10-09 07:29:53,208][60144] Updated weights for policy 1, policy_version 81632 (0.0010) +[2023-10-09 07:29:54,661][60143] Updated weights for policy 0, policy_version 80712 (0.0009) +[2023-10-09 07:29:55,037][60143] Updated weights for policy 0, policy_version 80722 (0.0008) +[2023-10-09 07:29:55,407][60143] Updated weights for policy 0, policy_version 80732 (0.0008) +[2023-10-09 07:29:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 166264832. Throughput: 0: 1713.1, 1: 1710.6. Samples: 41572520. Policy #0 lag: (min: 0.0, avg: 27.7, max: 32.0) +[2023-10-09 07:29:56,053][59242] Avg episode reward: [(0, '36.440'), (1, '32.370')] +[2023-10-09 07:29:57,226][60144] Updated weights for policy 1, policy_version 81642 (0.0011) +[2023-10-09 07:29:57,595][60144] Updated weights for policy 1, policy_version 81652 (0.0009) +[2023-10-09 07:29:57,968][60144] Updated weights for policy 1, policy_version 81662 (0.0011) +[2023-10-09 07:29:59,302][60143] Updated weights for policy 0, policy_version 80742 (0.0008) +[2023-10-09 07:29:59,682][60143] Updated weights for policy 0, policy_version 80752 (0.0008) +[2023-10-09 07:30:00,060][60143] Updated weights for policy 0, policy_version 80762 (0.0007) +[2023-10-09 07:30:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166330368. Throughput: 0: 1688.3, 1: 1739.6. Samples: 41592862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:01,053][59242] Avg episode reward: [(0, '35.610'), (1, '34.320')] +[2023-10-09 07:30:01,870][60144] Updated weights for policy 1, policy_version 81672 (0.0010) +[2023-10-09 07:30:02,237][60144] Updated weights for policy 1, policy_version 81682 (0.0007) +[2023-10-09 07:30:02,611][60144] Updated weights for policy 1, policy_version 81692 (0.0007) +[2023-10-09 07:30:03,825][60143] Updated weights for policy 0, policy_version 80772 (0.0008) +[2023-10-09 07:30:04,192][60143] Updated weights for policy 0, policy_version 80782 (0.0009) +[2023-10-09 07:30:04,556][60143] Updated weights for policy 0, policy_version 80792 (0.0009) +[2023-10-09 07:30:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166395904. Throughput: 0: 1722.8, 1: 1705.5. Samples: 41603586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:06,052][59242] Avg episode reward: [(0, '36.790'), (1, '32.270')] +[2023-10-09 07:30:06,583][60144] Updated weights for policy 1, policy_version 81702 (0.0008) +[2023-10-09 07:30:06,953][60144] Updated weights for policy 1, policy_version 81712 (0.0008) +[2023-10-09 07:30:07,313][60144] Updated weights for policy 1, policy_version 81722 (0.0008) +[2023-10-09 07:30:08,412][60143] Updated weights for policy 0, policy_version 80802 (0.0008) +[2023-10-09 07:30:08,779][60143] Updated weights for policy 0, policy_version 80812 (0.0008) +[2023-10-09 07:30:09,148][60143] Updated weights for policy 0, policy_version 80822 (0.0009) +[2023-10-09 07:30:09,521][60143] Updated weights for policy 0, policy_version 80832 (0.0007) +[2023-10-09 07:30:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 166461440. Throughput: 0: 1693.6, 1: 1735.3. Samples: 41623594. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:11,053][59242] Avg episode reward: [(0, '36.420'), (1, '33.350')] +[2023-10-09 07:30:11,251][60144] Updated weights for policy 1, policy_version 81732 (0.0009) +[2023-10-09 07:30:11,645][60144] Updated weights for policy 1, policy_version 81742 (0.0010) +[2023-10-09 07:30:12,013][60144] Updated weights for policy 1, policy_version 81752 (0.0008) +[2023-10-09 07:30:13,724][60143] Updated weights for policy 0, policy_version 80842 (0.0009) +[2023-10-09 07:30:14,091][60143] Updated weights for policy 0, policy_version 80852 (0.0010) +[2023-10-09 07:30:14,460][60143] Updated weights for policy 0, policy_version 80862 (0.0011) +[2023-10-09 07:30:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166526976. Throughput: 0: 1702.2, 1: 1733.2. Samples: 41644454. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:16,053][59242] Avg episode reward: [(0, '34.510'), (1, '31.880')] +[2023-10-09 07:30:16,058][60144] Updated weights for policy 1, policy_version 81762 (0.0007) +[2023-10-09 07:30:16,426][60144] Updated weights for policy 1, policy_version 81772 (0.0007) +[2023-10-09 07:30:16,785][60144] Updated weights for policy 1, policy_version 81782 (0.0008) +[2023-10-09 07:30:17,148][60144] Updated weights for policy 1, policy_version 81792 (0.0008) +[2023-10-09 07:30:18,489][60143] Updated weights for policy 0, policy_version 80872 (0.0010) +[2023-10-09 07:30:18,866][60143] Updated weights for policy 0, policy_version 80882 (0.0010) +[2023-10-09 07:30:19,221][60143] Updated weights for policy 0, policy_version 80892 (0.0007) +[2023-10-09 07:30:21,017][60144] Updated weights for policy 1, policy_version 81802 (0.0009) +[2023-10-09 07:30:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166592512. Throughput: 0: 1707.3, 1: 1721.5. Samples: 41654674. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:21,053][59242] Avg episode reward: [(0, '34.620'), (1, '32.330')] +[2023-10-09 07:30:21,374][60144] Updated weights for policy 1, policy_version 81812 (0.0009) +[2023-10-09 07:30:21,745][60144] Updated weights for policy 1, policy_version 81822 (0.0007) +[2023-10-09 07:30:23,285][60143] Updated weights for policy 0, policy_version 80902 (0.0007) +[2023-10-09 07:30:23,679][60143] Updated weights for policy 0, policy_version 80912 (0.0008) +[2023-10-09 07:30:24,040][60143] Updated weights for policy 0, policy_version 80922 (0.0007) +[2023-10-09 07:30:25,546][60144] Updated weights for policy 1, policy_version 81832 (0.0008) +[2023-10-09 07:30:25,904][60144] Updated weights for policy 1, policy_version 81842 (0.0009) +[2023-10-09 07:30:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 166658048. Throughput: 0: 1693.3, 1: 1742.1. Samples: 41675366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:26,053][59242] Avg episode reward: [(0, '33.320'), (1, '32.670')] +[2023-10-09 07:30:26,283][60144] Updated weights for policy 1, policy_version 81852 (0.0010) +[2023-10-09 07:30:28,113][60143] Updated weights for policy 0, policy_version 80932 (0.0009) +[2023-10-09 07:30:28,488][60143] Updated weights for policy 0, policy_version 80942 (0.0009) +[2023-10-09 07:30:28,850][60143] Updated weights for policy 0, policy_version 80952 (0.0009) +[2023-10-09 07:30:30,283][60144] Updated weights for policy 1, policy_version 81862 (0.0008) +[2023-10-09 07:30:30,648][60144] Updated weights for policy 1, policy_version 81872 (0.0009) +[2023-10-09 07:30:31,011][60144] Updated weights for policy 1, policy_version 81882 (0.0008) +[2023-10-09 07:30:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 166723584. Throughput: 0: 1712.3, 1: 1726.1. Samples: 41695966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:31,053][59242] Avg episode reward: [(0, '34.300'), (1, '30.910')] +[2023-10-09 07:30:32,795][60143] Updated weights for policy 0, policy_version 80962 (0.0007) +[2023-10-09 07:30:33,160][60143] Updated weights for policy 0, policy_version 80972 (0.0008) +[2023-10-09 07:30:33,524][60143] Updated weights for policy 0, policy_version 80982 (0.0008) +[2023-10-09 07:30:33,889][60143] Updated weights for policy 0, policy_version 80992 (0.0008) +[2023-10-09 07:30:34,971][60144] Updated weights for policy 1, policy_version 81892 (0.0010) +[2023-10-09 07:30:35,331][60144] Updated weights for policy 1, policy_version 81902 (0.0009) +[2023-10-09 07:30:35,694][60144] Updated weights for policy 1, policy_version 81912 (0.0010) +[2023-10-09 07:30:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 166821888. Throughput: 0: 1702.2, 1: 1739.2. Samples: 41706304. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:36,053][59242] Avg episode reward: [(0, '35.560'), (1, '31.420')] +[2023-10-09 07:30:37,810][60143] Updated weights for policy 0, policy_version 81002 (0.0007) +[2023-10-09 07:30:38,170][60143] Updated weights for policy 0, policy_version 81012 (0.0010) +[2023-10-09 07:30:38,547][60143] Updated weights for policy 0, policy_version 81022 (0.0010) +[2023-10-09 07:30:39,660][60144] Updated weights for policy 1, policy_version 81922 (0.0011) +[2023-10-09 07:30:40,034][60144] Updated weights for policy 1, policy_version 81932 (0.0009) +[2023-10-09 07:30:40,405][60144] Updated weights for policy 1, policy_version 81942 (0.0009) +[2023-10-09 07:30:40,762][60144] Updated weights for policy 1, policy_version 81952 (0.0009) +[2023-10-09 07:30:41,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 166887424. Throughput: 0: 1694.4, 1: 1741.9. Samples: 41727154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:41,053][59242] Avg episode reward: [(0, '33.810'), (1, '31.400')] +[2023-10-09 07:30:42,572][60143] Updated weights for policy 0, policy_version 81032 (0.0010) +[2023-10-09 07:30:42,943][60143] Updated weights for policy 0, policy_version 81042 (0.0008) +[2023-10-09 07:30:43,310][60143] Updated weights for policy 0, policy_version 81052 (0.0012) +[2023-10-09 07:30:44,603][60144] Updated weights for policy 1, policy_version 81962 (0.0007) +[2023-10-09 07:30:44,973][60144] Updated weights for policy 1, policy_version 81972 (0.0008) +[2023-10-09 07:30:45,341][60144] Updated weights for policy 1, policy_version 81982 (0.0009) +[2023-10-09 07:30:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 166952960. Throughput: 0: 1719.6, 1: 1707.2. Samples: 41747064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:46,053][59242] Avg episode reward: [(0, '33.400'), (1, '32.470')] +[2023-10-09 07:30:47,054][60143] Updated weights for policy 0, policy_version 81062 (0.0009) +[2023-10-09 07:30:47,419][60143] Updated weights for policy 0, policy_version 81072 (0.0007) +[2023-10-09 07:30:47,783][60143] Updated weights for policy 0, policy_version 81082 (0.0008) +[2023-10-09 07:30:49,128][60144] Updated weights for policy 1, policy_version 81992 (0.0009) +[2023-10-09 07:30:49,497][60144] Updated weights for policy 1, policy_version 82002 (0.0008) +[2023-10-09 07:30:49,871][60144] Updated weights for policy 1, policy_version 82012 (0.0008) +[2023-10-09 07:30:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167018496. Throughput: 0: 1684.6, 1: 1746.3. Samples: 41757976. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:30:51,053][59242] Avg episode reward: [(0, '33.640'), (1, '32.470')] +[2023-10-09 07:30:51,808][60143] Updated weights for policy 0, policy_version 81092 (0.0008) +[2023-10-09 07:30:52,185][60143] Updated weights for policy 0, policy_version 81102 (0.0011) +[2023-10-09 07:30:52,562][60143] Updated weights for policy 0, policy_version 81112 (0.0010) +[2023-10-09 07:30:53,755][60144] Updated weights for policy 1, policy_version 82022 (0.0008) +[2023-10-09 07:30:54,122][60144] Updated weights for policy 1, policy_version 82032 (0.0009) +[2023-10-09 07:30:54,494][60144] Updated weights for policy 1, policy_version 82042 (0.0009) +[2023-10-09 07:30:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167084032. Throughput: 0: 1711.0, 1: 1723.5. Samples: 41778146. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:30:56,053][59242] Avg episode reward: [(0, '35.270'), (1, '31.730')] +[2023-10-09 07:30:56,469][60143] Updated weights for policy 0, policy_version 81122 (0.0011) +[2023-10-09 07:30:56,850][60143] Updated weights for policy 0, policy_version 81132 (0.0009) +[2023-10-09 07:30:57,223][60143] Updated weights for policy 0, policy_version 81142 (0.0010) +[2023-10-09 07:30:57,592][60143] Updated weights for policy 0, policy_version 81152 (0.0009) +[2023-10-09 07:30:58,366][60144] Updated weights for policy 1, policy_version 82052 (0.0009) +[2023-10-09 07:30:58,753][60144] Updated weights for policy 1, policy_version 82062 (0.0011) +[2023-10-09 07:30:59,111][60144] Updated weights for policy 1, policy_version 82072 (0.0010) +[2023-10-09 07:31:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167149568. Throughput: 0: 1716.4, 1: 1725.9. Samples: 41799358. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:01,053][59242] Avg episode reward: [(0, '34.540'), (1, '30.800')] +[2023-10-09 07:31:01,625][60143] Updated weights for policy 0, policy_version 81162 (0.0008) +[2023-10-09 07:31:01,995][60143] Updated weights for policy 0, policy_version 81172 (0.0009) +[2023-10-09 07:31:02,364][60143] Updated weights for policy 0, policy_version 81182 (0.0009) +[2023-10-09 07:31:03,079][60144] Updated weights for policy 1, policy_version 82082 (0.0007) +[2023-10-09 07:31:03,443][60144] Updated weights for policy 1, policy_version 82092 (0.0009) +[2023-10-09 07:31:03,805][60144] Updated weights for policy 1, policy_version 82102 (0.0009) +[2023-10-09 07:31:04,174][60144] Updated weights for policy 1, policy_version 82112 (0.0010) +[2023-10-09 07:31:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167215104. Throughput: 0: 1694.8, 1: 1746.2. Samples: 41809522. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:06,053][59242] Avg episode reward: [(0, '32.980'), (1, '30.680')] +[2023-10-09 07:31:06,349][60143] Updated weights for policy 0, policy_version 81192 (0.0008) +[2023-10-09 07:31:06,726][60143] Updated weights for policy 0, policy_version 81202 (0.0007) +[2023-10-09 07:31:07,099][60143] Updated weights for policy 0, policy_version 81212 (0.0009) +[2023-10-09 07:31:08,081][60144] Updated weights for policy 1, policy_version 82122 (0.0008) +[2023-10-09 07:31:08,455][60144] Updated weights for policy 1, policy_version 82132 (0.0009) +[2023-10-09 07:31:08,820][60144] Updated weights for policy 1, policy_version 82142 (0.0008) +[2023-10-09 07:31:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167280640. Throughput: 0: 1720.0, 1: 1720.3. Samples: 41830180. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:11,053][59242] Avg episode reward: [(0, '32.620'), (1, '30.720')] +[2023-10-09 07:31:11,209][60143] Updated weights for policy 0, policy_version 81222 (0.0008) +[2023-10-09 07:31:11,588][60143] Updated weights for policy 0, policy_version 81232 (0.0008) +[2023-10-09 07:31:11,955][60143] Updated weights for policy 0, policy_version 81242 (0.0011) +[2023-10-09 07:31:12,659][60144] Updated weights for policy 1, policy_version 82152 (0.0009) +[2023-10-09 07:31:13,033][60144] Updated weights for policy 1, policy_version 82162 (0.0011) +[2023-10-09 07:31:13,403][60144] Updated weights for policy 1, policy_version 82172 (0.0008) +[2023-10-09 07:31:15,975][60143] Updated weights for policy 0, policy_version 81252 (0.0010) +[2023-10-09 07:31:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167346176. Throughput: 0: 1714.1, 1: 1737.6. Samples: 41851292. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:16,053][59242] Avg episode reward: [(0, '32.850'), (1, '30.440')] +[2023-10-09 07:31:16,342][60143] Updated weights for policy 0, policy_version 81262 (0.0009) +[2023-10-09 07:31:16,712][60143] Updated weights for policy 0, policy_version 81272 (0.0008) +[2023-10-09 07:31:17,447][60144] Updated weights for policy 1, policy_version 82182 (0.0008) +[2023-10-09 07:31:17,809][60144] Updated weights for policy 1, policy_version 82192 (0.0007) +[2023-10-09 07:31:18,177][60144] Updated weights for policy 1, policy_version 82202 (0.0007) +[2023-10-09 07:31:20,690][60143] Updated weights for policy 0, policy_version 81282 (0.0009) +[2023-10-09 07:31:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 167411712. Throughput: 0: 1705.6, 1: 1724.7. Samples: 41860670. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:21,053][59242] Avg episode reward: [(0, '33.770'), (1, '30.670')] +[2023-10-09 07:31:21,056][60143] Updated weights for policy 0, policy_version 81292 (0.0009) +[2023-10-09 07:31:21,432][60143] Updated weights for policy 0, policy_version 81302 (0.0010) +[2023-10-09 07:31:21,792][60143] Updated weights for policy 0, policy_version 81312 (0.0010) +[2023-10-09 07:31:22,043][60144] Updated weights for policy 1, policy_version 82212 (0.0009) +[2023-10-09 07:31:22,407][60144] Updated weights for policy 1, policy_version 82222 (0.0007) +[2023-10-09 07:31:22,786][60144] Updated weights for policy 1, policy_version 82232 (0.0007) +[2023-10-09 07:31:25,829][60143] Updated weights for policy 0, policy_version 81322 (0.0009) +[2023-10-09 07:31:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 167477248. Throughput: 0: 1710.6, 1: 1730.5. Samples: 41882002. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:26,053][59242] Avg episode reward: [(0, '33.420'), (1, '30.750')] +[2023-10-09 07:31:26,200][60143] Updated weights for policy 0, policy_version 81332 (0.0010) +[2023-10-09 07:31:26,518][60144] Updated weights for policy 1, policy_version 82242 (0.0008) +[2023-10-09 07:31:26,574][60143] Updated weights for policy 0, policy_version 81342 (0.0007) +[2023-10-09 07:31:26,884][60144] Updated weights for policy 1, policy_version 82252 (0.0007) +[2023-10-09 07:31:27,249][60144] Updated weights for policy 1, policy_version 82262 (0.0007) +[2023-10-09 07:31:27,605][60144] Updated weights for policy 1, policy_version 82272 (0.0008) +[2023-10-09 07:31:30,547][60143] Updated weights for policy 0, policy_version 81352 (0.0007) +[2023-10-09 07:31:30,921][60143] Updated weights for policy 0, policy_version 81362 (0.0007) +[2023-10-09 07:31:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 167542784. 
Throughput: 0: 1707.2, 1: 1767.3. Samples: 41903414. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:31,053][59242] Avg episode reward: [(0, '34.010'), (1, '31.100')] +[2023-10-09 07:31:31,282][60143] Updated weights for policy 0, policy_version 81372 (0.0007) +[2023-10-09 07:31:31,428][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000081376_83329024.pth... +[2023-10-09 07:31:31,453][60144] Updated weights for policy 1, policy_version 82282 (0.0007) +[2023-10-09 07:31:31,467][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000079776_81690624.pth +[2023-10-09 07:31:31,821][60144] Updated weights for policy 1, policy_version 82292 (0.0007) +[2023-10-09 07:31:32,186][60144] Updated weights for policy 1, policy_version 82302 (0.0008) +[2023-10-09 07:31:32,256][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000082304_84279296.pth... +[2023-10-09 07:31:32,295][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000080672_82608128.pth +[2023-10-09 07:31:35,257][60143] Updated weights for policy 0, policy_version 81382 (0.0008) +[2023-10-09 07:31:35,632][60143] Updated weights for policy 0, policy_version 81392 (0.0007) +[2023-10-09 07:31:36,015][60143] Updated weights for policy 0, policy_version 81402 (0.0008) +[2023-10-09 07:31:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 167608320. Throughput: 0: 1714.5, 1: 1731.5. Samples: 41913046. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:36,053][59242] Avg episode reward: [(0, '33.190'), (1, '31.180')] +[2023-10-09 07:31:36,117][60144] Updated weights for policy 1, policy_version 82312 (0.0009) +[2023-10-09 07:31:36,484][60144] Updated weights for policy 1, policy_version 82322 (0.0008) +[2023-10-09 07:31:36,849][60144] Updated weights for policy 1, policy_version 82332 (0.0010) +[2023-10-09 07:31:39,857][60143] Updated weights for policy 0, policy_version 81412 (0.0008) +[2023-10-09 07:31:40,229][60143] Updated weights for policy 0, policy_version 81422 (0.0008) +[2023-10-09 07:31:40,595][60143] Updated weights for policy 0, policy_version 81432 (0.0009) +[2023-10-09 07:31:40,852][60144] Updated weights for policy 1, policy_version 82342 (0.0010) +[2023-10-09 07:31:41,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 167706624. Throughput: 0: 1721.7, 1: 1752.7. Samples: 41934496. Policy #0 lag: (min: 27.0, avg: 27.1, max: 35.0) +[2023-10-09 07:31:41,053][59242] Avg episode reward: [(0, '35.360'), (1, '30.890')] +[2023-10-09 07:31:41,223][60144] Updated weights for policy 1, policy_version 82352 (0.0007) +[2023-10-09 07:31:41,592][60144] Updated weights for policy 1, policy_version 82362 (0.0009) +[2023-10-09 07:31:44,492][60143] Updated weights for policy 0, policy_version 81442 (0.0010) +[2023-10-09 07:31:44,853][60143] Updated weights for policy 0, policy_version 81452 (0.0008) +[2023-10-09 07:31:45,223][60143] Updated weights for policy 0, policy_version 81462 (0.0012) +[2023-10-09 07:31:45,579][60143] Updated weights for policy 0, policy_version 81472 (0.0008) +[2023-10-09 07:31:45,633][60144] Updated weights for policy 1, policy_version 82372 (0.0008) +[2023-10-09 07:31:46,039][60144] Updated weights for policy 1, policy_version 82382 (0.0007) +[2023-10-09 07:31:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 167772160. 
Throughput: 0: 1695.2, 1: 1756.3. Samples: 41954672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:31:46,053][59242] Avg episode reward: [(0, '34.720'), (1, '29.270')] +[2023-10-09 07:31:46,409][60144] Updated weights for policy 1, policy_version 82392 (0.0007) +[2023-10-09 07:31:49,567][60143] Updated weights for policy 0, policy_version 81482 (0.0009) +[2023-10-09 07:31:49,940][60143] Updated weights for policy 0, policy_version 81492 (0.0008) +[2023-10-09 07:31:50,267][60144] Updated weights for policy 1, policy_version 82402 (0.0007) +[2023-10-09 07:31:50,311][60143] Updated weights for policy 0, policy_version 81502 (0.0008) +[2023-10-09 07:31:50,640][60144] Updated weights for policy 1, policy_version 82412 (0.0007) +[2023-10-09 07:31:51,007][60144] Updated weights for policy 1, policy_version 82422 (0.0009) +[2023-10-09 07:31:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 167837696. Throughput: 0: 1720.7, 1: 1736.3. Samples: 41965086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:31:51,053][59242] Avg episode reward: [(0, '35.240'), (1, '29.050')] +[2023-10-09 07:31:51,375][60144] Updated weights for policy 1, policy_version 82432 (0.0009) +[2023-10-09 07:31:54,370][60143] Updated weights for policy 0, policy_version 81512 (0.0009) +[2023-10-09 07:31:54,736][60143] Updated weights for policy 0, policy_version 81522 (0.0009) +[2023-10-09 07:31:55,094][60143] Updated weights for policy 0, policy_version 81532 (0.0008) +[2023-10-09 07:31:55,210][60144] Updated weights for policy 1, policy_version 82442 (0.0007) +[2023-10-09 07:31:55,575][60144] Updated weights for policy 1, policy_version 82452 (0.0008) +[2023-10-09 07:31:55,943][60144] Updated weights for policy 1, policy_version 82462 (0.0008) +[2023-10-09 07:31:56,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 167936000. Throughput: 0: 1705.3, 1: 1745.3. Samples: 41985458. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:31:56,053][59242] Avg episode reward: [(0, '34.840'), (1, '30.930')] +[2023-10-09 07:31:59,256][60143] Updated weights for policy 0, policy_version 81542 (0.0007) +[2023-10-09 07:31:59,637][60143] Updated weights for policy 0, policy_version 81552 (0.0008) +[2023-10-09 07:31:59,887][60144] Updated weights for policy 1, policy_version 82472 (0.0008) +[2023-10-09 07:32:00,005][60143] Updated weights for policy 0, policy_version 81562 (0.0008) +[2023-10-09 07:32:00,252][60144] Updated weights for policy 1, policy_version 82482 (0.0008) +[2023-10-09 07:32:00,626][60144] Updated weights for policy 1, policy_version 82492 (0.0009) +[2023-10-09 07:32:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 168001536. Throughput: 0: 1689.9, 1: 1718.8. Samples: 42004680. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:01,053][59242] Avg episode reward: [(0, '36.380'), (1, '30.860')] +[2023-10-09 07:32:04,029][60143] Updated weights for policy 0, policy_version 81572 (0.0009) +[2023-10-09 07:32:04,404][60143] Updated weights for policy 0, policy_version 81582 (0.0009) +[2023-10-09 07:32:04,657][60144] Updated weights for policy 1, policy_version 82502 (0.0009) +[2023-10-09 07:32:04,781][60143] Updated weights for policy 0, policy_version 81592 (0.0009) +[2023-10-09 07:32:05,028][60144] Updated weights for policy 1, policy_version 82512 (0.0007) +[2023-10-09 07:32:05,394][60144] Updated weights for policy 1, policy_version 82522 (0.0007) +[2023-10-09 07:32:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 168067072. Throughput: 0: 1718.7, 1: 1742.4. Samples: 42016420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:06,053][59242] Avg episode reward: [(0, '36.040'), (1, '29.390')] +[2023-10-09 07:32:08,658][60143] Updated weights for policy 0, policy_version 81602 (0.0007) +[2023-10-09 07:32:09,030][60143] Updated weights for policy 0, policy_version 81612 (0.0010) +[2023-10-09 07:32:09,395][60143] Updated weights for policy 0, policy_version 81622 (0.0010) +[2023-10-09 07:32:09,481][60144] Updated weights for policy 1, policy_version 82532 (0.0008) +[2023-10-09 07:32:09,767][60143] Updated weights for policy 0, policy_version 81632 (0.0009) +[2023-10-09 07:32:09,850][60144] Updated weights for policy 1, policy_version 82542 (0.0009) +[2023-10-09 07:32:10,213][60144] Updated weights for policy 1, policy_version 82552 (0.0007) +[2023-10-09 07:32:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 168132608. Throughput: 0: 1700.6, 1: 1727.6. Samples: 42036274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:11,053][59242] Avg episode reward: [(0, '35.220'), (1, '29.800')] +[2023-10-09 07:32:13,829][60143] Updated weights for policy 0, policy_version 81642 (0.0008) +[2023-10-09 07:32:14,139][60144] Updated weights for policy 1, policy_version 82562 (0.0010) +[2023-10-09 07:32:14,190][60143] Updated weights for policy 0, policy_version 81652 (0.0008) +[2023-10-09 07:32:14,506][60144] Updated weights for policy 1, policy_version 82572 (0.0008) +[2023-10-09 07:32:14,556][60143] Updated weights for policy 0, policy_version 81662 (0.0008) +[2023-10-09 07:32:14,873][60144] Updated weights for policy 1, policy_version 82582 (0.0010) +[2023-10-09 07:32:15,245][60144] Updated weights for policy 1, policy_version 82592 (0.0009) +[2023-10-09 07:32:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 168198144. Throughput: 0: 1699.7, 1: 1694.8. Samples: 42056168. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:16,053][59242] Avg episode reward: [(0, '35.370'), (1, '30.130')] +[2023-10-09 07:32:18,499][60143] Updated weights for policy 0, policy_version 81672 (0.0010) +[2023-10-09 07:32:18,875][60143] Updated weights for policy 0, policy_version 81682 (0.0010) +[2023-10-09 07:32:19,237][60143] Updated weights for policy 0, policy_version 81692 (0.0007) +[2023-10-09 07:32:19,302][60144] Updated weights for policy 1, policy_version 82602 (0.0008) +[2023-10-09 07:32:19,672][60144] Updated weights for policy 1, policy_version 82612 (0.0008) +[2023-10-09 07:32:20,037][60144] Updated weights for policy 1, policy_version 82622 (0.0008) +[2023-10-09 07:32:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 168263680. Throughput: 0: 1714.1, 1: 1723.6. Samples: 42067740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:21,053][59242] Avg episode reward: [(0, '34.380'), (1, '31.000')] +[2023-10-09 07:32:23,286][60143] Updated weights for policy 0, policy_version 81702 (0.0010) +[2023-10-09 07:32:23,647][60143] Updated weights for policy 0, policy_version 81712 (0.0007) +[2023-10-09 07:32:24,017][60143] Updated weights for policy 0, policy_version 81722 (0.0008) +[2023-10-09 07:32:24,131][60144] Updated weights for policy 1, policy_version 82632 (0.0008) +[2023-10-09 07:32:24,502][60144] Updated weights for policy 1, policy_version 82642 (0.0007) +[2023-10-09 07:32:24,863][60144] Updated weights for policy 1, policy_version 82652 (0.0007) +[2023-10-09 07:32:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 168329216. Throughput: 0: 1681.9, 1: 1709.0. Samples: 42087088. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:26,053][59242] Avg episode reward: [(0, '35.760'), (1, '33.170')] +[2023-10-09 07:32:28,094][60143] Updated weights for policy 0, policy_version 81732 (0.0009) +[2023-10-09 07:32:28,458][60143] Updated weights for policy 0, policy_version 81742 (0.0011) +[2023-10-09 07:32:28,742][60144] Updated weights for policy 1, policy_version 82662 (0.0007) +[2023-10-09 07:32:28,824][60143] Updated weights for policy 0, policy_version 81752 (0.0009) +[2023-10-09 07:32:29,109][60144] Updated weights for policy 1, policy_version 82672 (0.0008) +[2023-10-09 07:32:29,476][60144] Updated weights for policy 1, policy_version 82682 (0.0009) +[2023-10-09 07:32:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 168394752. Throughput: 0: 1708.0, 1: 1700.0. Samples: 42108036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:31,053][59242] Avg episode reward: [(0, '35.720'), (1, '32.680')] +[2023-10-09 07:32:32,652][60143] Updated weights for policy 0, policy_version 81762 (0.0008) +[2023-10-09 07:32:33,025][60143] Updated weights for policy 0, policy_version 81772 (0.0009) +[2023-10-09 07:32:33,405][60143] Updated weights for policy 0, policy_version 81782 (0.0008) +[2023-10-09 07:32:33,542][60144] Updated weights for policy 1, policy_version 82692 (0.0009) +[2023-10-09 07:32:33,770][60143] Updated weights for policy 0, policy_version 81792 (0.0009) +[2023-10-09 07:32:33,935][60144] Updated weights for policy 1, policy_version 82702 (0.0009) +[2023-10-09 07:32:34,310][60144] Updated weights for policy 1, policy_version 82712 (0.0010) +[2023-10-09 07:32:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 168460288. Throughput: 0: 1693.9, 1: 1723.3. Samples: 42118860. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:36,053][59242] Avg episode reward: [(0, '33.980'), (1, '33.200')] +[2023-10-09 07:32:37,624][60143] Updated weights for policy 0, policy_version 81802 (0.0008) +[2023-10-09 07:32:37,988][60143] Updated weights for policy 0, policy_version 81812 (0.0009) +[2023-10-09 07:32:38,140][60144] Updated weights for policy 1, policy_version 82722 (0.0009) +[2023-10-09 07:32:38,359][60143] Updated weights for policy 0, policy_version 81822 (0.0008) +[2023-10-09 07:32:38,511][60144] Updated weights for policy 1, policy_version 82732 (0.0007) +[2023-10-09 07:32:38,881][60144] Updated weights for policy 1, policy_version 82742 (0.0008) +[2023-10-09 07:32:39,249][60144] Updated weights for policy 1, policy_version 82752 (0.0010) +[2023-10-09 07:32:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 168525824. Throughput: 0: 1701.6, 1: 1700.7. Samples: 42138560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:41,052][59242] Avg episode reward: [(0, '32.450'), (1, '33.220')] +[2023-10-09 07:32:42,412][60143] Updated weights for policy 0, policy_version 81832 (0.0008) +[2023-10-09 07:32:42,782][60143] Updated weights for policy 0, policy_version 81842 (0.0007) +[2023-10-09 07:32:42,932][60144] Updated weights for policy 1, policy_version 82762 (0.0007) +[2023-10-09 07:32:43,153][60143] Updated weights for policy 0, policy_version 81852 (0.0009) +[2023-10-09 07:32:43,300][60144] Updated weights for policy 1, policy_version 82772 (0.0009) +[2023-10-09 07:32:43,668][60144] Updated weights for policy 1, policy_version 82782 (0.0009) +[2023-10-09 07:32:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 168591360. Throughput: 0: 1719.9, 1: 1734.1. Samples: 42160108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:46,053][59242] Avg episode reward: [(0, '31.830'), (1, '34.090')] +[2023-10-09 07:32:47,126][60143] Updated weights for policy 0, policy_version 81862 (0.0009) +[2023-10-09 07:32:47,506][60143] Updated weights for policy 0, policy_version 81872 (0.0007) +[2023-10-09 07:32:47,560][60144] Updated weights for policy 1, policy_version 82792 (0.0008) +[2023-10-09 07:32:47,865][60143] Updated weights for policy 0, policy_version 81882 (0.0009) +[2023-10-09 07:32:47,930][60144] Updated weights for policy 1, policy_version 82802 (0.0008) +[2023-10-09 07:32:48,297][60144] Updated weights for policy 1, policy_version 82812 (0.0010) +[2023-10-09 07:32:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 168656896. Throughput: 0: 1689.8, 1: 1711.1. Samples: 42169460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:51,053][59242] Avg episode reward: [(0, '32.240'), (1, '33.420')] +[2023-10-09 07:32:51,818][60143] Updated weights for policy 0, policy_version 81892 (0.0008) +[2023-10-09 07:32:52,182][60143] Updated weights for policy 0, policy_version 81902 (0.0007) +[2023-10-09 07:32:52,264][60144] Updated weights for policy 1, policy_version 82822 (0.0007) +[2023-10-09 07:32:52,547][60143] Updated weights for policy 0, policy_version 81912 (0.0007) +[2023-10-09 07:32:52,629][60144] Updated weights for policy 1, policy_version 82832 (0.0007) +[2023-10-09 07:32:52,990][60144] Updated weights for policy 1, policy_version 82842 (0.0007) +[2023-10-09 07:32:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 168722432. Throughput: 0: 1710.3, 1: 1720.4. Samples: 42190654. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:32:56,053][59242] Avg episode reward: [(0, '34.000'), (1, '32.650')] +[2023-10-09 07:32:56,643][60143] Updated weights for policy 0, policy_version 81922 (0.0009) +[2023-10-09 07:32:56,941][60144] Updated weights for policy 1, policy_version 82852 (0.0008) +[2023-10-09 07:32:57,019][60143] Updated weights for policy 0, policy_version 81932 (0.0007) +[2023-10-09 07:32:57,309][60144] Updated weights for policy 1, policy_version 82862 (0.0008) +[2023-10-09 07:32:57,380][60143] Updated weights for policy 0, policy_version 81942 (0.0008) +[2023-10-09 07:32:57,668][60144] Updated weights for policy 1, policy_version 82872 (0.0009) +[2023-10-09 07:32:57,754][60143] Updated weights for policy 0, policy_version 81952 (0.0007) +[2023-10-09 07:33:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 168787968. Throughput: 0: 1712.4, 1: 1750.3. Samples: 42211992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:01,053][59242] Avg episode reward: [(0, '32.860'), (1, '32.270')] +[2023-10-09 07:33:01,598][60144] Updated weights for policy 1, policy_version 82882 (0.0009) +[2023-10-09 07:33:01,805][60143] Updated weights for policy 0, policy_version 81962 (0.0008) +[2023-10-09 07:33:01,960][60144] Updated weights for policy 1, policy_version 82892 (0.0009) +[2023-10-09 07:33:02,177][60143] Updated weights for policy 0, policy_version 81972 (0.0010) +[2023-10-09 07:33:02,327][60144] Updated weights for policy 1, policy_version 82902 (0.0008) +[2023-10-09 07:33:02,549][60143] Updated weights for policy 0, policy_version 81982 (0.0008) +[2023-10-09 07:33:02,683][60144] Updated weights for policy 1, policy_version 82912 (0.0008) +[2023-10-09 07:33:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 168853504. Throughput: 0: 1689.8, 1: 1721.4. Samples: 42221244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:06,053][59242] Avg episode reward: [(0, '32.080'), (1, '32.210')] +[2023-10-09 07:33:06,606][60143] Updated weights for policy 0, policy_version 81992 (0.0008) +[2023-10-09 07:33:06,607][60144] Updated weights for policy 1, policy_version 82922 (0.0010) +[2023-10-09 07:33:06,967][60143] Updated weights for policy 0, policy_version 82002 (0.0009) +[2023-10-09 07:33:06,971][60144] Updated weights for policy 1, policy_version 82932 (0.0008) +[2023-10-09 07:33:07,328][60144] Updated weights for policy 1, policy_version 82942 (0.0007) +[2023-10-09 07:33:07,330][60143] Updated weights for policy 0, policy_version 82012 (0.0008) +[2023-10-09 07:33:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 168919040. Throughput: 0: 1712.0, 1: 1740.5. Samples: 42242452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:11,053][59242] Avg episode reward: [(0, '31.200'), (1, '33.710')] +[2023-10-09 07:33:11,223][60144] Updated weights for policy 1, policy_version 82952 (0.0009) +[2023-10-09 07:33:11,439][60143] Updated weights for policy 0, policy_version 82022 (0.0009) +[2023-10-09 07:33:11,589][60144] Updated weights for policy 1, policy_version 82962 (0.0009) +[2023-10-09 07:33:11,808][60143] Updated weights for policy 0, policy_version 82032 (0.0009) +[2023-10-09 07:33:11,943][60144] Updated weights for policy 1, policy_version 82972 (0.0008) +[2023-10-09 07:33:12,189][60143] Updated weights for policy 0, policy_version 82042 (0.0009) +[2023-10-09 07:33:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 168984576. Throughput: 0: 1709.5, 1: 1741.9. Samples: 42263350. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:16,053][59242] Avg episode reward: [(0, '31.120'), (1, '33.820')] +[2023-10-09 07:33:16,080][60144] Updated weights for policy 1, policy_version 82982 (0.0008) +[2023-10-09 07:33:16,139][60143] Updated weights for policy 0, policy_version 82052 (0.0010) +[2023-10-09 07:33:16,454][60144] Updated weights for policy 1, policy_version 82992 (0.0007) +[2023-10-09 07:33:16,513][60143] Updated weights for policy 0, policy_version 82062 (0.0009) +[2023-10-09 07:33:16,816][60144] Updated weights for policy 1, policy_version 83002 (0.0007) +[2023-10-09 07:33:16,881][60143] Updated weights for policy 0, policy_version 82072 (0.0009) +[2023-10-09 07:33:20,669][60144] Updated weights for policy 1, policy_version 83012 (0.0007) +[2023-10-09 07:33:20,871][60143] Updated weights for policy 0, policy_version 82082 (0.0007) +[2023-10-09 07:33:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 169050112. Throughput: 0: 1695.8, 1: 1721.3. Samples: 42272630. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:21,053][59242] Avg episode reward: [(0, '30.170'), (1, '35.040')] +[2023-10-09 07:33:21,074][60144] Updated weights for policy 1, policy_version 83022 (0.0008) +[2023-10-09 07:33:21,245][60143] Updated weights for policy 0, policy_version 82092 (0.0007) +[2023-10-09 07:33:21,449][60144] Updated weights for policy 1, policy_version 83032 (0.0008) +[2023-10-09 07:33:21,624][60143] Updated weights for policy 0, policy_version 82102 (0.0008) +[2023-10-09 07:33:21,991][60143] Updated weights for policy 0, policy_version 82112 (0.0009) +[2023-10-09 07:33:25,350][60144] Updated weights for policy 1, policy_version 83042 (0.0008) +[2023-10-09 07:33:25,719][60144] Updated weights for policy 1, policy_version 83052 (0.0009) +[2023-10-09 07:33:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 169115648. 
Throughput: 0: 1705.4, 1: 1743.9. Samples: 42293776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:26,053][59242] Avg episode reward: [(0, '29.910'), (1, '34.570')] +[2023-10-09 07:33:26,065][60143] Updated weights for policy 0, policy_version 82122 (0.0008) +[2023-10-09 07:33:26,088][60144] Updated weights for policy 1, policy_version 83062 (0.0009) +[2023-10-09 07:33:26,429][60143] Updated weights for policy 0, policy_version 82132 (0.0007) +[2023-10-09 07:33:26,450][60144] Updated weights for policy 1, policy_version 83072 (0.0007) +[2023-10-09 07:33:26,813][60143] Updated weights for policy 0, policy_version 82142 (0.0007) +[2023-10-09 07:33:30,419][60144] Updated weights for policy 1, policy_version 83082 (0.0007) +[2023-10-09 07:33:30,753][60143] Updated weights for policy 0, policy_version 82152 (0.0008) +[2023-10-09 07:33:30,783][60144] Updated weights for policy 1, policy_version 83092 (0.0008) +[2023-10-09 07:33:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 169181184. Throughput: 0: 1707.4, 1: 1724.3. Samples: 42314532. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:33:31,053][59242] Avg episode reward: [(0, '32.100'), (1, '33.760')] +[2023-10-09 07:33:31,122][60143] Updated weights for policy 0, policy_version 82162 (0.0008) +[2023-10-09 07:33:31,147][60144] Updated weights for policy 1, policy_version 83102 (0.0007) +[2023-10-09 07:33:31,220][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth... +[2023-10-09 07:33:31,248][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000081472_83427328.pth +[2023-10-09 07:33:31,492][60143] Updated weights for policy 0, policy_version 82172 (0.0011) +[2023-10-09 07:33:31,641][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000082176_84148224.pth... 
+[2023-10-09 07:33:31,670][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000080576_82509824.pth +[2023-10-09 07:33:35,016][60144] Updated weights for policy 1, policy_version 83112 (0.0009) +[2023-10-09 07:33:35,360][60143] Updated weights for policy 0, policy_version 82182 (0.0008) +[2023-10-09 07:33:35,393][60144] Updated weights for policy 1, policy_version 83122 (0.0007) +[2023-10-09 07:33:35,742][60143] Updated weights for policy 0, policy_version 82192 (0.0008) +[2023-10-09 07:33:35,751][60144] Updated weights for policy 1, policy_version 83132 (0.0008) +[2023-10-09 07:33:36,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 169279488. Throughput: 0: 1712.3, 1: 1735.6. Samples: 42324612. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:33:36,053][59242] Avg episode reward: [(0, '32.760'), (1, '33.790')] +[2023-10-09 07:33:36,102][60143] Updated weights for policy 0, policy_version 82202 (0.0009) +[2023-10-09 07:33:39,570][60144] Updated weights for policy 1, policy_version 83142 (0.0007) +[2023-10-09 07:33:39,934][60144] Updated weights for policy 1, policy_version 83152 (0.0008) +[2023-10-09 07:33:40,188][60143] Updated weights for policy 0, policy_version 82212 (0.0009) +[2023-10-09 07:33:40,299][60144] Updated weights for policy 1, policy_version 83162 (0.0007) +[2023-10-09 07:33:40,551][60143] Updated weights for policy 0, policy_version 82222 (0.0008) +[2023-10-09 07:33:40,915][60143] Updated weights for policy 0, policy_version 82232 (0.0007) +[2023-10-09 07:33:41,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 169345024. Throughput: 0: 1709.6, 1: 1733.4. Samples: 42345590. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:33:41,052][59242] Avg episode reward: [(0, '31.880'), (1, '35.060')] +[2023-10-09 07:33:44,184][60144] Updated weights for policy 1, policy_version 83172 (0.0007) +[2023-10-09 07:33:44,551][60144] Updated weights for policy 1, policy_version 83182 (0.0007) +[2023-10-09 07:33:44,676][60143] Updated weights for policy 0, policy_version 82242 (0.0008) +[2023-10-09 07:33:44,919][60144] Updated weights for policy 1, policy_version 83192 (0.0007) +[2023-10-09 07:33:45,047][60143] Updated weights for policy 0, policy_version 82252 (0.0010) +[2023-10-09 07:33:45,421][60143] Updated weights for policy 0, policy_version 82262 (0.0009) +[2023-10-09 07:33:45,786][60143] Updated weights for policy 0, policy_version 82272 (0.0012) +[2023-10-09 07:33:46,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 169443328. Throughput: 0: 1692.6, 1: 1708.8. Samples: 42365052. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:33:46,053][59242] Avg episode reward: [(0, '33.470'), (1, '34.610')] +[2023-10-09 07:33:48,926][60144] Updated weights for policy 1, policy_version 83202 (0.0007) +[2023-10-09 07:33:49,287][60144] Updated weights for policy 1, policy_version 83212 (0.0007) +[2023-10-09 07:33:49,643][60144] Updated weights for policy 1, policy_version 83222 (0.0007) +[2023-10-09 07:33:49,897][60143] Updated weights for policy 0, policy_version 82282 (0.0007) +[2023-10-09 07:33:50,011][60144] Updated weights for policy 1, policy_version 83232 (0.0007) +[2023-10-09 07:33:50,259][60143] Updated weights for policy 0, policy_version 82292 (0.0010) +[2023-10-09 07:33:50,627][60143] Updated weights for policy 0, policy_version 82302 (0.0008) +[2023-10-09 07:33:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 169508864. Throughput: 0: 1710.7, 1: 1738.4. Samples: 42376450. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:33:51,052][59242] Avg episode reward: [(0, '33.890'), (1, '32.810')] +[2023-10-09 07:33:53,904][60144] Updated weights for policy 1, policy_version 83242 (0.0009) +[2023-10-09 07:33:54,257][60144] Updated weights for policy 1, policy_version 83252 (0.0009) +[2023-10-09 07:33:54,620][60144] Updated weights for policy 1, policy_version 83262 (0.0007) +[2023-10-09 07:33:54,624][60143] Updated weights for policy 0, policy_version 82312 (0.0008) +[2023-10-09 07:33:54,991][60143] Updated weights for policy 0, policy_version 82322 (0.0010) +[2023-10-09 07:33:55,371][60143] Updated weights for policy 0, policy_version 82332 (0.0009) +[2023-10-09 07:33:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 169574400. Throughput: 0: 1717.4, 1: 1706.7. Samples: 42396536. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:33:56,053][59242] Avg episode reward: [(0, '32.970'), (1, '31.810')] +[2023-10-09 07:33:58,523][60144] Updated weights for policy 1, policy_version 83272 (0.0007) +[2023-10-09 07:33:58,887][60144] Updated weights for policy 1, policy_version 83282 (0.0008) +[2023-10-09 07:33:59,260][60144] Updated weights for policy 1, policy_version 83292 (0.0010) +[2023-10-09 07:33:59,426][60143] Updated weights for policy 0, policy_version 82342 (0.0008) +[2023-10-09 07:33:59,795][60143] Updated weights for policy 0, policy_version 82352 (0.0010) +[2023-10-09 07:34:00,160][60143] Updated weights for policy 0, policy_version 82362 (0.0010) +[2023-10-09 07:34:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 169639936. Throughput: 0: 1691.7, 1: 1719.7. Samples: 42416866. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:34:01,053][59242] Avg episode reward: [(0, '32.380'), (1, '30.910')] +[2023-10-09 07:34:03,202][60144] Updated weights for policy 1, policy_version 83302 (0.0008) +[2023-10-09 07:34:03,574][60144] Updated weights for policy 1, policy_version 83312 (0.0007) +[2023-10-09 07:34:03,942][60144] Updated weights for policy 1, policy_version 83322 (0.0008) +[2023-10-09 07:34:04,135][60143] Updated weights for policy 0, policy_version 82372 (0.0011) +[2023-10-09 07:34:04,509][60143] Updated weights for policy 0, policy_version 82382 (0.0008) +[2023-10-09 07:34:04,880][60143] Updated weights for policy 0, policy_version 82392 (0.0007) +[2023-10-09 07:34:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 169705472. Throughput: 0: 1726.2, 1: 1731.5. Samples: 42428226. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:34:06,053][59242] Avg episode reward: [(0, '31.990'), (1, '30.600')] +[2023-10-09 07:34:07,701][60144] Updated weights for policy 1, policy_version 83332 (0.0008) +[2023-10-09 07:34:08,072][60144] Updated weights for policy 1, policy_version 83342 (0.0008) +[2023-10-09 07:34:08,430][60144] Updated weights for policy 1, policy_version 83352 (0.0009) +[2023-10-09 07:34:09,016][60143] Updated weights for policy 0, policy_version 82402 (0.0008) +[2023-10-09 07:34:09,395][60143] Updated weights for policy 0, policy_version 82412 (0.0008) +[2023-10-09 07:34:09,775][60143] Updated weights for policy 0, policy_version 82422 (0.0009) +[2023-10-09 07:34:10,138][60143] Updated weights for policy 0, policy_version 82432 (0.0009) +[2023-10-09 07:34:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 169771008. Throughput: 0: 1704.8, 1: 1725.8. Samples: 42448152. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:34:11,053][59242] Avg episode reward: [(0, '33.150'), (1, '30.140')] +[2023-10-09 07:34:12,317][60144] Updated weights for policy 1, policy_version 83362 (0.0007) +[2023-10-09 07:34:12,742][60144] Updated weights for policy 1, policy_version 83372 (0.0009) +[2023-10-09 07:34:13,119][60144] Updated weights for policy 1, policy_version 83382 (0.0008) +[2023-10-09 07:34:13,488][60144] Updated weights for policy 1, policy_version 83392 (0.0007) +[2023-10-09 07:34:14,029][60143] Updated weights for policy 0, policy_version 82442 (0.0007) +[2023-10-09 07:34:14,396][60143] Updated weights for policy 0, policy_version 82452 (0.0009) +[2023-10-09 07:34:14,773][60143] Updated weights for policy 0, policy_version 82462 (0.0007) +[2023-10-09 07:34:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 169836544. Throughput: 0: 1687.6, 1: 1736.3. Samples: 42468610. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:34:16,053][59242] Avg episode reward: [(0, '34.430'), (1, '30.870')] +[2023-10-09 07:34:17,478][60144] Updated weights for policy 1, policy_version 83402 (0.0008) +[2023-10-09 07:34:17,842][60144] Updated weights for policy 1, policy_version 83412 (0.0008) +[2023-10-09 07:34:18,213][60144] Updated weights for policy 1, policy_version 83422 (0.0010) +[2023-10-09 07:34:18,644][60143] Updated weights for policy 0, policy_version 82472 (0.0008) +[2023-10-09 07:34:19,008][60143] Updated weights for policy 0, policy_version 82482 (0.0010) +[2023-10-09 07:34:19,378][60143] Updated weights for policy 0, policy_version 82492 (0.0008) +[2023-10-09 07:34:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 169902080. Throughput: 0: 1709.5, 1: 1723.3. Samples: 42479088. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-09 07:34:21,053][59242] Avg episode reward: [(0, '35.940'), (1, '31.000')] +[2023-10-09 07:34:22,113][60144] Updated weights for policy 1, policy_version 83432 (0.0008) +[2023-10-09 07:34:22,471][60144] Updated weights for policy 1, policy_version 83442 (0.0008) +[2023-10-09 07:34:22,839][60144] Updated weights for policy 1, policy_version 83452 (0.0008) +[2023-10-09 07:34:23,484][60143] Updated weights for policy 0, policy_version 82502 (0.0008) +[2023-10-09 07:34:23,872][60143] Updated weights for policy 0, policy_version 82512 (0.0007) +[2023-10-09 07:34:24,253][60143] Updated weights for policy 0, policy_version 82522 (0.0007) +[2023-10-09 07:34:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 169967616. Throughput: 0: 1680.4, 1: 1732.0. Samples: 42499150. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:26,053][59242] Avg episode reward: [(0, '35.030'), (1, '32.790')] +[2023-10-09 07:34:26,879][60144] Updated weights for policy 1, policy_version 83462 (0.0008) +[2023-10-09 07:34:27,237][60144] Updated weights for policy 1, policy_version 83472 (0.0007) +[2023-10-09 07:34:27,610][60144] Updated weights for policy 1, policy_version 83482 (0.0008) +[2023-10-09 07:34:28,247][60143] Updated weights for policy 0, policy_version 82532 (0.0009) +[2023-10-09 07:34:28,610][60143] Updated weights for policy 0, policy_version 82542 (0.0007) +[2023-10-09 07:34:28,982][60143] Updated weights for policy 0, policy_version 82552 (0.0008) +[2023-10-09 07:34:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 170033152. Throughput: 0: 1697.8, 1: 1753.0. Samples: 42520338. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:31,053][59242] Avg episode reward: [(0, '32.950'), (1, '33.150')] +[2023-10-09 07:34:31,611][60144] Updated weights for policy 1, policy_version 83492 (0.0007) +[2023-10-09 07:34:31,977][60144] Updated weights for policy 1, policy_version 83502 (0.0009) +[2023-10-09 07:34:32,333][60144] Updated weights for policy 1, policy_version 83512 (0.0009) +[2023-10-09 07:34:32,876][60143] Updated weights for policy 0, policy_version 82562 (0.0009) +[2023-10-09 07:34:33,240][60143] Updated weights for policy 0, policy_version 82572 (0.0007) +[2023-10-09 07:34:33,608][60143] Updated weights for policy 0, policy_version 82582 (0.0008) +[2023-10-09 07:34:33,980][60143] Updated weights for policy 0, policy_version 82592 (0.0008) +[2023-10-09 07:34:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 170098688. Throughput: 0: 1698.9, 1: 1721.5. Samples: 42530368. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:36,052][59242] Avg episode reward: [(0, '32.400'), (1, '33.780')] +[2023-10-09 07:34:36,307][60144] Updated weights for policy 1, policy_version 83522 (0.0010) +[2023-10-09 07:34:36,676][60144] Updated weights for policy 1, policy_version 83532 (0.0011) +[2023-10-09 07:34:37,043][60144] Updated weights for policy 1, policy_version 83542 (0.0007) +[2023-10-09 07:34:37,408][60144] Updated weights for policy 1, policy_version 83552 (0.0007) +[2023-10-09 07:34:37,968][60143] Updated weights for policy 0, policy_version 82602 (0.0009) +[2023-10-09 07:34:38,327][60143] Updated weights for policy 0, policy_version 82612 (0.0010) +[2023-10-09 07:34:38,688][60143] Updated weights for policy 0, policy_version 82622 (0.0011) +[2023-10-09 07:34:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 170164224. Throughput: 0: 1683.6, 1: 1749.1. Samples: 42551008. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:41,053][59242] Avg episode reward: [(0, '32.840'), (1, '31.530')] +[2023-10-09 07:34:41,239][60144] Updated weights for policy 1, policy_version 83562 (0.0007) +[2023-10-09 07:34:41,600][60144] Updated weights for policy 1, policy_version 83572 (0.0007) +[2023-10-09 07:34:41,961][60144] Updated weights for policy 1, policy_version 83582 (0.0010) +[2023-10-09 07:34:42,566][60143] Updated weights for policy 0, policy_version 82632 (0.0010) +[2023-10-09 07:34:42,929][60143] Updated weights for policy 0, policy_version 82642 (0.0010) +[2023-10-09 07:34:43,296][60143] Updated weights for policy 0, policy_version 82652 (0.0008) +[2023-10-09 07:34:45,985][60144] Updated weights for policy 1, policy_version 83592 (0.0008) +[2023-10-09 07:34:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 170229760. Throughput: 0: 1712.1, 1: 1741.7. Samples: 42572288. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:46,053][59242] Avg episode reward: [(0, '33.550'), (1, '33.590')] +[2023-10-09 07:34:46,364][60144] Updated weights for policy 1, policy_version 83602 (0.0007) +[2023-10-09 07:34:46,733][60144] Updated weights for policy 1, policy_version 83612 (0.0007) +[2023-10-09 07:34:47,268][60143] Updated weights for policy 0, policy_version 82662 (0.0010) +[2023-10-09 07:34:47,640][60143] Updated weights for policy 0, policy_version 82672 (0.0011) +[2023-10-09 07:34:48,001][60143] Updated weights for policy 0, policy_version 82682 (0.0010) +[2023-10-09 07:34:50,706][60144] Updated weights for policy 1, policy_version 83622 (0.0007) +[2023-10-09 07:34:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 170295296. Throughput: 0: 1680.8, 1: 1729.9. Samples: 42581710. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:51,053][59242] Avg episode reward: [(0, '33.790'), (1, '32.260')] +[2023-10-09 07:34:51,076][60144] Updated weights for policy 1, policy_version 83632 (0.0008) +[2023-10-09 07:34:51,439][60144] Updated weights for policy 1, policy_version 83642 (0.0008) +[2023-10-09 07:34:51,945][60143] Updated weights for policy 0, policy_version 82692 (0.0010) +[2023-10-09 07:34:52,315][60143] Updated weights for policy 0, policy_version 82702 (0.0008) +[2023-10-09 07:34:52,687][60143] Updated weights for policy 0, policy_version 82712 (0.0007) +[2023-10-09 07:34:55,238][60144] Updated weights for policy 1, policy_version 83652 (0.0007) +[2023-10-09 07:34:55,602][60144] Updated weights for policy 1, policy_version 83662 (0.0010) +[2023-10-09 07:34:55,962][60144] Updated weights for policy 1, policy_version 83672 (0.0008) +[2023-10-09 07:34:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 170360832. Throughput: 0: 1698.4, 1: 1743.1. Samples: 42603016. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:34:56,053][59242] Avg episode reward: [(0, '33.810'), (1, '32.560')] +[2023-10-09 07:34:56,784][60143] Updated weights for policy 0, policy_version 82722 (0.0008) +[2023-10-09 07:34:57,154][60143] Updated weights for policy 0, policy_version 82732 (0.0009) +[2023-10-09 07:34:57,528][60143] Updated weights for policy 0, policy_version 82742 (0.0008) +[2023-10-09 07:34:57,902][60143] Updated weights for policy 0, policy_version 82752 (0.0009) +[2023-10-09 07:34:59,838][60144] Updated weights for policy 1, policy_version 83682 (0.0007) +[2023-10-09 07:35:00,254][60144] Updated weights for policy 1, policy_version 83692 (0.0009) +[2023-10-09 07:35:00,628][60144] Updated weights for policy 1, policy_version 83702 (0.0009) +[2023-10-09 07:35:00,985][60144] Updated weights for policy 1, policy_version 83712 (0.0008) +[2023-10-09 07:35:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 170459136. Throughput: 0: 1713.6, 1: 1728.1. Samples: 42623484. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:35:01,053][59242] Avg episode reward: [(0, '31.820'), (1, '32.340')] +[2023-10-09 07:35:02,008][60143] Updated weights for policy 0, policy_version 82762 (0.0009) +[2023-10-09 07:35:02,383][60143] Updated weights for policy 0, policy_version 82772 (0.0009) +[2023-10-09 07:35:02,768][60143] Updated weights for policy 0, policy_version 82782 (0.0009) +[2023-10-09 07:35:04,986][60144] Updated weights for policy 1, policy_version 83722 (0.0007) +[2023-10-09 07:35:05,358][60144] Updated weights for policy 1, policy_version 83732 (0.0010) +[2023-10-09 07:35:05,716][60144] Updated weights for policy 1, policy_version 83742 (0.0009) +[2023-10-09 07:35:06,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 170524672. Throughput: 0: 1687.9, 1: 1749.5. Samples: 42633772. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:35:06,052][59242] Avg episode reward: [(0, '33.990'), (1, '31.980')] +[2023-10-09 07:35:06,931][60143] Updated weights for policy 0, policy_version 82792 (0.0008) +[2023-10-09 07:35:07,303][60143] Updated weights for policy 0, policy_version 82802 (0.0007) +[2023-10-09 07:35:07,674][60143] Updated weights for policy 0, policy_version 82812 (0.0009) +[2023-10-09 07:35:09,648][60144] Updated weights for policy 1, policy_version 83752 (0.0008) +[2023-10-09 07:35:10,014][60144] Updated weights for policy 1, policy_version 83762 (0.0007) +[2023-10-09 07:35:10,383][60144] Updated weights for policy 1, policy_version 83772 (0.0008) +[2023-10-09 07:35:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 170590208. Throughput: 0: 1720.0, 1: 1737.0. Samples: 42654714. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:35:11,053][59242] Avg episode reward: [(0, '31.940'), (1, '32.150')] +[2023-10-09 07:35:11,724][60143] Updated weights for policy 0, policy_version 82822 (0.0007) +[2023-10-09 07:35:12,099][60143] Updated weights for policy 0, policy_version 82832 (0.0008) +[2023-10-09 07:35:12,473][60143] Updated weights for policy 0, policy_version 82842 (0.0007) +[2023-10-09 07:35:14,276][60144] Updated weights for policy 1, policy_version 83782 (0.0007) +[2023-10-09 07:35:14,644][60144] Updated weights for policy 1, policy_version 83792 (0.0008) +[2023-10-09 07:35:15,012][60144] Updated weights for policy 1, policy_version 83802 (0.0009) +[2023-10-09 07:35:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 170655744. Throughput: 0: 1722.5, 1: 1714.8. Samples: 42675014. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-09 07:35:16,053][59242] Avg episode reward: [(0, '30.830'), (1, '31.400')] +[2023-10-09 07:35:16,294][60143] Updated weights for policy 0, policy_version 82852 (0.0008) +[2023-10-09 07:35:16,674][60143] Updated weights for policy 0, policy_version 82862 (0.0009) +[2023-10-09 07:35:17,036][60143] Updated weights for policy 0, policy_version 82872 (0.0009) +[2023-10-09 07:35:18,850][60144] Updated weights for policy 1, policy_version 83812 (0.0010) +[2023-10-09 07:35:19,221][60144] Updated weights for policy 1, policy_version 83822 (0.0008) +[2023-10-09 07:35:19,583][60144] Updated weights for policy 1, policy_version 83832 (0.0009) +[2023-10-09 07:35:21,030][60143] Updated weights for policy 0, policy_version 82882 (0.0008) +[2023-10-09 07:35:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 170721280. Throughput: 0: 1705.3, 1: 1752.5. Samples: 42685970. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:21,052][59242] Avg episode reward: [(0, '31.790'), (1, '31.500')] +[2023-10-09 07:35:21,398][60143] Updated weights for policy 0, policy_version 82892 (0.0009) +[2023-10-09 07:35:21,782][60143] Updated weights for policy 0, policy_version 82902 (0.0011) +[2023-10-09 07:35:22,144][60143] Updated weights for policy 0, policy_version 82912 (0.0011) +[2023-10-09 07:35:23,431][60144] Updated weights for policy 1, policy_version 83842 (0.0009) +[2023-10-09 07:35:23,792][60144] Updated weights for policy 1, policy_version 83852 (0.0008) +[2023-10-09 07:35:24,161][60144] Updated weights for policy 1, policy_version 83862 (0.0007) +[2023-10-09 07:35:24,525][60144] Updated weights for policy 1, policy_version 83872 (0.0007) +[2023-10-09 07:35:26,030][60143] Updated weights for policy 0, policy_version 82922 (0.0008) +[2023-10-09 07:35:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 170786816. 
Throughput: 0: 1717.6, 1: 1723.7. Samples: 42705868. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:26,053][59242] Avg episode reward: [(0, '31.730'), (1, '33.530')] +[2023-10-09 07:35:26,402][60143] Updated weights for policy 0, policy_version 82932 (0.0008) +[2023-10-09 07:35:26,762][60143] Updated weights for policy 0, policy_version 82942 (0.0010) +[2023-10-09 07:35:28,528][60144] Updated weights for policy 1, policy_version 83882 (0.0007) +[2023-10-09 07:35:28,890][60144] Updated weights for policy 1, policy_version 83892 (0.0007) +[2023-10-09 07:35:29,258][60144] Updated weights for policy 1, policy_version 83902 (0.0008) +[2023-10-09 07:35:30,735][60143] Updated weights for policy 0, policy_version 82952 (0.0009) +[2023-10-09 07:35:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 170852352. Throughput: 0: 1709.9, 1: 1726.1. Samples: 42726908. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:31,053][59242] Avg episode reward: [(0, '31.510'), (1, '33.810')] +[2023-10-09 07:35:31,059][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth... +[2023-10-09 07:35:31,094][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000082304_84279296.pth +[2023-10-09 07:35:31,102][60143] Updated weights for policy 0, policy_version 82962 (0.0007) +[2023-10-09 07:35:31,472][60143] Updated weights for policy 0, policy_version 82972 (0.0008) +[2023-10-09 07:35:31,617][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000082976_84967424.pth... 
+[2023-10-09 07:35:31,657][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000081376_83329024.pth +[2023-10-09 07:35:33,188][60144] Updated weights for policy 1, policy_version 83912 (0.0009) +[2023-10-09 07:35:33,553][60144] Updated weights for policy 1, policy_version 83922 (0.0007) +[2023-10-09 07:35:33,928][60144] Updated weights for policy 1, policy_version 83932 (0.0008) +[2023-10-09 07:35:35,495][60143] Updated weights for policy 0, policy_version 82982 (0.0008) +[2023-10-09 07:35:35,863][60143] Updated weights for policy 0, policy_version 82992 (0.0011) +[2023-10-09 07:35:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 170917888. Throughput: 0: 1712.2, 1: 1740.1. Samples: 42737066. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:36,053][59242] Avg episode reward: [(0, '31.050'), (1, '34.260')] +[2023-10-09 07:35:36,227][60143] Updated weights for policy 0, policy_version 83002 (0.0008) +[2023-10-09 07:35:37,839][60144] Updated weights for policy 1, policy_version 83942 (0.0008) +[2023-10-09 07:35:38,204][60144] Updated weights for policy 1, policy_version 83952 (0.0008) +[2023-10-09 07:35:38,566][60144] Updated weights for policy 1, policy_version 83962 (0.0008) +[2023-10-09 07:35:40,302][60143] Updated weights for policy 0, policy_version 83012 (0.0009) +[2023-10-09 07:35:40,665][60143] Updated weights for policy 0, policy_version 83022 (0.0008) +[2023-10-09 07:35:41,033][60143] Updated weights for policy 0, policy_version 83032 (0.0010) +[2023-10-09 07:35:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 170983424. Throughput: 0: 1714.0, 1: 1723.7. Samples: 42757712. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:41,053][59242] Avg episode reward: [(0, '30.580'), (1, '33.400')] +[2023-10-09 07:35:42,447][60144] Updated weights for policy 1, policy_version 83972 (0.0008) +[2023-10-09 07:35:42,821][60144] Updated weights for policy 1, policy_version 83982 (0.0010) +[2023-10-09 07:35:43,174][60144] Updated weights for policy 1, policy_version 83992 (0.0009) +[2023-10-09 07:35:44,885][60143] Updated weights for policy 0, policy_version 83042 (0.0007) +[2023-10-09 07:35:45,254][60143] Updated weights for policy 0, policy_version 83052 (0.0007) +[2023-10-09 07:35:45,621][60143] Updated weights for policy 0, policy_version 83062 (0.0009) +[2023-10-09 07:35:45,988][60143] Updated weights for policy 0, policy_version 83072 (0.0008) +[2023-10-09 07:35:46,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 171081728. Throughput: 0: 1701.5, 1: 1743.4. Samples: 42778504. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:46,053][59242] Avg episode reward: [(0, '30.440'), (1, '33.950')] +[2023-10-09 07:35:47,210][60144] Updated weights for policy 1, policy_version 84002 (0.0007) +[2023-10-09 07:35:47,632][60144] Updated weights for policy 1, policy_version 84012 (0.0008) +[2023-10-09 07:35:47,990][60144] Updated weights for policy 1, policy_version 84022 (0.0010) +[2023-10-09 07:35:48,362][60144] Updated weights for policy 1, policy_version 84032 (0.0009) +[2023-10-09 07:35:49,993][60143] Updated weights for policy 0, policy_version 83082 (0.0007) +[2023-10-09 07:35:50,365][60143] Updated weights for policy 0, policy_version 83092 (0.0008) +[2023-10-09 07:35:50,730][60143] Updated weights for policy 0, policy_version 83102 (0.0009) +[2023-10-09 07:35:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 171147264. Throughput: 0: 1716.2, 1: 1718.6. Samples: 42788338. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:51,052][59242] Avg episode reward: [(0, '31.850'), (1, '33.330')] +[2023-10-09 07:35:52,253][60144] Updated weights for policy 1, policy_version 84042 (0.0007) +[2023-10-09 07:35:52,628][60144] Updated weights for policy 1, policy_version 84052 (0.0008) +[2023-10-09 07:35:52,994][60144] Updated weights for policy 1, policy_version 84062 (0.0007) +[2023-10-09 07:35:54,816][60143] Updated weights for policy 0, policy_version 83112 (0.0008) +[2023-10-09 07:35:55,186][60143] Updated weights for policy 0, policy_version 83122 (0.0009) +[2023-10-09 07:35:55,549][60143] Updated weights for policy 0, policy_version 83132 (0.0010) +[2023-10-09 07:35:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 171212800. Throughput: 0: 1717.2, 1: 1722.5. Samples: 42809500. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:35:56,053][59242] Avg episode reward: [(0, '32.520'), (1, '32.420')] +[2023-10-09 07:35:56,831][60144] Updated weights for policy 1, policy_version 84072 (0.0009) +[2023-10-09 07:35:57,193][60144] Updated weights for policy 1, policy_version 84082 (0.0010) +[2023-10-09 07:35:57,556][60144] Updated weights for policy 1, policy_version 84092 (0.0008) +[2023-10-09 07:35:59,541][60143] Updated weights for policy 0, policy_version 83142 (0.0008) +[2023-10-09 07:35:59,902][60143] Updated weights for policy 0, policy_version 83152 (0.0009) +[2023-10-09 07:36:00,269][60143] Updated weights for policy 0, policy_version 83162 (0.0008) +[2023-10-09 07:36:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 171278336. Throughput: 0: 1687.2, 1: 1752.0. Samples: 42829774. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:36:01,053][59242] Avg episode reward: [(0, '33.970'), (1, '31.580')] +[2023-10-09 07:36:01,350][60144] Updated weights for policy 1, policy_version 84102 (0.0009) +[2023-10-09 07:36:01,717][60144] Updated weights for policy 1, policy_version 84112 (0.0009) +[2023-10-09 07:36:02,086][60144] Updated weights for policy 1, policy_version 84122 (0.0008) +[2023-10-09 07:36:04,355][60143] Updated weights for policy 0, policy_version 83172 (0.0008) +[2023-10-09 07:36:04,720][60143] Updated weights for policy 0, policy_version 83182 (0.0007) +[2023-10-09 07:36:05,083][60143] Updated weights for policy 0, policy_version 83192 (0.0007) +[2023-10-09 07:36:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 171343872. Throughput: 0: 1715.7, 1: 1716.0. Samples: 42840396. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-09 07:36:06,052][59242] Avg episode reward: [(0, '33.330'), (1, '30.630')] +[2023-10-09 07:36:06,149][60144] Updated weights for policy 1, policy_version 84132 (0.0008) +[2023-10-09 07:36:06,520][60144] Updated weights for policy 1, policy_version 84142 (0.0007) +[2023-10-09 07:36:06,877][60144] Updated weights for policy 1, policy_version 84152 (0.0008) +[2023-10-09 07:36:09,153][60143] Updated weights for policy 0, policy_version 83202 (0.0007) +[2023-10-09 07:36:09,515][60143] Updated weights for policy 0, policy_version 83212 (0.0007) +[2023-10-09 07:36:09,886][60143] Updated weights for policy 0, policy_version 83222 (0.0008) +[2023-10-09 07:36:10,256][60143] Updated weights for policy 0, policy_version 83232 (0.0008) +[2023-10-09 07:36:10,772][60144] Updated weights for policy 1, policy_version 84162 (0.0008) +[2023-10-09 07:36:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 171409408. Throughput: 0: 1697.5, 1: 1744.8. Samples: 42860772. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:11,052][59242] Avg episode reward: [(0, '33.980'), (1, '30.050')] +[2023-10-09 07:36:11,144][60144] Updated weights for policy 1, policy_version 84172 (0.0009) +[2023-10-09 07:36:11,517][60144] Updated weights for policy 1, policy_version 84182 (0.0010) +[2023-10-09 07:36:11,889][60144] Updated weights for policy 1, policy_version 84192 (0.0007) +[2023-10-09 07:36:14,303][60143] Updated weights for policy 0, policy_version 83242 (0.0009) +[2023-10-09 07:36:14,671][60143] Updated weights for policy 0, policy_version 83252 (0.0009) +[2023-10-09 07:36:15,041][60143] Updated weights for policy 0, policy_version 83262 (0.0009) +[2023-10-09 07:36:15,793][60144] Updated weights for policy 1, policy_version 84202 (0.0008) +[2023-10-09 07:36:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 171474944. Throughput: 0: 1684.0, 1: 1741.1. Samples: 42881040. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:16,053][59242] Avg episode reward: [(0, '33.240'), (1, '31.390')] +[2023-10-09 07:36:16,167][60144] Updated weights for policy 1, policy_version 84212 (0.0009) +[2023-10-09 07:36:16,535][60144] Updated weights for policy 1, policy_version 84222 (0.0008) +[2023-10-09 07:36:18,940][60143] Updated weights for policy 0, policy_version 83272 (0.0009) +[2023-10-09 07:36:19,313][60143] Updated weights for policy 0, policy_version 83282 (0.0010) +[2023-10-09 07:36:19,688][60143] Updated weights for policy 0, policy_version 83292 (0.0011) +[2023-10-09 07:36:20,533][60144] Updated weights for policy 1, policy_version 84232 (0.0008) +[2023-10-09 07:36:20,894][60144] Updated weights for policy 1, policy_version 84242 (0.0009) +[2023-10-09 07:36:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 171540480. Throughput: 0: 1711.1, 1: 1726.6. Samples: 42891760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:21,053][59242] Avg episode reward: [(0, '33.220'), (1, '31.280')] +[2023-10-09 07:36:21,262][60144] Updated weights for policy 1, policy_version 84252 (0.0011) +[2023-10-09 07:36:23,783][60143] Updated weights for policy 0, policy_version 83302 (0.0009) +[2023-10-09 07:36:24,149][60143] Updated weights for policy 0, policy_version 83312 (0.0007) +[2023-10-09 07:36:24,519][60143] Updated weights for policy 0, policy_version 83322 (0.0009) +[2023-10-09 07:36:25,299][60144] Updated weights for policy 1, policy_version 84262 (0.0009) +[2023-10-09 07:36:25,668][60144] Updated weights for policy 1, policy_version 84272 (0.0008) +[2023-10-09 07:36:26,025][60144] Updated weights for policy 1, policy_version 84282 (0.0010) +[2023-10-09 07:36:26,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 171606016. Throughput: 0: 1687.6, 1: 1742.0. Samples: 42912046. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:26,053][59242] Avg episode reward: [(0, '34.390'), (1, '30.360')] +[2023-10-09 07:36:28,611][60143] Updated weights for policy 0, policy_version 83332 (0.0010) +[2023-10-09 07:36:28,974][60143] Updated weights for policy 0, policy_version 83342 (0.0008) +[2023-10-09 07:36:29,343][60143] Updated weights for policy 0, policy_version 83352 (0.0008) +[2023-10-09 07:36:29,991][60144] Updated weights for policy 1, policy_version 84292 (0.0008) +[2023-10-09 07:36:30,348][60144] Updated weights for policy 1, policy_version 84302 (0.0010) +[2023-10-09 07:36:30,712][60144] Updated weights for policy 1, policy_version 84312 (0.0011) +[2023-10-09 07:36:31,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 171704320. Throughput: 0: 1690.0, 1: 1719.1. Samples: 42931910. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:31,053][59242] Avg episode reward: [(0, '34.800'), (1, '31.170')] +[2023-10-09 07:36:33,422][60143] Updated weights for policy 0, policy_version 83362 (0.0007) +[2023-10-09 07:36:33,790][60143] Updated weights for policy 0, policy_version 83372 (0.0009) +[2023-10-09 07:36:34,154][60143] Updated weights for policy 0, policy_version 83382 (0.0007) +[2023-10-09 07:36:34,519][60143] Updated weights for policy 0, policy_version 83392 (0.0008) +[2023-10-09 07:36:34,714][60144] Updated weights for policy 1, policy_version 84322 (0.0010) +[2023-10-09 07:36:35,110][60144] Updated weights for policy 1, policy_version 84332 (0.0009) +[2023-10-09 07:36:35,473][60144] Updated weights for policy 1, policy_version 84342 (0.0008) +[2023-10-09 07:36:35,832][60144] Updated weights for policy 1, policy_version 84352 (0.0008) +[2023-10-09 07:36:36,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 171769856. Throughput: 0: 1700.9, 1: 1739.9. Samples: 42943174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:36,053][59242] Avg episode reward: [(0, '33.040'), (1, '30.870')] +[2023-10-09 07:36:38,551][60143] Updated weights for policy 0, policy_version 83402 (0.0008) +[2023-10-09 07:36:38,911][60143] Updated weights for policy 0, policy_version 83412 (0.0009) +[2023-10-09 07:36:39,274][60143] Updated weights for policy 0, policy_version 83422 (0.0007) +[2023-10-09 07:36:39,784][60144] Updated weights for policy 1, policy_version 84362 (0.0010) +[2023-10-09 07:36:40,155][60144] Updated weights for policy 1, policy_version 84372 (0.0008) +[2023-10-09 07:36:40,527][60144] Updated weights for policy 1, policy_version 84382 (0.0008) +[2023-10-09 07:36:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 171835392. Throughput: 0: 1672.6, 1: 1733.6. Samples: 42962778. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:41,052][59242] Avg episode reward: [(0, '32.980'), (1, '30.730')] +[2023-10-09 07:36:43,222][60143] Updated weights for policy 0, policy_version 83432 (0.0008) +[2023-10-09 07:36:43,592][60143] Updated weights for policy 0, policy_version 83442 (0.0008) +[2023-10-09 07:36:43,970][60143] Updated weights for policy 0, policy_version 83452 (0.0009) +[2023-10-09 07:36:44,410][60144] Updated weights for policy 1, policy_version 84392 (0.0007) +[2023-10-09 07:36:44,767][60144] Updated weights for policy 1, policy_version 84402 (0.0008) +[2023-10-09 07:36:45,131][60144] Updated weights for policy 1, policy_version 84412 (0.0011) +[2023-10-09 07:36:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 171900928. Throughput: 0: 1705.2, 1: 1699.3. Samples: 42982976. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:46,053][59242] Avg episode reward: [(0, '32.780'), (1, '30.640')] +[2023-10-09 07:36:47,913][60143] Updated weights for policy 0, policy_version 83462 (0.0008) +[2023-10-09 07:36:48,291][60143] Updated weights for policy 0, policy_version 83472 (0.0007) +[2023-10-09 07:36:48,660][60143] Updated weights for policy 0, policy_version 83482 (0.0007) +[2023-10-09 07:36:49,097][60144] Updated weights for policy 1, policy_version 84422 (0.0009) +[2023-10-09 07:36:49,467][60144] Updated weights for policy 1, policy_version 84432 (0.0008) +[2023-10-09 07:36:49,832][60144] Updated weights for policy 1, policy_version 84442 (0.0008) +[2023-10-09 07:36:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 171966464. Throughput: 0: 1687.9, 1: 1729.4. Samples: 42994176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:51,053][59242] Avg episode reward: [(0, '33.390'), (1, '32.410')] +[2023-10-09 07:36:52,534][60143] Updated weights for policy 0, policy_version 83492 (0.0009) +[2023-10-09 07:36:52,896][60143] Updated weights for policy 0, policy_version 83502 (0.0010) +[2023-10-09 07:36:53,267][60143] Updated weights for policy 0, policy_version 83512 (0.0007) +[2023-10-09 07:36:53,735][60144] Updated weights for policy 1, policy_version 84452 (0.0007) +[2023-10-09 07:36:54,105][60144] Updated weights for policy 1, policy_version 84462 (0.0008) +[2023-10-09 07:36:54,471][60144] Updated weights for policy 1, policy_version 84472 (0.0007) +[2023-10-09 07:36:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172032000. Throughput: 0: 1698.9, 1: 1705.4. Samples: 43013964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:36:56,053][59242] Avg episode reward: [(0, '33.340'), (1, '33.240')] +[2023-10-09 07:36:57,060][60143] Updated weights for policy 0, policy_version 83522 (0.0007) +[2023-10-09 07:36:57,435][60143] Updated weights for policy 0, policy_version 83532 (0.0008) +[2023-10-09 07:36:57,810][60143] Updated weights for policy 0, policy_version 83542 (0.0008) +[2023-10-09 07:36:58,174][60143] Updated weights for policy 0, policy_version 83552 (0.0009) +[2023-10-09 07:36:58,408][60144] Updated weights for policy 1, policy_version 84482 (0.0010) +[2023-10-09 07:36:58,776][60144] Updated weights for policy 1, policy_version 84492 (0.0010) +[2023-10-09 07:36:59,140][60144] Updated weights for policy 1, policy_version 84502 (0.0011) +[2023-10-09 07:36:59,510][60144] Updated weights for policy 1, policy_version 84512 (0.0011) +[2023-10-09 07:37:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172097536. Throughput: 0: 1715.1, 1: 1702.6. Samples: 43034838. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:37:01,053][59242] Avg episode reward: [(0, '33.090'), (1, '32.660')] +[2023-10-09 07:37:02,202][60143] Updated weights for policy 0, policy_version 83562 (0.0008) +[2023-10-09 07:37:02,575][60143] Updated weights for policy 0, policy_version 83572 (0.0009) +[2023-10-09 07:37:02,944][60143] Updated weights for policy 0, policy_version 83582 (0.0007) +[2023-10-09 07:37:03,589][60144] Updated weights for policy 1, policy_version 84522 (0.0007) +[2023-10-09 07:37:03,951][60144] Updated weights for policy 1, policy_version 84532 (0.0007) +[2023-10-09 07:37:04,321][60144] Updated weights for policy 1, policy_version 84542 (0.0008) +[2023-10-09 07:37:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172163072. Throughput: 0: 1685.8, 1: 1720.0. Samples: 43045022. Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:06,053][59242] Avg episode reward: [(0, '33.390'), (1, '33.260')] +[2023-10-09 07:37:06,974][60143] Updated weights for policy 0, policy_version 83592 (0.0007) +[2023-10-09 07:37:07,341][60143] Updated weights for policy 0, policy_version 83602 (0.0009) +[2023-10-09 07:37:07,712][60143] Updated weights for policy 0, policy_version 83612 (0.0007) +[2023-10-09 07:37:08,145][60144] Updated weights for policy 1, policy_version 84552 (0.0008) +[2023-10-09 07:37:08,502][60144] Updated weights for policy 1, policy_version 84562 (0.0007) +[2023-10-09 07:37:08,877][60144] Updated weights for policy 1, policy_version 84572 (0.0009) +[2023-10-09 07:37:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172228608. Throughput: 0: 1709.6, 1: 1699.3. Samples: 43065450. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:11,053][59242] Avg episode reward: [(0, '34.730'), (1, '32.610')] +[2023-10-09 07:37:11,643][60143] Updated weights for policy 0, policy_version 83622 (0.0007) +[2023-10-09 07:37:12,017][60143] Updated weights for policy 0, policy_version 83632 (0.0007) +[2023-10-09 07:37:12,391][60143] Updated weights for policy 0, policy_version 83642 (0.0010) +[2023-10-09 07:37:12,827][60144] Updated weights for policy 1, policy_version 84582 (0.0008) +[2023-10-09 07:37:13,195][60144] Updated weights for policy 1, policy_version 84592 (0.0009) +[2023-10-09 07:37:13,560][60144] Updated weights for policy 1, policy_version 84602 (0.0009) +[2023-10-09 07:37:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172294144. Throughput: 0: 1722.5, 1: 1720.1. Samples: 43086826. Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:16,053][59242] Avg episode reward: [(0, '35.300'), (1, '32.420')] +[2023-10-09 07:37:16,500][60143] Updated weights for policy 0, policy_version 83652 (0.0008) +[2023-10-09 07:37:16,862][60143] Updated weights for policy 0, policy_version 83662 (0.0008) +[2023-10-09 07:37:17,233][60143] Updated weights for policy 0, policy_version 83672 (0.0007) +[2023-10-09 07:37:17,425][60144] Updated weights for policy 1, policy_version 84612 (0.0008) +[2023-10-09 07:37:17,784][60144] Updated weights for policy 1, policy_version 84622 (0.0007) +[2023-10-09 07:37:18,148][60144] Updated weights for policy 1, policy_version 84632 (0.0008) +[2023-10-09 07:37:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 172359680. Throughput: 0: 1696.7, 1: 1705.2. Samples: 43096262. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:21,053][59242] Avg episode reward: [(0, '33.720'), (1, '33.450')] +[2023-10-09 07:37:21,130][60143] Updated weights for policy 0, policy_version 83682 (0.0008) +[2023-10-09 07:37:21,497][60143] Updated weights for policy 0, policy_version 83692 (0.0008) +[2023-10-09 07:37:21,862][60143] Updated weights for policy 0, policy_version 83702 (0.0010) +[2023-10-09 07:37:22,087][60144] Updated weights for policy 1, policy_version 84642 (0.0007) +[2023-10-09 07:37:22,222][60143] Updated weights for policy 0, policy_version 83712 (0.0008) +[2023-10-09 07:37:22,457][60144] Updated weights for policy 1, policy_version 84652 (0.0007) +[2023-10-09 07:37:22,818][60144] Updated weights for policy 1, policy_version 84662 (0.0007) +[2023-10-09 07:37:23,190][60144] Updated weights for policy 1, policy_version 84672 (0.0008) +[2023-10-09 07:37:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 172425216. Throughput: 0: 1731.5, 1: 1712.8. Samples: 43117770. Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:26,052][59242] Avg episode reward: [(0, '33.040'), (1, '33.110')] +[2023-10-09 07:37:26,190][60143] Updated weights for policy 0, policy_version 83722 (0.0007) +[2023-10-09 07:37:26,559][60143] Updated weights for policy 0, policy_version 83732 (0.0009) +[2023-10-09 07:37:26,932][60143] Updated weights for policy 0, policy_version 83742 (0.0007) +[2023-10-09 07:37:27,131][60144] Updated weights for policy 1, policy_version 84682 (0.0007) +[2023-10-09 07:37:27,492][60144] Updated weights for policy 1, policy_version 84692 (0.0008) +[2023-10-09 07:37:27,855][60144] Updated weights for policy 1, policy_version 84702 (0.0010) +[2023-10-09 07:37:30,746][60143] Updated weights for policy 0, policy_version 83752 (0.0008) +[2023-10-09 07:37:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 172490752. 
Throughput: 0: 1727.0, 1: 1744.2. Samples: 43139180. Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:31,053][59242] Avg episode reward: [(0, '33.200'), (1, '33.980')] +[2023-10-09 07:37:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000084704_86736896.pth... +[2023-10-09 07:37:31,103][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth +[2023-10-09 07:37:31,128][60143] Updated weights for policy 0, policy_version 83762 (0.0010) +[2023-10-09 07:37:31,489][60143] Updated weights for policy 0, policy_version 83772 (0.0008) +[2023-10-09 07:37:31,633][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000083776_85786624.pth... +[2023-10-09 07:37:31,669][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000082176_84148224.pth +[2023-10-09 07:37:31,878][60144] Updated weights for policy 1, policy_version 84712 (0.0008) +[2023-10-09 07:37:32,246][60144] Updated weights for policy 1, policy_version 84722 (0.0007) +[2023-10-09 07:37:32,608][60144] Updated weights for policy 1, policy_version 84732 (0.0008) +[2023-10-09 07:37:35,537][60143] Updated weights for policy 0, policy_version 83782 (0.0010) +[2023-10-09 07:37:35,914][60143] Updated weights for policy 0, policy_version 83792 (0.0008) +[2023-10-09 07:37:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 172556288. Throughput: 0: 1722.7, 1: 1714.5. Samples: 43148850. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:36,053][59242] Avg episode reward: [(0, '33.230'), (1, '34.540')] +[2023-10-09 07:37:36,289][60143] Updated weights for policy 0, policy_version 83802 (0.0008) +[2023-10-09 07:37:36,369][60144] Updated weights for policy 1, policy_version 84742 (0.0008) +[2023-10-09 07:37:36,733][60144] Updated weights for policy 1, policy_version 84752 (0.0010) +[2023-10-09 07:37:37,111][60144] Updated weights for policy 1, policy_version 84762 (0.0009) +[2023-10-09 07:37:40,269][60143] Updated weights for policy 0, policy_version 83812 (0.0007) +[2023-10-09 07:37:40,643][60143] Updated weights for policy 0, policy_version 83822 (0.0009) +[2023-10-09 07:37:41,018][60143] Updated weights for policy 0, policy_version 83832 (0.0008) +[2023-10-09 07:37:41,052][59242] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 172621824. Throughput: 0: 1730.8, 1: 1744.1. Samples: 43170334. Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:41,052][59242] Avg episode reward: [(0, '34.430'), (1, '33.680')] +[2023-10-09 07:37:41,192][60144] Updated weights for policy 1, policy_version 84772 (0.0008) +[2023-10-09 07:37:41,558][60144] Updated weights for policy 1, policy_version 84782 (0.0007) +[2023-10-09 07:37:41,926][60144] Updated weights for policy 1, policy_version 84792 (0.0007) +[2023-10-09 07:37:45,060][60143] Updated weights for policy 0, policy_version 83842 (0.0008) +[2023-10-09 07:37:45,432][60143] Updated weights for policy 0, policy_version 83852 (0.0008) +[2023-10-09 07:37:45,801][60143] Updated weights for policy 0, policy_version 83862 (0.0009) +[2023-10-09 07:37:45,804][60144] Updated weights for policy 1, policy_version 84802 (0.0007) +[2023-10-09 07:37:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 172687360. Throughput: 0: 1719.4, 1: 1754.4. Samples: 43191158. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:46,053][59242] Avg episode reward: [(0, '34.140'), (1, '34.950')] +[2023-10-09 07:37:46,173][60143] Updated weights for policy 0, policy_version 83872 (0.0008) +[2023-10-09 07:37:46,176][60144] Updated weights for policy 1, policy_version 84812 (0.0008) +[2023-10-09 07:37:46,549][60144] Updated weights for policy 1, policy_version 84822 (0.0007) +[2023-10-09 07:37:46,904][60144] Updated weights for policy 1, policy_version 84832 (0.0009) +[2023-10-09 07:37:50,198][60143] Updated weights for policy 0, policy_version 83882 (0.0007) +[2023-10-09 07:37:50,560][60143] Updated weights for policy 0, policy_version 83892 (0.0007) +[2023-10-09 07:37:50,720][60144] Updated weights for policy 1, policy_version 84842 (0.0008) +[2023-10-09 07:37:50,934][60143] Updated weights for policy 0, policy_version 83902 (0.0007) +[2023-10-09 07:37:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 172785664. Throughput: 0: 1731.7, 1: 1735.2. Samples: 43201034. 
Policy #0 lag: (min: 27.0, avg: 27.1, max: 34.0) +[2023-10-09 07:37:51,052][59242] Avg episode reward: [(0, '34.680'), (1, '35.930')] +[2023-10-09 07:37:51,082][60144] Updated weights for policy 1, policy_version 84852 (0.0009) +[2023-10-09 07:37:51,445][60144] Updated weights for policy 1, policy_version 84862 (0.0009) +[2023-10-09 07:37:54,685][60143] Updated weights for policy 0, policy_version 83912 (0.0008) +[2023-10-09 07:37:55,060][60143] Updated weights for policy 0, policy_version 83922 (0.0007) +[2023-10-09 07:37:55,316][60144] Updated weights for policy 1, policy_version 84872 (0.0009) +[2023-10-09 07:37:55,431][60143] Updated weights for policy 0, policy_version 83932 (0.0007) +[2023-10-09 07:37:55,678][60144] Updated weights for policy 1, policy_version 84882 (0.0008) +[2023-10-09 07:37:56,049][60144] Updated weights for policy 1, policy_version 84892 (0.0007) +[2023-10-09 07:37:56,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 172851200. Throughput: 0: 1726.6, 1: 1757.4. Samples: 43222230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:37:56,053][59242] Avg episode reward: [(0, '34.620'), (1, '35.690')] +[2023-10-09 07:37:59,444][60143] Updated weights for policy 0, policy_version 83942 (0.0008) +[2023-10-09 07:37:59,816][60143] Updated weights for policy 0, policy_version 83952 (0.0007) +[2023-10-09 07:37:59,944][60144] Updated weights for policy 1, policy_version 84902 (0.0008) +[2023-10-09 07:38:00,185][60143] Updated weights for policy 0, policy_version 83962 (0.0009) +[2023-10-09 07:38:00,308][60144] Updated weights for policy 1, policy_version 84912 (0.0008) +[2023-10-09 07:38:00,673][60144] Updated weights for policy 1, policy_version 84922 (0.0009) +[2023-10-09 07:38:01,052][59242] Fps is (10 sec: 16383.2, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 172949504. Throughput: 0: 1696.1, 1: 1739.6. Samples: 43241434. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:01,054][59242] Avg episode reward: [(0, '34.490'), (1, '36.110')] +[2023-10-09 07:38:04,080][60143] Updated weights for policy 0, policy_version 83972 (0.0010) +[2023-10-09 07:38:04,447][60143] Updated weights for policy 0, policy_version 83982 (0.0011) +[2023-10-09 07:38:04,749][60144] Updated weights for policy 1, policy_version 84932 (0.0008) +[2023-10-09 07:38:04,822][60143] Updated weights for policy 0, policy_version 83992 (0.0008) +[2023-10-09 07:38:05,116][60144] Updated weights for policy 1, policy_version 84942 (0.0007) +[2023-10-09 07:38:05,484][60144] Updated weights for policy 1, policy_version 84952 (0.0007) +[2023-10-09 07:38:06,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 173015040. Throughput: 0: 1727.2, 1: 1754.9. Samples: 43252958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:06,053][59242] Avg episode reward: [(0, '34.840'), (1, '35.420')] +[2023-10-09 07:38:08,874][60143] Updated weights for policy 0, policy_version 84002 (0.0008) +[2023-10-09 07:38:09,244][60143] Updated weights for policy 0, policy_version 84012 (0.0011) +[2023-10-09 07:38:09,465][60144] Updated weights for policy 1, policy_version 84962 (0.0010) +[2023-10-09 07:38:09,620][60143] Updated weights for policy 0, policy_version 84022 (0.0008) +[2023-10-09 07:38:09,836][60144] Updated weights for policy 1, policy_version 84972 (0.0009) +[2023-10-09 07:38:09,979][60143] Updated weights for policy 0, policy_version 84032 (0.0009) +[2023-10-09 07:38:10,209][60144] Updated weights for policy 1, policy_version 84982 (0.0010) +[2023-10-09 07:38:10,572][60144] Updated weights for policy 1, policy_version 84992 (0.0009) +[2023-10-09 07:38:11,052][59242] Fps is (10 sec: 13107.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 173080576. Throughput: 0: 1700.0, 1: 1747.2. Samples: 43272896. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:11,052][59242] Avg episode reward: [(0, '34.990'), (1, '35.530')] +[2023-10-09 07:38:14,153][60143] Updated weights for policy 0, policy_version 84042 (0.0009) +[2023-10-09 07:38:14,406][60144] Updated weights for policy 1, policy_version 85002 (0.0008) +[2023-10-09 07:38:14,526][60143] Updated weights for policy 0, policy_version 84052 (0.0007) +[2023-10-09 07:38:14,778][60144] Updated weights for policy 1, policy_version 85012 (0.0008) +[2023-10-09 07:38:14,886][60143] Updated weights for policy 0, policy_version 84062 (0.0007) +[2023-10-09 07:38:15,143][60144] Updated weights for policy 1, policy_version 85022 (0.0009) +[2023-10-09 07:38:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 173146112. Throughput: 0: 1687.5, 1: 1717.6. Samples: 43292410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:16,053][59242] Avg episode reward: [(0, '34.970'), (1, '35.440')] +[2023-10-09 07:38:18,858][60143] Updated weights for policy 0, policy_version 84072 (0.0008) +[2023-10-09 07:38:19,099][60144] Updated weights for policy 1, policy_version 85032 (0.0009) +[2023-10-09 07:38:19,217][60143] Updated weights for policy 0, policy_version 84082 (0.0008) +[2023-10-09 07:38:19,470][60144] Updated weights for policy 1, policy_version 85042 (0.0007) +[2023-10-09 07:38:19,591][60143] Updated weights for policy 0, policy_version 84092 (0.0008) +[2023-10-09 07:38:19,834][60144] Updated weights for policy 1, policy_version 85052 (0.0009) +[2023-10-09 07:38:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 173211648. Throughput: 0: 1708.2, 1: 1747.6. Samples: 43304358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:21,053][59242] Avg episode reward: [(0, '36.150'), (1, '34.230')] +[2023-10-09 07:38:23,654][60143] Updated weights for policy 0, policy_version 84102 (0.0009) +[2023-10-09 07:38:23,869][60144] Updated weights for policy 1, policy_version 85062 (0.0009) +[2023-10-09 07:38:24,028][60143] Updated weights for policy 0, policy_version 84112 (0.0008) +[2023-10-09 07:38:24,232][60144] Updated weights for policy 1, policy_version 85072 (0.0008) +[2023-10-09 07:38:24,401][60143] Updated weights for policy 0, policy_version 84122 (0.0010) +[2023-10-09 07:38:24,604][60144] Updated weights for policy 1, policy_version 85082 (0.0009) +[2023-10-09 07:38:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 173277184. Throughput: 0: 1684.0, 1: 1721.5. Samples: 43323584. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:26,053][59242] Avg episode reward: [(0, '35.730'), (1, '33.760')] +[2023-10-09 07:38:28,225][60144] Updated weights for policy 1, policy_version 85092 (0.0009) +[2023-10-09 07:38:28,452][60143] Updated weights for policy 0, policy_version 84132 (0.0010) +[2023-10-09 07:38:28,601][60144] Updated weights for policy 1, policy_version 85102 (0.0008) +[2023-10-09 07:38:28,823][60143] Updated weights for policy 0, policy_version 84142 (0.0009) +[2023-10-09 07:38:28,971][60144] Updated weights for policy 1, policy_version 85112 (0.0010) +[2023-10-09 07:38:29,188][60143] Updated weights for policy 0, policy_version 84152 (0.0009) +[2023-10-09 07:38:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 173342720. Throughput: 0: 1689.5, 1: 1715.0. Samples: 43344358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:31,053][59242] Avg episode reward: [(0, '36.050'), (1, '33.790')] +[2023-10-09 07:38:32,944][60144] Updated weights for policy 1, policy_version 85122 (0.0009) +[2023-10-09 07:38:33,282][60143] Updated weights for policy 0, policy_version 84162 (0.0008) +[2023-10-09 07:38:33,320][60144] Updated weights for policy 1, policy_version 85132 (0.0010) +[2023-10-09 07:38:33,657][60143] Updated weights for policy 0, policy_version 84172 (0.0007) +[2023-10-09 07:38:33,695][60144] Updated weights for policy 1, policy_version 85142 (0.0008) +[2023-10-09 07:38:34,016][60143] Updated weights for policy 0, policy_version 84182 (0.0007) +[2023-10-09 07:38:34,054][60144] Updated weights for policy 1, policy_version 85152 (0.0008) +[2023-10-09 07:38:34,389][60143] Updated weights for policy 0, policy_version 84192 (0.0009) +[2023-10-09 07:38:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 173408256. Throughput: 0: 1699.6, 1: 1729.6. Samples: 43355348. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:36,053][59242] Avg episode reward: [(0, '35.880'), (1, '33.650')] +[2023-10-09 07:38:38,034][60144] Updated weights for policy 1, policy_version 85162 (0.0008) +[2023-10-09 07:38:38,376][60143] Updated weights for policy 0, policy_version 84202 (0.0009) +[2023-10-09 07:38:38,404][60144] Updated weights for policy 1, policy_version 85172 (0.0009) +[2023-10-09 07:38:38,754][60143] Updated weights for policy 0, policy_version 84212 (0.0008) +[2023-10-09 07:38:38,764][60144] Updated weights for policy 1, policy_version 85182 (0.0008) +[2023-10-09 07:38:39,114][60143] Updated weights for policy 0, policy_version 84222 (0.0007) +[2023-10-09 07:38:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 173473792. Throughput: 0: 1680.4, 1: 1712.8. Samples: 43374924. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:41,053][59242] Avg episode reward: [(0, '37.900'), (1, '34.060')] +[2023-10-09 07:38:42,826][60144] Updated weights for policy 1, policy_version 85192 (0.0007) +[2023-10-09 07:38:42,907][60143] Updated weights for policy 0, policy_version 84232 (0.0007) +[2023-10-09 07:38:43,197][60144] Updated weights for policy 1, policy_version 85202 (0.0009) +[2023-10-09 07:38:43,280][60143] Updated weights for policy 0, policy_version 84242 (0.0009) +[2023-10-09 07:38:43,570][60144] Updated weights for policy 1, policy_version 85212 (0.0009) +[2023-10-09 07:38:43,647][60143] Updated weights for policy 0, policy_version 84252 (0.0009) +[2023-10-09 07:38:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 173539328. Throughput: 0: 1712.8, 1: 1723.0. Samples: 43396042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:38:46,053][59242] Avg episode reward: [(0, '37.020'), (1, '33.010')] +[2023-10-09 07:38:47,482][60144] Updated weights for policy 1, policy_version 85222 (0.0007) +[2023-10-09 07:38:47,697][60143] Updated weights for policy 0, policy_version 84262 (0.0010) +[2023-10-09 07:38:47,845][60144] Updated weights for policy 1, policy_version 85232 (0.0007) +[2023-10-09 07:38:48,062][60143] Updated weights for policy 0, policy_version 84272 (0.0010) +[2023-10-09 07:38:48,215][60144] Updated weights for policy 1, policy_version 85242 (0.0008) +[2023-10-09 07:38:48,436][60143] Updated weights for policy 0, policy_version 84282 (0.0010) +[2023-10-09 07:38:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 173604864. Throughput: 0: 1683.6, 1: 1708.0. Samples: 43405578. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:38:51,053][59242] Avg episode reward: [(0, '35.960'), (1, '36.560')] +[2023-10-09 07:38:52,126][60144] Updated weights for policy 1, policy_version 85252 (0.0009) +[2023-10-09 07:38:52,469][60143] Updated weights for policy 0, policy_version 84292 (0.0009) +[2023-10-09 07:38:52,491][60144] Updated weights for policy 1, policy_version 85262 (0.0008) +[2023-10-09 07:38:52,833][60143] Updated weights for policy 0, policy_version 84302 (0.0009) +[2023-10-09 07:38:52,853][60144] Updated weights for policy 1, policy_version 85272 (0.0007) +[2023-10-09 07:38:53,205][60143] Updated weights for policy 0, policy_version 84312 (0.0009) +[2023-10-09 07:38:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 173670400. Throughput: 0: 1696.7, 1: 1720.0. Samples: 43426644. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:38:56,052][59242] Avg episode reward: [(0, '37.120'), (1, '34.640')] +[2023-10-09 07:38:56,906][60144] Updated weights for policy 1, policy_version 85282 (0.0008) +[2023-10-09 07:38:57,327][60143] Updated weights for policy 0, policy_version 84322 (0.0009) +[2023-10-09 07:38:57,327][60144] Updated weights for policy 1, policy_version 85292 (0.0008) +[2023-10-09 07:38:57,688][60143] Updated weights for policy 0, policy_version 84332 (0.0007) +[2023-10-09 07:38:57,706][60144] Updated weights for policy 1, policy_version 85302 (0.0008) +[2023-10-09 07:38:58,062][60143] Updated weights for policy 0, policy_version 84342 (0.0007) +[2023-10-09 07:38:58,069][60144] Updated weights for policy 1, policy_version 85312 (0.0009) +[2023-10-09 07:38:58,434][60143] Updated weights for policy 0, policy_version 84352 (0.0011) +[2023-10-09 07:39:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 173735936. Throughput: 0: 1708.8, 1: 1742.4. Samples: 43447716. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:01,053][59242] Avg episode reward: [(0, '36.890'), (1, '34.760')] +[2023-10-09 07:39:01,894][60144] Updated weights for policy 1, policy_version 85322 (0.0008) +[2023-10-09 07:39:02,260][60144] Updated weights for policy 1, policy_version 85332 (0.0007) +[2023-10-09 07:39:02,446][60143] Updated weights for policy 0, policy_version 84362 (0.0008) +[2023-10-09 07:39:02,640][60144] Updated weights for policy 1, policy_version 85342 (0.0008) +[2023-10-09 07:39:02,808][60143] Updated weights for policy 0, policy_version 84372 (0.0008) +[2023-10-09 07:39:03,186][60143] Updated weights for policy 0, policy_version 84382 (0.0008) +[2023-10-09 07:39:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 173801472. Throughput: 0: 1680.9, 1: 1713.7. Samples: 43457112. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:06,052][59242] Avg episode reward: [(0, '37.680'), (1, '33.490')] +[2023-10-09 07:39:06,455][60144] Updated weights for policy 1, policy_version 85352 (0.0007) +[2023-10-09 07:39:06,823][60144] Updated weights for policy 1, policy_version 85362 (0.0008) +[2023-10-09 07:39:07,180][60143] Updated weights for policy 0, policy_version 84392 (0.0009) +[2023-10-09 07:39:07,191][60144] Updated weights for policy 1, policy_version 85372 (0.0009) +[2023-10-09 07:39:07,554][60143] Updated weights for policy 0, policy_version 84402 (0.0009) +[2023-10-09 07:39:07,920][60143] Updated weights for policy 0, policy_version 84412 (0.0008) +[2023-10-09 07:39:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 173867008. Throughput: 0: 1695.8, 1: 1742.1. Samples: 43478288. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:11,052][59242] Avg episode reward: [(0, '38.640'), (1, '34.010')] +[2023-10-09 07:39:11,097][60144] Updated weights for policy 1, policy_version 85382 (0.0009) +[2023-10-09 07:39:11,467][60144] Updated weights for policy 1, policy_version 85392 (0.0011) +[2023-10-09 07:39:11,826][60144] Updated weights for policy 1, policy_version 85402 (0.0010) +[2023-10-09 07:39:12,210][60143] Updated weights for policy 0, policy_version 84422 (0.0008) +[2023-10-09 07:39:12,587][60143] Updated weights for policy 0, policy_version 84432 (0.0007) +[2023-10-09 07:39:12,959][60143] Updated weights for policy 0, policy_version 84442 (0.0009) +[2023-10-09 07:39:15,815][60144] Updated weights for policy 1, policy_version 85412 (0.0009) +[2023-10-09 07:39:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 173932544. Throughput: 0: 1699.1, 1: 1744.5. Samples: 43499318. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:16,053][59242] Avg episode reward: [(0, '38.060'), (1, '35.720')] +[2023-10-09 07:39:16,187][60144] Updated weights for policy 1, policy_version 85422 (0.0007) +[2023-10-09 07:39:16,548][60144] Updated weights for policy 1, policy_version 85432 (0.0007) +[2023-10-09 07:39:16,850][60143] Updated weights for policy 0, policy_version 84452 (0.0008) +[2023-10-09 07:39:17,214][60143] Updated weights for policy 0, policy_version 84462 (0.0009) +[2023-10-09 07:39:17,592][60143] Updated weights for policy 0, policy_version 84472 (0.0007) +[2023-10-09 07:39:20,316][60144] Updated weights for policy 1, policy_version 85442 (0.0007) +[2023-10-09 07:39:20,687][60144] Updated weights for policy 1, policy_version 85452 (0.0008) +[2023-10-09 07:39:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 173998080. Throughput: 0: 1676.2, 1: 1732.9. Samples: 43508760. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:21,052][59242] Avg episode reward: [(0, '36.510'), (1, '34.550')] +[2023-10-09 07:39:21,055][60144] Updated weights for policy 1, policy_version 85462 (0.0008) +[2023-10-09 07:39:21,419][60144] Updated weights for policy 1, policy_version 85472 (0.0007) +[2023-10-09 07:39:21,547][60143] Updated weights for policy 0, policy_version 84482 (0.0009) +[2023-10-09 07:39:21,928][60143] Updated weights for policy 0, policy_version 84492 (0.0010) +[2023-10-09 07:39:22,286][60143] Updated weights for policy 0, policy_version 84502 (0.0008) +[2023-10-09 07:39:22,654][60143] Updated weights for policy 0, policy_version 84512 (0.0007) +[2023-10-09 07:39:25,196][60144] Updated weights for policy 1, policy_version 85482 (0.0008) +[2023-10-09 07:39:25,558][60144] Updated weights for policy 1, policy_version 85492 (0.0009) +[2023-10-09 07:39:25,930][60144] Updated weights for policy 1, policy_version 85502 (0.0008) +[2023-10-09 07:39:26,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 174096384. Throughput: 0: 1700.5, 1: 1756.6. Samples: 43530494. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:26,053][59242] Avg episode reward: [(0, '36.670'), (1, '33.220')] +[2023-10-09 07:39:26,561][60143] Updated weights for policy 0, policy_version 84522 (0.0008) +[2023-10-09 07:39:26,939][60143] Updated weights for policy 0, policy_version 84532 (0.0011) +[2023-10-09 07:39:27,317][60143] Updated weights for policy 0, policy_version 84542 (0.0008) +[2023-10-09 07:39:29,817][60144] Updated weights for policy 1, policy_version 85512 (0.0007) +[2023-10-09 07:39:30,189][60144] Updated weights for policy 1, policy_version 85522 (0.0008) +[2023-10-09 07:39:30,556][60144] Updated weights for policy 1, policy_version 85532 (0.0007) +[2023-10-09 07:39:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 174161920. 
Throughput: 0: 1698.5, 1: 1738.5. Samples: 43550706. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:31,052][59242] Avg episode reward: [(0, '38.740'), (1, '33.220')] +[2023-10-09 07:39:31,058][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000085536_87588864.pth... +[2023-10-09 07:39:31,098][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth +[2023-10-09 07:39:31,242][60143] Updated weights for policy 0, policy_version 84552 (0.0008) +[2023-10-09 07:39:31,606][60143] Updated weights for policy 0, policy_version 84562 (0.0008) +[2023-10-09 07:39:31,968][60143] Updated weights for policy 0, policy_version 84572 (0.0009) +[2023-10-09 07:39:32,114][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000084576_86605824.pth... +[2023-10-09 07:39:32,143][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000082976_84967424.pth +[2023-10-09 07:39:34,540][60144] Updated weights for policy 1, policy_version 85542 (0.0010) +[2023-10-09 07:39:34,918][60144] Updated weights for policy 1, policy_version 85552 (0.0007) +[2023-10-09 07:39:35,288][60144] Updated weights for policy 1, policy_version 85562 (0.0008) +[2023-10-09 07:39:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 174227456. Throughput: 0: 1694.9, 1: 1762.2. Samples: 43561150. 
Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:36,053][59242] Avg episode reward: [(0, '37.470'), (1, '32.980')] +[2023-10-09 07:39:36,085][60143] Updated weights for policy 0, policy_version 84582 (0.0009) +[2023-10-09 07:39:36,466][60143] Updated weights for policy 0, policy_version 84592 (0.0008) +[2023-10-09 07:39:36,840][60143] Updated weights for policy 0, policy_version 84602 (0.0007) +[2023-10-09 07:39:39,082][60144] Updated weights for policy 1, policy_version 85572 (0.0008) +[2023-10-09 07:39:39,446][60144] Updated weights for policy 1, policy_version 85582 (0.0008) +[2023-10-09 07:39:39,812][60144] Updated weights for policy 1, policy_version 85592 (0.0008) +[2023-10-09 07:39:40,755][60143] Updated weights for policy 0, policy_version 84612 (0.0009) +[2023-10-09 07:39:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 174292992. Throughput: 0: 1707.2, 1: 1744.4. Samples: 43581964. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-09 07:39:41,053][59242] Avg episode reward: [(0, '38.270'), (1, '33.590')] +[2023-10-09 07:39:41,128][60143] Updated weights for policy 0, policy_version 84622 (0.0009) +[2023-10-09 07:39:41,490][60143] Updated weights for policy 0, policy_version 84632 (0.0008) +[2023-10-09 07:39:43,814][60144] Updated weights for policy 1, policy_version 85602 (0.0008) +[2023-10-09 07:39:44,223][60144] Updated weights for policy 1, policy_version 85612 (0.0010) +[2023-10-09 07:39:44,584][60144] Updated weights for policy 1, policy_version 85622 (0.0010) +[2023-10-09 07:39:44,947][60144] Updated weights for policy 1, policy_version 85632 (0.0010) +[2023-10-09 07:39:45,573][60143] Updated weights for policy 0, policy_version 84642 (0.0008) +[2023-10-09 07:39:45,940][60143] Updated weights for policy 0, policy_version 84652 (0.0008) +[2023-10-09 07:39:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 174358528. 
Throughput: 0: 1707.5, 1: 1727.9. Samples: 43602308. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:39:46,053][59242] Avg episode reward: [(0, '36.340'), (1, '35.100')] +[2023-10-09 07:39:46,307][60143] Updated weights for policy 0, policy_version 84662 (0.0008) +[2023-10-09 07:39:46,683][60143] Updated weights for policy 0, policy_version 84672 (0.0009) +[2023-10-09 07:39:48,804][60144] Updated weights for policy 1, policy_version 85642 (0.0010) +[2023-10-09 07:39:49,165][60144] Updated weights for policy 1, policy_version 85652 (0.0010) +[2023-10-09 07:39:49,520][60144] Updated weights for policy 1, policy_version 85662 (0.0009) +[2023-10-09 07:39:50,657][60143] Updated weights for policy 0, policy_version 84682 (0.0009) +[2023-10-09 07:39:51,022][60143] Updated weights for policy 0, policy_version 84692 (0.0009) +[2023-10-09 07:39:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 174424064. Throughput: 0: 1705.6, 1: 1754.0. Samples: 43612792. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:39:51,052][59242] Avg episode reward: [(0, '36.410'), (1, '34.190')] +[2023-10-09 07:39:51,385][60143] Updated weights for policy 0, policy_version 84702 (0.0011) +[2023-10-09 07:39:53,497][60144] Updated weights for policy 1, policy_version 85672 (0.0010) +[2023-10-09 07:39:53,867][60144] Updated weights for policy 1, policy_version 85682 (0.0008) +[2023-10-09 07:39:54,241][60144] Updated weights for policy 1, policy_version 85692 (0.0007) +[2023-10-09 07:39:55,534][60143] Updated weights for policy 0, policy_version 84712 (0.0009) +[2023-10-09 07:39:55,898][60143] Updated weights for policy 0, policy_version 84722 (0.0007) +[2023-10-09 07:39:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 174489600. Throughput: 0: 1714.7, 1: 1720.8. Samples: 43632886. 
Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:39:56,052][59242] Avg episode reward: [(0, '36.700'), (1, '35.210')] +[2023-10-09 07:39:56,271][60143] Updated weights for policy 0, policy_version 84732 (0.0009) +[2023-10-09 07:39:58,287][60144] Updated weights for policy 1, policy_version 85702 (0.0009) +[2023-10-09 07:39:58,657][60144] Updated weights for policy 1, policy_version 85712 (0.0009) +[2023-10-09 07:39:59,021][60144] Updated weights for policy 1, policy_version 85722 (0.0009) +[2023-10-09 07:40:00,209][60143] Updated weights for policy 0, policy_version 84742 (0.0010) +[2023-10-09 07:40:00,586][60143] Updated weights for policy 0, policy_version 84752 (0.0009) +[2023-10-09 07:40:00,946][60143] Updated weights for policy 0, policy_version 84762 (0.0009) +[2023-10-09 07:40:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 174555136. Throughput: 0: 1709.4, 1: 1721.9. Samples: 43653724. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:01,053][59242] Avg episode reward: [(0, '36.430'), (1, '34.450')] +[2023-10-09 07:40:02,882][60144] Updated weights for policy 1, policy_version 85732 (0.0009) +[2023-10-09 07:40:03,241][60144] Updated weights for policy 1, policy_version 85742 (0.0007) +[2023-10-09 07:40:03,615][60144] Updated weights for policy 1, policy_version 85752 (0.0007) +[2023-10-09 07:40:04,802][60143] Updated weights for policy 0, policy_version 84772 (0.0009) +[2023-10-09 07:40:05,179][60143] Updated weights for policy 0, policy_version 84782 (0.0008) +[2023-10-09 07:40:05,539][60143] Updated weights for policy 0, policy_version 84792 (0.0010) +[2023-10-09 07:40:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 174653440. Throughput: 0: 1721.2, 1: 1729.2. Samples: 43664028. 
Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:06,053][59242] Avg episode reward: [(0, '36.880'), (1, '34.040')] +[2023-10-09 07:40:07,573][60144] Updated weights for policy 1, policy_version 85762 (0.0008) +[2023-10-09 07:40:07,941][60144] Updated weights for policy 1, policy_version 85772 (0.0008) +[2023-10-09 07:40:08,299][60144] Updated weights for policy 1, policy_version 85782 (0.0008) +[2023-10-09 07:40:08,669][60144] Updated weights for policy 1, policy_version 85792 (0.0008) +[2023-10-09 07:40:09,489][60143] Updated weights for policy 0, policy_version 84802 (0.0009) +[2023-10-09 07:40:09,861][60143] Updated weights for policy 0, policy_version 84812 (0.0009) +[2023-10-09 07:40:10,239][60143] Updated weights for policy 0, policy_version 84822 (0.0008) +[2023-10-09 07:40:10,601][60143] Updated weights for policy 0, policy_version 84832 (0.0010) +[2023-10-09 07:40:11,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 174718976. Throughput: 0: 1714.4, 1: 1714.0. Samples: 43684774. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:11,053][59242] Avg episode reward: [(0, '36.770'), (1, '34.490')] +[2023-10-09 07:40:12,627][60144] Updated weights for policy 1, policy_version 85802 (0.0008) +[2023-10-09 07:40:12,993][60144] Updated weights for policy 1, policy_version 85812 (0.0007) +[2023-10-09 07:40:13,361][60144] Updated weights for policy 1, policy_version 85822 (0.0009) +[2023-10-09 07:40:14,512][60143] Updated weights for policy 0, policy_version 84842 (0.0008) +[2023-10-09 07:40:14,881][60143] Updated weights for policy 0, policy_version 84852 (0.0009) +[2023-10-09 07:40:15,258][60143] Updated weights for policy 0, policy_version 84862 (0.0009) +[2023-10-09 07:40:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 174784512. Throughput: 0: 1686.9, 1: 1742.7. Samples: 43705040. 
Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:16,053][59242] Avg episode reward: [(0, '36.530'), (1, '35.750')] +[2023-10-09 07:40:17,352][60144] Updated weights for policy 1, policy_version 85832 (0.0008) +[2023-10-09 07:40:17,719][60144] Updated weights for policy 1, policy_version 85842 (0.0007) +[2023-10-09 07:40:18,078][60144] Updated weights for policy 1, policy_version 85852 (0.0009) +[2023-10-09 07:40:19,117][60143] Updated weights for policy 0, policy_version 84872 (0.0008) +[2023-10-09 07:40:19,489][60143] Updated weights for policy 0, policy_version 84882 (0.0008) +[2023-10-09 07:40:19,861][60143] Updated weights for policy 0, policy_version 84892 (0.0009) +[2023-10-09 07:40:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 174850048. Throughput: 0: 1718.8, 1: 1712.4. Samples: 43715554. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:21,053][59242] Avg episode reward: [(0, '35.890'), (1, '34.010')] +[2023-10-09 07:40:21,994][60144] Updated weights for policy 1, policy_version 85862 (0.0009) +[2023-10-09 07:40:22,360][60144] Updated weights for policy 1, policy_version 85872 (0.0010) +[2023-10-09 07:40:22,724][60144] Updated weights for policy 1, policy_version 85882 (0.0008) +[2023-10-09 07:40:23,847][60143] Updated weights for policy 0, policy_version 84902 (0.0011) +[2023-10-09 07:40:24,233][60143] Updated weights for policy 0, policy_version 84912 (0.0010) +[2023-10-09 07:40:24,591][60143] Updated weights for policy 0, policy_version 84922 (0.0010) +[2023-10-09 07:40:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 174915584. Throughput: 0: 1693.8, 1: 1728.0. Samples: 43735944. 
Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:26,053][59242] Avg episode reward: [(0, '35.240'), (1, '33.630')] +[2023-10-09 07:40:26,487][60144] Updated weights for policy 1, policy_version 85892 (0.0008) +[2023-10-09 07:40:26,858][60144] Updated weights for policy 1, policy_version 85902 (0.0009) +[2023-10-09 07:40:27,226][60144] Updated weights for policy 1, policy_version 85912 (0.0009) +[2023-10-09 07:40:28,897][60143] Updated weights for policy 0, policy_version 84932 (0.0009) +[2023-10-09 07:40:29,271][60143] Updated weights for policy 0, policy_version 84942 (0.0008) +[2023-10-09 07:40:29,638][60143] Updated weights for policy 0, policy_version 84952 (0.0008) +[2023-10-09 07:40:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 174981120. Throughput: 0: 1684.2, 1: 1748.6. Samples: 43756782. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-09 07:40:31,053][59242] Avg episode reward: [(0, '34.350'), (1, '35.110')] +[2023-10-09 07:40:31,221][60144] Updated weights for policy 1, policy_version 85922 (0.0010) +[2023-10-09 07:40:31,635][60144] Updated weights for policy 1, policy_version 85932 (0.0009) +[2023-10-09 07:40:32,008][60144] Updated weights for policy 1, policy_version 85942 (0.0011) +[2023-10-09 07:40:32,369][60144] Updated weights for policy 1, policy_version 85952 (0.0010) +[2023-10-09 07:40:33,517][60143] Updated weights for policy 0, policy_version 84962 (0.0007) +[2023-10-09 07:40:33,888][60143] Updated weights for policy 0, policy_version 84972 (0.0008) +[2023-10-09 07:40:34,259][60143] Updated weights for policy 0, policy_version 84982 (0.0008) +[2023-10-09 07:40:34,633][60143] Updated weights for policy 0, policy_version 84992 (0.0007) +[2023-10-09 07:40:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 175046656. Throughput: 0: 1715.4, 1: 1716.0. Samples: 43767202. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:40:36,053][59242] Avg episode reward: [(0, '34.210'), (1, '35.290')] +[2023-10-09 07:40:36,221][60144] Updated weights for policy 1, policy_version 85962 (0.0008) +[2023-10-09 07:40:36,589][60144] Updated weights for policy 1, policy_version 85972 (0.0007) +[2023-10-09 07:40:36,946][60144] Updated weights for policy 1, policy_version 85982 (0.0007) +[2023-10-09 07:40:38,515][60143] Updated weights for policy 0, policy_version 85002 (0.0009) +[2023-10-09 07:40:38,871][60143] Updated weights for policy 0, policy_version 85012 (0.0007) +[2023-10-09 07:40:39,244][60143] Updated weights for policy 0, policy_version 85022 (0.0007) +[2023-10-09 07:40:40,932][60144] Updated weights for policy 1, policy_version 85992 (0.0009) +[2023-10-09 07:40:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175112192. Throughput: 0: 1687.8, 1: 1741.4. Samples: 43787198. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:40:41,053][59242] Avg episode reward: [(0, '33.190'), (1, '32.620')] +[2023-10-09 07:40:41,291][60144] Updated weights for policy 1, policy_version 86002 (0.0009) +[2023-10-09 07:40:41,658][60144] Updated weights for policy 1, policy_version 86012 (0.0008) +[2023-10-09 07:40:43,259][60143] Updated weights for policy 0, policy_version 85032 (0.0008) +[2023-10-09 07:40:43,625][60143] Updated weights for policy 0, policy_version 85042 (0.0008) +[2023-10-09 07:40:43,993][60143] Updated weights for policy 0, policy_version 85052 (0.0008) +[2023-10-09 07:40:45,531][60144] Updated weights for policy 1, policy_version 86022 (0.0008) +[2023-10-09 07:40:45,894][60144] Updated weights for policy 1, policy_version 86032 (0.0007) +[2023-10-09 07:40:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175177728. Throughput: 0: 1693.2, 1: 1736.9. Samples: 43808078. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:40:46,053][59242] Avg episode reward: [(0, '32.640'), (1, '31.900')] +[2023-10-09 07:40:46,268][60144] Updated weights for policy 1, policy_version 86042 (0.0010) +[2023-10-09 07:40:47,986][60143] Updated weights for policy 0, policy_version 85062 (0.0009) +[2023-10-09 07:40:48,361][60143] Updated weights for policy 0, policy_version 85072 (0.0008) +[2023-10-09 07:40:48,723][60143] Updated weights for policy 0, policy_version 85082 (0.0007) +[2023-10-09 07:40:50,233][60144] Updated weights for policy 1, policy_version 86052 (0.0008) +[2023-10-09 07:40:50,604][60144] Updated weights for policy 1, policy_version 86062 (0.0007) +[2023-10-09 07:40:50,973][60144] Updated weights for policy 1, policy_version 86072 (0.0010) +[2023-10-09 07:40:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175243264. Throughput: 0: 1694.8, 1: 1733.7. Samples: 43818310. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:40:51,052][59242] Avg episode reward: [(0, '33.450'), (1, '32.790')] +[2023-10-09 07:40:52,675][60143] Updated weights for policy 0, policy_version 85092 (0.0009) +[2023-10-09 07:40:53,045][60143] Updated weights for policy 0, policy_version 85102 (0.0008) +[2023-10-09 07:40:53,407][60143] Updated weights for policy 0, policy_version 85112 (0.0008) +[2023-10-09 07:40:54,922][60144] Updated weights for policy 1, policy_version 86082 (0.0007) +[2023-10-09 07:40:55,290][60144] Updated weights for policy 1, policy_version 86092 (0.0008) +[2023-10-09 07:40:55,660][60144] Updated weights for policy 1, policy_version 86102 (0.0009) +[2023-10-09 07:40:56,025][60144] Updated weights for policy 1, policy_version 86112 (0.0009) +[2023-10-09 07:40:56,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 175341568. Throughput: 0: 1691.0, 1: 1738.9. Samples: 43839118. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:40:56,053][59242] Avg episode reward: [(0, '31.970'), (1, '32.010')] +[2023-10-09 07:40:57,401][60143] Updated weights for policy 0, policy_version 85122 (0.0007) +[2023-10-09 07:40:57,762][60143] Updated weights for policy 0, policy_version 85132 (0.0007) +[2023-10-09 07:40:58,131][60143] Updated weights for policy 0, policy_version 85142 (0.0007) +[2023-10-09 07:40:58,504][60143] Updated weights for policy 0, policy_version 85152 (0.0008) +[2023-10-09 07:40:59,971][60144] Updated weights for policy 1, policy_version 86122 (0.0009) +[2023-10-09 07:41:00,323][60144] Updated weights for policy 1, policy_version 86132 (0.0011) +[2023-10-09 07:41:00,688][60144] Updated weights for policy 1, policy_version 86142 (0.0011) +[2023-10-09 07:41:01,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 175407104. Throughput: 0: 1724.2, 1: 1708.6. Samples: 43859514. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:01,053][59242] Avg episode reward: [(0, '30.440'), (1, '32.380')] +[2023-10-09 07:41:02,450][60143] Updated weights for policy 0, policy_version 85162 (0.0008) +[2023-10-09 07:41:02,824][60143] Updated weights for policy 0, policy_version 85172 (0.0008) +[2023-10-09 07:41:03,194][60143] Updated weights for policy 0, policy_version 85182 (0.0008) +[2023-10-09 07:41:04,584][60144] Updated weights for policy 1, policy_version 86152 (0.0008) +[2023-10-09 07:41:04,956][60144] Updated weights for policy 1, policy_version 86162 (0.0008) +[2023-10-09 07:41:05,322][60144] Updated weights for policy 1, policy_version 86172 (0.0008) +[2023-10-09 07:41:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 175472640. Throughput: 0: 1692.3, 1: 1739.3. Samples: 43869974. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:06,052][59242] Avg episode reward: [(0, '30.920'), (1, '32.320')] +[2023-10-09 07:41:07,232][60143] Updated weights for policy 0, policy_version 85192 (0.0008) +[2023-10-09 07:41:07,606][60143] Updated weights for policy 0, policy_version 85202 (0.0009) +[2023-10-09 07:41:07,977][60143] Updated weights for policy 0, policy_version 85212 (0.0007) +[2023-10-09 07:41:09,384][60144] Updated weights for policy 1, policy_version 86182 (0.0010) +[2023-10-09 07:41:09,744][60144] Updated weights for policy 1, policy_version 86192 (0.0007) +[2023-10-09 07:41:10,116][60144] Updated weights for policy 1, policy_version 86202 (0.0007) +[2023-10-09 07:41:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 175538176. Throughput: 0: 1715.6, 1: 1727.2. Samples: 43890868. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:11,053][59242] Avg episode reward: [(0, '30.450'), (1, '32.430')] +[2023-10-09 07:41:11,908][60143] Updated weights for policy 0, policy_version 85222 (0.0010) +[2023-10-09 07:41:12,278][60143] Updated weights for policy 0, policy_version 85232 (0.0009) +[2023-10-09 07:41:12,656][60143] Updated weights for policy 0, policy_version 85242 (0.0011) +[2023-10-09 07:41:14,152][60144] Updated weights for policy 1, policy_version 86212 (0.0009) +[2023-10-09 07:41:14,528][60144] Updated weights for policy 1, policy_version 86222 (0.0008) +[2023-10-09 07:41:14,895][60144] Updated weights for policy 1, policy_version 86232 (0.0008) +[2023-10-09 07:41:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 175603712. Throughput: 0: 1727.9, 1: 1699.5. Samples: 43911014. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:16,053][59242] Avg episode reward: [(0, '31.330'), (1, '32.120')] +[2023-10-09 07:41:16,643][60143] Updated weights for policy 0, policy_version 85252 (0.0009) +[2023-10-09 07:41:17,009][60143] Updated weights for policy 0, policy_version 85262 (0.0010) +[2023-10-09 07:41:17,378][60143] Updated weights for policy 0, policy_version 85272 (0.0010) +[2023-10-09 07:41:18,690][60144] Updated weights for policy 1, policy_version 86242 (0.0009) +[2023-10-09 07:41:19,122][60144] Updated weights for policy 1, policy_version 86252 (0.0011) +[2023-10-09 07:41:19,489][60144] Updated weights for policy 1, policy_version 86262 (0.0010) +[2023-10-09 07:41:19,849][60144] Updated weights for policy 1, policy_version 86272 (0.0007) +[2023-10-09 07:41:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 175669248. Throughput: 0: 1695.9, 1: 1740.6. Samples: 43921842. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:21,053][59242] Avg episode reward: [(0, '32.480'), (1, '32.190')] +[2023-10-09 07:41:21,394][60143] Updated weights for policy 0, policy_version 85282 (0.0009) +[2023-10-09 07:41:21,760][60143] Updated weights for policy 0, policy_version 85292 (0.0007) +[2023-10-09 07:41:22,132][60143] Updated weights for policy 0, policy_version 85302 (0.0007) +[2023-10-09 07:41:22,501][60143] Updated weights for policy 0, policy_version 85312 (0.0008) +[2023-10-09 07:41:23,680][60144] Updated weights for policy 1, policy_version 86282 (0.0008) +[2023-10-09 07:41:24,045][60144] Updated weights for policy 1, policy_version 86292 (0.0011) +[2023-10-09 07:41:24,409][60144] Updated weights for policy 1, policy_version 86302 (0.0008) +[2023-10-09 07:41:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175734784. Throughput: 0: 1727.6, 1: 1707.7. Samples: 43941786. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-09 07:41:26,053][59242] Avg episode reward: [(0, '34.500'), (1, '31.810')] +[2023-10-09 07:41:26,513][60143] Updated weights for policy 0, policy_version 85322 (0.0007) +[2023-10-09 07:41:26,880][60143] Updated weights for policy 0, policy_version 85332 (0.0010) +[2023-10-09 07:41:27,249][60143] Updated weights for policy 0, policy_version 85342 (0.0011) +[2023-10-09 07:41:28,328][60144] Updated weights for policy 1, policy_version 86312 (0.0008) +[2023-10-09 07:41:28,696][60144] Updated weights for policy 1, policy_version 86322 (0.0010) +[2023-10-09 07:41:29,068][60144] Updated weights for policy 1, policy_version 86332 (0.0007) +[2023-10-09 07:41:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175800320. Throughput: 0: 1730.2, 1: 1715.2. Samples: 43963118. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:31,053][59242] Avg episode reward: [(0, '34.170'), (1, '30.580')] +[2023-10-09 07:41:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000086336_88408064.pth... +[2023-10-09 07:41:31,097][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000084704_86736896.pth +[2023-10-09 07:41:31,396][60143] Updated weights for policy 0, policy_version 85352 (0.0008) +[2023-10-09 07:41:31,771][60143] Updated weights for policy 0, policy_version 85362 (0.0009) +[2023-10-09 07:41:32,150][60143] Updated weights for policy 0, policy_version 85372 (0.0007) +[2023-10-09 07:41:32,295][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000085376_87425024.pth... 
+[2023-10-09 07:41:32,324][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000083776_85786624.pth +[2023-10-09 07:41:33,014][60144] Updated weights for policy 1, policy_version 86342 (0.0008) +[2023-10-09 07:41:33,395][60144] Updated weights for policy 1, policy_version 86352 (0.0009) +[2023-10-09 07:41:33,762][60144] Updated weights for policy 1, policy_version 86362 (0.0008) +[2023-10-09 07:41:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175865856. Throughput: 0: 1715.9, 1: 1720.4. Samples: 43972942. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:36,052][59242] Avg episode reward: [(0, '32.460'), (1, '30.290')] +[2023-10-09 07:41:36,137][60143] Updated weights for policy 0, policy_version 85382 (0.0007) +[2023-10-09 07:41:36,513][60143] Updated weights for policy 0, policy_version 85392 (0.0007) +[2023-10-09 07:41:36,894][60143] Updated weights for policy 0, policy_version 85402 (0.0007) +[2023-10-09 07:41:37,657][60144] Updated weights for policy 1, policy_version 86372 (0.0008) +[2023-10-09 07:41:38,020][60144] Updated weights for policy 1, policy_version 86382 (0.0009) +[2023-10-09 07:41:38,393][60144] Updated weights for policy 1, policy_version 86392 (0.0008) +[2023-10-09 07:41:40,726][60143] Updated weights for policy 0, policy_version 85412 (0.0009) +[2023-10-09 07:41:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175931392. Throughput: 0: 1726.6, 1: 1711.3. Samples: 43993824. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:41,053][59242] Avg episode reward: [(0, '32.630'), (1, '30.700')] +[2023-10-09 07:41:41,093][60143] Updated weights for policy 0, policy_version 85422 (0.0007) +[2023-10-09 07:41:41,460][60143] Updated weights for policy 0, policy_version 85432 (0.0007) +[2023-10-09 07:41:42,462][60144] Updated weights for policy 1, policy_version 86402 (0.0009) +[2023-10-09 07:41:42,823][60144] Updated weights for policy 1, policy_version 86412 (0.0008) +[2023-10-09 07:41:43,193][60144] Updated weights for policy 1, policy_version 86422 (0.0010) +[2023-10-09 07:41:43,560][60144] Updated weights for policy 1, policy_version 86432 (0.0010) +[2023-10-09 07:41:45,454][60143] Updated weights for policy 0, policy_version 85442 (0.0008) +[2023-10-09 07:41:45,824][60143] Updated weights for policy 0, policy_version 85452 (0.0009) +[2023-10-09 07:41:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 175996928. Throughput: 0: 1718.8, 1: 1734.1. Samples: 44014896. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:46,053][59242] Avg episode reward: [(0, '32.770'), (1, '30.640')] +[2023-10-09 07:41:46,189][60143] Updated weights for policy 0, policy_version 85462 (0.0008) +[2023-10-09 07:41:46,561][60143] Updated weights for policy 0, policy_version 85472 (0.0008) +[2023-10-09 07:41:47,610][60144] Updated weights for policy 1, policy_version 86442 (0.0009) +[2023-10-09 07:41:47,979][60144] Updated weights for policy 1, policy_version 86452 (0.0010) +[2023-10-09 07:41:48,343][60144] Updated weights for policy 1, policy_version 86462 (0.0009) +[2023-10-09 07:41:50,406][60143] Updated weights for policy 0, policy_version 85482 (0.0008) +[2023-10-09 07:41:50,765][60143] Updated weights for policy 0, policy_version 85492 (0.0008) +[2023-10-09 07:41:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 176062464. 
Throughput: 0: 1724.4, 1: 1706.0. Samples: 44024344. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:51,052][59242] Avg episode reward: [(0, '33.160'), (1, '31.440')] +[2023-10-09 07:41:51,133][60143] Updated weights for policy 0, policy_version 85502 (0.0010) +[2023-10-09 07:41:52,296][60144] Updated weights for policy 1, policy_version 86472 (0.0008) +[2023-10-09 07:41:52,656][60144] Updated weights for policy 1, policy_version 86482 (0.0007) +[2023-10-09 07:41:53,027][60144] Updated weights for policy 1, policy_version 86492 (0.0009) +[2023-10-09 07:41:55,128][60143] Updated weights for policy 0, policy_version 85512 (0.0008) +[2023-10-09 07:41:55,491][60143] Updated weights for policy 0, policy_version 85522 (0.0008) +[2023-10-09 07:41:55,852][60143] Updated weights for policy 0, policy_version 85532 (0.0009) +[2023-10-09 07:41:56,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 176160768. Throughput: 0: 1720.8, 1: 1720.5. Samples: 44045726. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:41:56,052][59242] Avg episode reward: [(0, '33.830'), (1, '31.790')] +[2023-10-09 07:41:56,957][60144] Updated weights for policy 1, policy_version 86502 (0.0007) +[2023-10-09 07:41:57,319][60144] Updated weights for policy 1, policy_version 86512 (0.0007) +[2023-10-09 07:41:57,670][60144] Updated weights for policy 1, policy_version 86522 (0.0007) +[2023-10-09 07:41:59,805][60143] Updated weights for policy 0, policy_version 85542 (0.0010) +[2023-10-09 07:42:00,177][60143] Updated weights for policy 0, policy_version 85552 (0.0008) +[2023-10-09 07:42:00,550][60143] Updated weights for policy 0, policy_version 85562 (0.0010) +[2023-10-09 07:42:01,052][59242] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176226304. Throughput: 0: 1699.7, 1: 1751.4. Samples: 44066312. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:42:01,053][59242] Avg episode reward: [(0, '33.590'), (1, '33.120')] +[2023-10-09 07:42:01,513][60144] Updated weights for policy 1, policy_version 86532 (0.0007) +[2023-10-09 07:42:01,882][60144] Updated weights for policy 1, policy_version 86542 (0.0007) +[2023-10-09 07:42:02,255][60144] Updated weights for policy 1, policy_version 86552 (0.0008) +[2023-10-09 07:42:04,623][60143] Updated weights for policy 0, policy_version 85572 (0.0009) +[2023-10-09 07:42:04,996][60143] Updated weights for policy 0, policy_version 85582 (0.0008) +[2023-10-09 07:42:05,372][60143] Updated weights for policy 0, policy_version 85592 (0.0010) +[2023-10-09 07:42:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176291840. Throughput: 0: 1724.9, 1: 1713.9. Samples: 44076590. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:42:06,053][59242] Avg episode reward: [(0, '33.460'), (1, '34.060')] +[2023-10-09 07:42:06,207][60144] Updated weights for policy 1, policy_version 86562 (0.0009) +[2023-10-09 07:42:06,566][60144] Updated weights for policy 1, policy_version 86572 (0.0010) +[2023-10-09 07:42:06,928][60144] Updated weights for policy 1, policy_version 86582 (0.0010) +[2023-10-09 07:42:07,292][60144] Updated weights for policy 1, policy_version 86592 (0.0011) +[2023-10-09 07:42:09,280][60143] Updated weights for policy 0, policy_version 85602 (0.0010) +[2023-10-09 07:42:09,649][60143] Updated weights for policy 0, policy_version 85612 (0.0010) +[2023-10-09 07:42:10,014][60143] Updated weights for policy 0, policy_version 85622 (0.0007) +[2023-10-09 07:42:10,383][60143] Updated weights for policy 0, policy_version 85632 (0.0008) +[2023-10-09 07:42:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176357376. Throughput: 0: 1711.8, 1: 1744.7. Samples: 44097328. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:42:11,053][59242] Avg episode reward: [(0, '34.490'), (1, '33.880')] +[2023-10-09 07:42:11,234][60144] Updated weights for policy 1, policy_version 86602 (0.0009) +[2023-10-09 07:42:11,599][60144] Updated weights for policy 1, policy_version 86612 (0.0008) +[2023-10-09 07:42:11,952][60144] Updated weights for policy 1, policy_version 86622 (0.0009) +[2023-10-09 07:42:14,429][60143] Updated weights for policy 0, policy_version 85642 (0.0011) +[2023-10-09 07:42:14,805][60143] Updated weights for policy 0, policy_version 85652 (0.0007) +[2023-10-09 07:42:15,184][60143] Updated weights for policy 0, policy_version 85662 (0.0008) +[2023-10-09 07:42:15,894][60144] Updated weights for policy 1, policy_version 86632 (0.0011) +[2023-10-09 07:42:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176422912. Throughput: 0: 1691.9, 1: 1742.3. Samples: 44117656. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-09 07:42:16,053][59242] Avg episode reward: [(0, '34.780'), (1, '33.890')] +[2023-10-09 07:42:16,259][60144] Updated weights for policy 1, policy_version 86642 (0.0010) +[2023-10-09 07:42:16,622][60144] Updated weights for policy 1, policy_version 86652 (0.0007) +[2023-10-09 07:42:19,182][60143] Updated weights for policy 0, policy_version 85672 (0.0008) +[2023-10-09 07:42:19,552][60143] Updated weights for policy 0, policy_version 85682 (0.0008) +[2023-10-09 07:42:19,921][60143] Updated weights for policy 0, policy_version 85692 (0.0007) +[2023-10-09 07:42:20,482][60144] Updated weights for policy 1, policy_version 86662 (0.0007) +[2023-10-09 07:42:20,850][60144] Updated weights for policy 1, policy_version 86672 (0.0010) +[2023-10-09 07:42:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176488448. Throughput: 0: 1722.0, 1: 1730.6. Samples: 44128308. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:21,053][59242] Avg episode reward: [(0, '34.740'), (1, '33.700')] +[2023-10-09 07:42:21,213][60144] Updated weights for policy 1, policy_version 86682 (0.0009) +[2023-10-09 07:42:23,891][60143] Updated weights for policy 0, policy_version 85702 (0.0008) +[2023-10-09 07:42:24,260][60143] Updated weights for policy 0, policy_version 85712 (0.0008) +[2023-10-09 07:42:24,624][60143] Updated weights for policy 0, policy_version 85722 (0.0009) +[2023-10-09 07:42:25,169][60144] Updated weights for policy 1, policy_version 86692 (0.0008) +[2023-10-09 07:42:25,536][60144] Updated weights for policy 1, policy_version 86702 (0.0009) +[2023-10-09 07:42:25,900][60144] Updated weights for policy 1, policy_version 86712 (0.0008) +[2023-10-09 07:42:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 176553984. Throughput: 0: 1699.3, 1: 1741.7. Samples: 44148672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:26,053][59242] Avg episode reward: [(0, '35.990'), (1, '32.930')] +[2023-10-09 07:42:28,524][60143] Updated weights for policy 0, policy_version 85732 (0.0007) +[2023-10-09 07:42:28,887][60143] Updated weights for policy 0, policy_version 85742 (0.0007) +[2023-10-09 07:42:29,253][60143] Updated weights for policy 0, policy_version 85752 (0.0008) +[2023-10-09 07:42:29,736][60144] Updated weights for policy 1, policy_version 86722 (0.0008) +[2023-10-09 07:42:30,096][60144] Updated weights for policy 1, policy_version 86732 (0.0007) +[2023-10-09 07:42:30,463][60144] Updated weights for policy 1, policy_version 86742 (0.0008) +[2023-10-09 07:42:30,825][60144] Updated weights for policy 1, policy_version 86752 (0.0007) +[2023-10-09 07:42:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 176652288. Throughput: 0: 1692.5, 1: 1726.4. Samples: 44168750. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:31,053][59242] Avg episode reward: [(0, '34.650'), (1, '32.930')] +[2023-10-09 07:42:33,286][60143] Updated weights for policy 0, policy_version 85762 (0.0009) +[2023-10-09 07:42:33,654][60143] Updated weights for policy 0, policy_version 85772 (0.0008) +[2023-10-09 07:42:34,025][60143] Updated weights for policy 0, policy_version 85782 (0.0008) +[2023-10-09 07:42:34,391][60143] Updated weights for policy 0, policy_version 85792 (0.0007) +[2023-10-09 07:42:34,661][60144] Updated weights for policy 1, policy_version 86762 (0.0007) +[2023-10-09 07:42:35,021][60144] Updated weights for policy 1, policy_version 86772 (0.0007) +[2023-10-09 07:42:35,395][60144] Updated weights for policy 1, policy_version 86782 (0.0008) +[2023-10-09 07:42:36,052][59242] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 176717824. Throughput: 0: 1710.5, 1: 1755.2. Samples: 44180302. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:36,053][59242] Avg episode reward: [(0, '34.330'), (1, '33.700')] +[2023-10-09 07:42:38,342][60143] Updated weights for policy 0, policy_version 85802 (0.0010) +[2023-10-09 07:42:38,711][60143] Updated weights for policy 0, policy_version 85812 (0.0009) +[2023-10-09 07:42:39,073][60143] Updated weights for policy 0, policy_version 85822 (0.0010) +[2023-10-09 07:42:39,443][60144] Updated weights for policy 1, policy_version 86792 (0.0008) +[2023-10-09 07:42:39,809][60144] Updated weights for policy 1, policy_version 86802 (0.0007) +[2023-10-09 07:42:40,179][60144] Updated weights for policy 1, policy_version 86812 (0.0007) +[2023-10-09 07:42:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 176783360. Throughput: 0: 1690.4, 1: 1742.7. Samples: 44200214. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:41,053][59242] Avg episode reward: [(0, '36.780'), (1, '35.510')] +[2023-10-09 07:42:42,961][60143] Updated weights for policy 0, policy_version 85832 (0.0007) +[2023-10-09 07:42:43,327][60143] Updated weights for policy 0, policy_version 85842 (0.0008) +[2023-10-09 07:42:43,703][60143] Updated weights for policy 0, policy_version 85852 (0.0007) +[2023-10-09 07:42:44,049][60144] Updated weights for policy 1, policy_version 86822 (0.0008) +[2023-10-09 07:42:44,412][60144] Updated weights for policy 1, policy_version 86832 (0.0010) +[2023-10-09 07:42:44,775][60144] Updated weights for policy 1, policy_version 86842 (0.0009) +[2023-10-09 07:42:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 176848896. Throughput: 0: 1715.8, 1: 1718.7. Samples: 44220862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:46,053][59242] Avg episode reward: [(0, '38.610'), (1, '35.560')] +[2023-10-09 07:42:47,621][60143] Updated weights for policy 0, policy_version 85862 (0.0008) +[2023-10-09 07:42:47,993][60143] Updated weights for policy 0, policy_version 85872 (0.0009) +[2023-10-09 07:42:48,355][60143] Updated weights for policy 0, policy_version 85882 (0.0009) +[2023-10-09 07:42:48,702][60144] Updated weights for policy 1, policy_version 86852 (0.0008) +[2023-10-09 07:42:49,065][60144] Updated weights for policy 1, policy_version 86862 (0.0009) +[2023-10-09 07:42:49,428][60144] Updated weights for policy 1, policy_version 86872 (0.0011) +[2023-10-09 07:42:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 176914432. Throughput: 0: 1696.5, 1: 1750.6. Samples: 44231712. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:51,053][59242] Avg episode reward: [(0, '39.050'), (1, '34.820')] +[2023-10-09 07:42:52,492][60143] Updated weights for policy 0, policy_version 85892 (0.0009) +[2023-10-09 07:42:52,858][60143] Updated weights for policy 0, policy_version 85902 (0.0008) +[2023-10-09 07:42:53,222][60143] Updated weights for policy 0, policy_version 85912 (0.0008) +[2023-10-09 07:42:53,462][60144] Updated weights for policy 1, policy_version 86882 (0.0008) +[2023-10-09 07:42:53,873][60144] Updated weights for policy 1, policy_version 86892 (0.0009) +[2023-10-09 07:42:54,240][60144] Updated weights for policy 1, policy_version 86902 (0.0010) +[2023-10-09 07:42:54,604][60144] Updated weights for policy 1, policy_version 86912 (0.0010) +[2023-10-09 07:42:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 176979968. Throughput: 0: 1698.8, 1: 1723.6. Samples: 44251336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:42:56,053][59242] Avg episode reward: [(0, '38.550'), (1, '35.910')] +[2023-10-09 07:42:56,988][60143] Updated weights for policy 0, policy_version 85922 (0.0007) +[2023-10-09 07:42:57,364][60143] Updated weights for policy 0, policy_version 85932 (0.0008) +[2023-10-09 07:42:57,732][60143] Updated weights for policy 0, policy_version 85942 (0.0008) +[2023-10-09 07:42:58,097][60143] Updated weights for policy 0, policy_version 85952 (0.0008) +[2023-10-09 07:42:58,446][60144] Updated weights for policy 1, policy_version 86922 (0.0009) +[2023-10-09 07:42:58,810][60144] Updated weights for policy 1, policy_version 86932 (0.0010) +[2023-10-09 07:42:59,183][60144] Updated weights for policy 1, policy_version 86942 (0.0009) +[2023-10-09 07:43:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 177045504. Throughput: 0: 1724.7, 1: 1720.6. Samples: 44272694. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:43:01,052][59242] Avg episode reward: [(0, '37.130'), (1, '35.790')] +[2023-10-09 07:43:02,049][60143] Updated weights for policy 0, policy_version 85962 (0.0007) +[2023-10-09 07:43:02,419][60143] Updated weights for policy 0, policy_version 85972 (0.0008) +[2023-10-09 07:43:02,791][60143] Updated weights for policy 0, policy_version 85982 (0.0007) +[2023-10-09 07:43:03,262][60144] Updated weights for policy 1, policy_version 86952 (0.0009) +[2023-10-09 07:43:03,620][60144] Updated weights for policy 1, policy_version 86962 (0.0007) +[2023-10-09 07:43:03,989][60144] Updated weights for policy 1, policy_version 86972 (0.0008) +[2023-10-09 07:43:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 177111040. Throughput: 0: 1695.2, 1: 1733.5. Samples: 44282600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:43:06,053][59242] Avg episode reward: [(0, '36.210'), (1, '36.030')] +[2023-10-09 07:43:06,605][60143] Updated weights for policy 0, policy_version 85992 (0.0009) +[2023-10-09 07:43:06,970][60143] Updated weights for policy 0, policy_version 86002 (0.0008) +[2023-10-09 07:43:07,336][60143] Updated weights for policy 0, policy_version 86012 (0.0009) +[2023-10-09 07:43:07,931][60144] Updated weights for policy 1, policy_version 86982 (0.0007) +[2023-10-09 07:43:08,304][60144] Updated weights for policy 1, policy_version 86992 (0.0010) +[2023-10-09 07:43:08,672][60144] Updated weights for policy 1, policy_version 87002 (0.0009) +[2023-10-09 07:43:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 177176576. Throughput: 0: 1723.8, 1: 1714.1. Samples: 44303378. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:43:11,053][59242] Avg episode reward: [(0, '36.190'), (1, '37.020')] +[2023-10-09 07:43:11,337][60143] Updated weights for policy 0, policy_version 86022 (0.0008) +[2023-10-09 07:43:11,702][60143] Updated weights for policy 0, policy_version 86032 (0.0007) +[2023-10-09 07:43:12,077][60143] Updated weights for policy 0, policy_version 86042 (0.0010) +[2023-10-09 07:43:12,616][60144] Updated weights for policy 1, policy_version 87012 (0.0007) +[2023-10-09 07:43:12,993][60144] Updated weights for policy 1, policy_version 87022 (0.0007) +[2023-10-09 07:43:13,360][60144] Updated weights for policy 1, policy_version 87032 (0.0008) +[2023-10-09 07:43:16,008][60143] Updated weights for policy 0, policy_version 86052 (0.0010) +[2023-10-09 07:43:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 177242112. Throughput: 0: 1734.1, 1: 1732.3. Samples: 44324740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:43:16,053][59242] Avg episode reward: [(0, '37.350'), (1, '36.260')] +[2023-10-09 07:43:16,369][60143] Updated weights for policy 0, policy_version 86062 (0.0010) +[2023-10-09 07:43:16,740][60143] Updated weights for policy 0, policy_version 86072 (0.0009) +[2023-10-09 07:43:17,145][60144] Updated weights for policy 1, policy_version 87042 (0.0009) +[2023-10-09 07:43:17,503][60144] Updated weights for policy 1, policy_version 87052 (0.0009) +[2023-10-09 07:43:17,873][60144] Updated weights for policy 1, policy_version 87062 (0.0008) +[2023-10-09 07:43:18,237][60144] Updated weights for policy 1, policy_version 87072 (0.0008) +[2023-10-09 07:43:20,657][60143] Updated weights for policy 0, policy_version 86082 (0.0010) +[2023-10-09 07:43:21,019][60143] Updated weights for policy 0, policy_version 86092 (0.0010) +[2023-10-09 07:43:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 177307648. 
Throughput: 0: 1711.6, 1: 1711.0. Samples: 44334318. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:21,053][59242] Avg episode reward: [(0, '37.420'), (1, '35.420')] +[2023-10-09 07:43:21,397][60143] Updated weights for policy 0, policy_version 86102 (0.0009) +[2023-10-09 07:43:21,770][60143] Updated weights for policy 0, policy_version 86112 (0.0008) +[2023-10-09 07:43:22,041][60144] Updated weights for policy 1, policy_version 87082 (0.0007) +[2023-10-09 07:43:22,399][60144] Updated weights for policy 1, policy_version 87092 (0.0008) +[2023-10-09 07:43:22,772][60144] Updated weights for policy 1, policy_version 87102 (0.0009) +[2023-10-09 07:43:25,942][60143] Updated weights for policy 0, policy_version 86122 (0.0009) +[2023-10-09 07:43:26,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 177373184. Throughput: 0: 1733.3, 1: 1722.6. Samples: 44355728. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:26,053][59242] Avg episode reward: [(0, '36.230'), (1, '35.660')] +[2023-10-09 07:43:26,315][60143] Updated weights for policy 0, policy_version 86132 (0.0009) +[2023-10-09 07:43:26,678][60143] Updated weights for policy 0, policy_version 86142 (0.0010) +[2023-10-09 07:43:26,687][60144] Updated weights for policy 1, policy_version 87112 (0.0007) +[2023-10-09 07:43:27,055][60144] Updated weights for policy 1, policy_version 87122 (0.0007) +[2023-10-09 07:43:27,423][60144] Updated weights for policy 1, policy_version 87132 (0.0007) +[2023-10-09 07:43:30,654][60143] Updated weights for policy 0, policy_version 86152 (0.0009) +[2023-10-09 07:43:31,023][60143] Updated weights for policy 0, policy_version 86162 (0.0010) +[2023-10-09 07:43:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 177438720. Throughput: 0: 1726.4, 1: 1744.3. Samples: 44377042. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:31,053][59242] Avg episode reward: [(0, '35.440'), (1, '34.680')] +[2023-10-09 07:43:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000087136_89227264.pth... +[2023-10-09 07:43:31,108][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000085536_87588864.pth +[2023-10-09 07:43:31,395][60143] Updated weights for policy 0, policy_version 86172 (0.0009) +[2023-10-09 07:43:31,452][60144] Updated weights for policy 1, policy_version 87142 (0.0009) +[2023-10-09 07:43:31,543][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000086176_88244224.pth... +[2023-10-09 07:43:31,577][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000084576_86605824.pth +[2023-10-09 07:43:31,808][60144] Updated weights for policy 1, policy_version 87152 (0.0007) +[2023-10-09 07:43:32,176][60144] Updated weights for policy 1, policy_version 87162 (0.0008) +[2023-10-09 07:43:35,505][60143] Updated weights for policy 0, policy_version 86182 (0.0008) +[2023-10-09 07:43:35,879][60143] Updated weights for policy 0, policy_version 86192 (0.0011) +[2023-10-09 07:43:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 177504256. Throughput: 0: 1726.7, 1: 1713.4. Samples: 44386516. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:36,053][59242] Avg episode reward: [(0, '35.980'), (1, '31.380')] +[2023-10-09 07:43:36,094][60144] Updated weights for policy 1, policy_version 87172 (0.0009) +[2023-10-09 07:43:36,247][60143] Updated weights for policy 0, policy_version 86202 (0.0009) +[2023-10-09 07:43:36,459][60144] Updated weights for policy 1, policy_version 87182 (0.0008) +[2023-10-09 07:43:36,825][60144] Updated weights for policy 1, policy_version 87192 (0.0007) +[2023-10-09 07:43:40,291][60143] Updated weights for policy 0, policy_version 86212 (0.0008) +[2023-10-09 07:43:40,667][60143] Updated weights for policy 0, policy_version 86222 (0.0009) +[2023-10-09 07:43:40,789][60144] Updated weights for policy 1, policy_version 87202 (0.0008) +[2023-10-09 07:43:41,035][60143] Updated weights for policy 0, policy_version 86232 (0.0008) +[2023-10-09 07:43:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 177569792. Throughput: 0: 1732.5, 1: 1741.1. Samples: 44407646. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:41,053][59242] Avg episode reward: [(0, '36.090'), (1, '31.700')] +[2023-10-09 07:43:41,152][60144] Updated weights for policy 1, policy_version 87212 (0.0007) +[2023-10-09 07:43:41,528][60144] Updated weights for policy 1, policy_version 87222 (0.0009) +[2023-10-09 07:43:41,892][60144] Updated weights for policy 1, policy_version 87232 (0.0009) +[2023-10-09 07:43:45,014][60143] Updated weights for policy 0, policy_version 86242 (0.0008) +[2023-10-09 07:43:45,381][60143] Updated weights for policy 0, policy_version 86252 (0.0008) +[2023-10-09 07:43:45,746][60143] Updated weights for policy 0, policy_version 86262 (0.0009) +[2023-10-09 07:43:46,023][60144] Updated weights for policy 1, policy_version 87242 (0.0007) +[2023-10-09 07:43:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 177635328. 
Throughput: 0: 1713.9, 1: 1740.8. Samples: 44428154. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:46,053][59242] Avg episode reward: [(0, '37.470'), (1, '33.180')] +[2023-10-09 07:43:46,108][60143] Updated weights for policy 0, policy_version 86272 (0.0009) +[2023-10-09 07:43:46,385][60144] Updated weights for policy 1, policy_version 87252 (0.0007) +[2023-10-09 07:43:46,757][60144] Updated weights for policy 1, policy_version 87262 (0.0008) +[2023-10-09 07:43:50,323][60143] Updated weights for policy 0, policy_version 86282 (0.0008) +[2023-10-09 07:43:50,475][60144] Updated weights for policy 1, policy_version 87272 (0.0008) +[2023-10-09 07:43:50,689][60143] Updated weights for policy 0, policy_version 86292 (0.0009) +[2023-10-09 07:43:50,842][60144] Updated weights for policy 1, policy_version 87282 (0.0007) +[2023-10-09 07:43:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 177700864. Throughput: 0: 1725.9, 1: 1727.9. Samples: 44438020. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:51,052][59242] Avg episode reward: [(0, '35.850'), (1, '33.480')] +[2023-10-09 07:43:51,057][60143] Updated weights for policy 0, policy_version 86302 (0.0010) +[2023-10-09 07:43:51,209][60144] Updated weights for policy 1, policy_version 87292 (0.0007) +[2023-10-09 07:43:55,040][60143] Updated weights for policy 0, policy_version 86312 (0.0009) +[2023-10-09 07:43:55,280][60144] Updated weights for policy 1, policy_version 87302 (0.0008) +[2023-10-09 07:43:55,410][60143] Updated weights for policy 0, policy_version 86322 (0.0008) +[2023-10-09 07:43:55,650][60144] Updated weights for policy 1, policy_version 87312 (0.0009) +[2023-10-09 07:43:55,784][60143] Updated weights for policy 0, policy_version 86332 (0.0008) +[2023-10-09 07:43:56,015][60144] Updated weights for policy 1, policy_version 87322 (0.0010) +[2023-10-09 07:43:56,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 177799168. Throughput: 0: 1713.5, 1: 1750.2. Samples: 44459244. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:43:56,053][59242] Avg episode reward: [(0, '35.390'), (1, '33.420')] +[2023-10-09 07:43:59,607][60143] Updated weights for policy 0, policy_version 86342 (0.0008) +[2023-10-09 07:43:59,775][60144] Updated weights for policy 1, policy_version 87332 (0.0009) +[2023-10-09 07:43:59,988][60143] Updated weights for policy 0, policy_version 86352 (0.0007) +[2023-10-09 07:44:00,144][60144] Updated weights for policy 1, policy_version 87342 (0.0008) +[2023-10-09 07:44:00,347][60143] Updated weights for policy 0, policy_version 86362 (0.0008) +[2023-10-09 07:44:00,504][60144] Updated weights for policy 1, policy_version 87352 (0.0008) +[2023-10-09 07:44:01,052][59242] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 177897472. Throughput: 0: 1684.0, 1: 1729.0. Samples: 44478326. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:44:01,052][59242] Avg episode reward: [(0, '35.870'), (1, '33.510')] +[2023-10-09 07:44:04,336][60144] Updated weights for policy 1, policy_version 87362 (0.0011) +[2023-10-09 07:44:04,439][60143] Updated weights for policy 0, policy_version 86372 (0.0009) +[2023-10-09 07:44:04,704][60144] Updated weights for policy 1, policy_version 87372 (0.0009) +[2023-10-09 07:44:04,813][60143] Updated weights for policy 0, policy_version 86382 (0.0008) +[2023-10-09 07:44:05,065][60144] Updated weights for policy 1, policy_version 87382 (0.0009) +[2023-10-09 07:44:05,175][60143] Updated weights for policy 0, policy_version 86392 (0.0008) +[2023-10-09 07:44:05,436][60144] Updated weights for policy 1, policy_version 87392 (0.0007) +[2023-10-09 07:44:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 177963008. Throughput: 0: 1705.2, 1: 1746.4. Samples: 44489636. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-09 07:44:06,053][59242] Avg episode reward: [(0, '36.850'), (1, '30.880')] +[2023-10-09 07:44:09,206][60143] Updated weights for policy 0, policy_version 86402 (0.0009) +[2023-10-09 07:44:09,460][60144] Updated weights for policy 1, policy_version 87402 (0.0008) +[2023-10-09 07:44:09,576][60143] Updated weights for policy 0, policy_version 86412 (0.0007) +[2023-10-09 07:44:09,826][60144] Updated weights for policy 1, policy_version 87412 (0.0008) +[2023-10-09 07:44:09,949][60143] Updated weights for policy 0, policy_version 86422 (0.0007) +[2023-10-09 07:44:10,199][60144] Updated weights for policy 1, policy_version 87422 (0.0007) +[2023-10-09 07:44:10,318][60143] Updated weights for policy 0, policy_version 86432 (0.0009) +[2023-10-09 07:44:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 178028544. Throughput: 0: 1691.3, 1: 1732.1. Samples: 44509780. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:11,053][59242] Avg episode reward: [(0, '35.720'), (1, '31.990')] +[2023-10-09 07:44:14,077][60144] Updated weights for policy 1, policy_version 87432 (0.0008) +[2023-10-09 07:44:14,374][60143] Updated weights for policy 0, policy_version 86442 (0.0008) +[2023-10-09 07:44:14,435][60144] Updated weights for policy 1, policy_version 87442 (0.0009) +[2023-10-09 07:44:14,749][60143] Updated weights for policy 0, policy_version 86452 (0.0007) +[2023-10-09 07:44:14,810][60144] Updated weights for policy 1, policy_version 87452 (0.0007) +[2023-10-09 07:44:15,129][60143] Updated weights for policy 0, policy_version 86462 (0.0008) +[2023-10-09 07:44:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 178094080. Throughput: 0: 1671.5, 1: 1714.2. Samples: 44529398. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:16,053][59242] Avg episode reward: [(0, '32.030'), (1, '33.460')] +[2023-10-09 07:44:18,609][60144] Updated weights for policy 1, policy_version 87462 (0.0007) +[2023-10-09 07:44:18,974][60144] Updated weights for policy 1, policy_version 87472 (0.0008) +[2023-10-09 07:44:19,043][60143] Updated weights for policy 0, policy_version 86472 (0.0008) +[2023-10-09 07:44:19,340][60144] Updated weights for policy 1, policy_version 87482 (0.0010) +[2023-10-09 07:44:19,406][60143] Updated weights for policy 0, policy_version 86482 (0.0007) +[2023-10-09 07:44:19,777][60143] Updated weights for policy 0, policy_version 86492 (0.0008) +[2023-10-09 07:44:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 178159616. Throughput: 0: 1699.3, 1: 1738.3. Samples: 44541206. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:21,053][59242] Avg episode reward: [(0, '32.400'), (1, '32.790')] +[2023-10-09 07:44:23,225][60144] Updated weights for policy 1, policy_version 87492 (0.0008) +[2023-10-09 07:44:23,591][60144] Updated weights for policy 1, policy_version 87502 (0.0008) +[2023-10-09 07:44:23,794][60143] Updated weights for policy 0, policy_version 86502 (0.0008) +[2023-10-09 07:44:23,959][60144] Updated weights for policy 1, policy_version 87512 (0.0008) +[2023-10-09 07:44:24,170][60143] Updated weights for policy 0, policy_version 86512 (0.0007) +[2023-10-09 07:44:24,552][60143] Updated weights for policy 0, policy_version 86522 (0.0010) +[2023-10-09 07:44:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 178225152. Throughput: 0: 1674.1, 1: 1716.5. Samples: 44560226. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:26,053][59242] Avg episode reward: [(0, '32.400'), (1, '31.830')] +[2023-10-09 07:44:28,090][60144] Updated weights for policy 1, policy_version 87522 (0.0008) +[2023-10-09 07:44:28,499][60144] Updated weights for policy 1, policy_version 87532 (0.0007) +[2023-10-09 07:44:28,659][60143] Updated weights for policy 0, policy_version 86532 (0.0009) +[2023-10-09 07:44:28,871][60144] Updated weights for policy 1, policy_version 87542 (0.0008) +[2023-10-09 07:44:29,025][60143] Updated weights for policy 0, policy_version 86542 (0.0009) +[2023-10-09 07:44:29,243][60144] Updated weights for policy 1, policy_version 87552 (0.0007) +[2023-10-09 07:44:29,394][60143] Updated weights for policy 0, policy_version 86552 (0.0009) +[2023-10-09 07:44:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 178290688. Throughput: 0: 1681.6, 1: 1716.8. Samples: 44581080. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:31,053][59242] Avg episode reward: [(0, '33.510'), (1, '31.710')] +[2023-10-09 07:44:33,039][60144] Updated weights for policy 1, policy_version 87562 (0.0009) +[2023-10-09 07:44:33,403][60144] Updated weights for policy 1, policy_version 87572 (0.0007) +[2023-10-09 07:44:33,482][60143] Updated weights for policy 0, policy_version 86562 (0.0008) +[2023-10-09 07:44:33,770][60144] Updated weights for policy 1, policy_version 87582 (0.0008) +[2023-10-09 07:44:33,842][60143] Updated weights for policy 0, policy_version 86572 (0.0008) +[2023-10-09 07:44:34,222][60143] Updated weights for policy 0, policy_version 86582 (0.0008) +[2023-10-09 07:44:34,584][60143] Updated weights for policy 0, policy_version 86592 (0.0007) +[2023-10-09 07:44:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 178356224. Throughput: 0: 1696.2, 1: 1725.0. Samples: 44591974. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:36,053][59242] Avg episode reward: [(0, '35.390'), (1, '32.850')] +[2023-10-09 07:44:37,680][60144] Updated weights for policy 1, policy_version 87592 (0.0007) +[2023-10-09 07:44:38,045][60144] Updated weights for policy 1, policy_version 87602 (0.0010) +[2023-10-09 07:44:38,415][60144] Updated weights for policy 1, policy_version 87612 (0.0009) +[2023-10-09 07:44:38,557][60143] Updated weights for policy 0, policy_version 86602 (0.0010) +[2023-10-09 07:44:38,928][60143] Updated weights for policy 0, policy_version 86612 (0.0009) +[2023-10-09 07:44:39,298][60143] Updated weights for policy 0, policy_version 86622 (0.0009) +[2023-10-09 07:44:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 178421760. Throughput: 0: 1672.1, 1: 1715.4. Samples: 44611682. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:41,053][59242] Avg episode reward: [(0, '33.400'), (1, '32.490')] +[2023-10-09 07:44:42,469][60144] Updated weights for policy 1, policy_version 87622 (0.0008) +[2023-10-09 07:44:42,841][60144] Updated weights for policy 1, policy_version 87632 (0.0009) +[2023-10-09 07:44:43,203][60144] Updated weights for policy 1, policy_version 87642 (0.0007) +[2023-10-09 07:44:43,283][60143] Updated weights for policy 0, policy_version 86632 (0.0007) +[2023-10-09 07:44:43,659][60143] Updated weights for policy 0, policy_version 86642 (0.0007) +[2023-10-09 07:44:44,030][60143] Updated weights for policy 0, policy_version 86652 (0.0008) +[2023-10-09 07:44:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 178487296. Throughput: 0: 1696.1, 1: 1735.5. Samples: 44632748. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:46,053][59242] Avg episode reward: [(0, '33.260'), (1, '33.460')] +[2023-10-09 07:44:47,182][60144] Updated weights for policy 1, policy_version 87652 (0.0007) +[2023-10-09 07:44:47,549][60144] Updated weights for policy 1, policy_version 87662 (0.0010) +[2023-10-09 07:44:47,914][60144] Updated weights for policy 1, policy_version 87672 (0.0008) +[2023-10-09 07:44:48,211][60143] Updated weights for policy 0, policy_version 86662 (0.0010) +[2023-10-09 07:44:48,592][60143] Updated weights for policy 0, policy_version 86672 (0.0008) +[2023-10-09 07:44:48,956][60143] Updated weights for policy 0, policy_version 86682 (0.0008) +[2023-10-09 07:44:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 178552832. Throughput: 0: 1688.8, 1: 1711.8. Samples: 44642664. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:51,053][59242] Avg episode reward: [(0, '32.100'), (1, '33.310')] +[2023-10-09 07:44:51,885][60144] Updated weights for policy 1, policy_version 87682 (0.0009) +[2023-10-09 07:44:52,249][60144] Updated weights for policy 1, policy_version 87692 (0.0010) +[2023-10-09 07:44:52,615][60144] Updated weights for policy 1, policy_version 87702 (0.0009) +[2023-10-09 07:44:52,997][60144] Updated weights for policy 1, policy_version 87712 (0.0008) +[2023-10-09 07:44:52,998][60143] Updated weights for policy 0, policy_version 86692 (0.0010) +[2023-10-09 07:44:53,370][60143] Updated weights for policy 0, policy_version 86702 (0.0007) +[2023-10-09 07:44:53,739][60143] Updated weights for policy 0, policy_version 86712 (0.0009) +[2023-10-09 07:44:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 178618368. Throughput: 0: 1682.1, 1: 1719.4. Samples: 44662848. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:44:56,052][59242] Avg episode reward: [(0, '33.880'), (1, '31.610')] +[2023-10-09 07:44:57,009][60144] Updated weights for policy 1, policy_version 87722 (0.0007) +[2023-10-09 07:44:57,382][60144] Updated weights for policy 1, policy_version 87732 (0.0007) +[2023-10-09 07:44:57,700][60143] Updated weights for policy 0, policy_version 86722 (0.0008) +[2023-10-09 07:44:57,746][60144] Updated weights for policy 1, policy_version 87742 (0.0007) +[2023-10-09 07:44:58,071][60143] Updated weights for policy 0, policy_version 86732 (0.0009) +[2023-10-09 07:44:58,443][60143] Updated weights for policy 0, policy_version 86742 (0.0010) +[2023-10-09 07:44:58,815][60143] Updated weights for policy 0, policy_version 86752 (0.0008) +[2023-10-09 07:45:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 178683904. Throughput: 0: 1703.3, 1: 1734.3. Samples: 44684092. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:45:01,053][59242] Avg episode reward: [(0, '35.080'), (1, '33.380')] +[2023-10-09 07:45:01,695][60144] Updated weights for policy 1, policy_version 87752 (0.0009) +[2023-10-09 07:45:02,062][60144] Updated weights for policy 1, policy_version 87762 (0.0010) +[2023-10-09 07:45:02,427][60144] Updated weights for policy 1, policy_version 87772 (0.0008) +[2023-10-09 07:45:02,832][60143] Updated weights for policy 0, policy_version 86762 (0.0008) +[2023-10-09 07:45:03,201][60143] Updated weights for policy 0, policy_version 86772 (0.0008) +[2023-10-09 07:45:03,558][60143] Updated weights for policy 0, policy_version 86782 (0.0008) +[2023-10-09 07:45:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 178749440. Throughput: 0: 1680.2, 1: 1711.1. Samples: 44693812. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-09 07:45:06,053][59242] Avg episode reward: [(0, '34.450'), (1, '33.190')] +[2023-10-09 07:45:06,489][60144] Updated weights for policy 1, policy_version 87782 (0.0007) +[2023-10-09 07:45:06,850][60144] Updated weights for policy 1, policy_version 87792 (0.0007) +[2023-10-09 07:45:07,214][60144] Updated weights for policy 1, policy_version 87802 (0.0007) +[2023-10-09 07:45:07,319][60143] Updated weights for policy 0, policy_version 86792 (0.0007) +[2023-10-09 07:45:07,687][60143] Updated weights for policy 0, policy_version 86802 (0.0009) +[2023-10-09 07:45:08,051][60143] Updated weights for policy 0, policy_version 86812 (0.0009) +[2023-10-09 07:45:10,979][60144] Updated weights for policy 1, policy_version 87812 (0.0009) +[2023-10-09 07:45:11,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 178814976. Throughput: 0: 1704.9, 1: 1734.1. Samples: 44714980. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:11,052][59242] Avg episode reward: [(0, '33.650'), (1, '33.450')] +[2023-10-09 07:45:11,344][60144] Updated weights for policy 1, policy_version 87822 (0.0011) +[2023-10-09 07:45:11,710][60144] Updated weights for policy 1, policy_version 87832 (0.0009) +[2023-10-09 07:45:11,909][60143] Updated weights for policy 0, policy_version 86822 (0.0008) +[2023-10-09 07:45:12,272][60143] Updated weights for policy 0, policy_version 86832 (0.0009) +[2023-10-09 07:45:12,646][60143] Updated weights for policy 0, policy_version 86842 (0.0011) +[2023-10-09 07:45:15,877][60144] Updated weights for policy 1, policy_version 87842 (0.0008) +[2023-10-09 07:45:16,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 178880512. Throughput: 0: 1717.6, 1: 1730.7. Samples: 44736254. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:16,053][59242] Avg episode reward: [(0, '33.390'), (1, '33.850')] +[2023-10-09 07:45:16,279][60144] Updated weights for policy 1, policy_version 87852 (0.0009) +[2023-10-09 07:45:16,616][60143] Updated weights for policy 0, policy_version 86852 (0.0009) +[2023-10-09 07:45:16,645][60144] Updated weights for policy 1, policy_version 87862 (0.0008) +[2023-10-09 07:45:16,989][60143] Updated weights for policy 0, policy_version 86862 (0.0007) +[2023-10-09 07:45:17,016][60144] Updated weights for policy 1, policy_version 87872 (0.0008) +[2023-10-09 07:45:17,356][60143] Updated weights for policy 0, policy_version 86872 (0.0008) +[2023-10-09 07:45:20,839][60144] Updated weights for policy 1, policy_version 87882 (0.0009) +[2023-10-09 07:45:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 178946048. Throughput: 0: 1693.8, 1: 1718.9. Samples: 44745544. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:21,053][59242] Avg episode reward: [(0, '33.130'), (1, '33.180')] +[2023-10-09 07:45:21,198][60143] Updated weights for policy 0, policy_version 86882 (0.0008) +[2023-10-09 07:45:21,217][60144] Updated weights for policy 1, policy_version 87892 (0.0009) +[2023-10-09 07:45:21,558][60143] Updated weights for policy 0, policy_version 86892 (0.0009) +[2023-10-09 07:45:21,576][60144] Updated weights for policy 1, policy_version 87902 (0.0009) +[2023-10-09 07:45:21,929][60143] Updated weights for policy 0, policy_version 86902 (0.0010) +[2023-10-09 07:45:22,297][60143] Updated weights for policy 0, policy_version 86912 (0.0009) +[2023-10-09 07:45:25,683][60144] Updated weights for policy 1, policy_version 87912 (0.0010) +[2023-10-09 07:45:26,047][60144] Updated weights for policy 1, policy_version 87922 (0.0010) +[2023-10-09 07:45:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 179011584. Throughput: 0: 1723.6, 1: 1719.6. Samples: 44766626. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:26,053][59242] Avg episode reward: [(0, '35.370'), (1, '34.140')] +[2023-10-09 07:45:26,281][60143] Updated weights for policy 0, policy_version 86922 (0.0008) +[2023-10-09 07:45:26,418][60144] Updated weights for policy 1, policy_version 87932 (0.0007) +[2023-10-09 07:45:26,651][60143] Updated weights for policy 0, policy_version 86932 (0.0008) +[2023-10-09 07:45:27,030][60143] Updated weights for policy 0, policy_version 86942 (0.0007) +[2023-10-09 07:45:30,369][60144] Updated weights for policy 1, policy_version 87942 (0.0007) +[2023-10-09 07:45:30,735][60144] Updated weights for policy 1, policy_version 87952 (0.0009) +[2023-10-09 07:45:31,013][60143] Updated weights for policy 0, policy_version 86952 (0.0009) +[2023-10-09 07:45:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 179077120. 
Throughput: 0: 1727.5, 1: 1707.5. Samples: 44787324. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:31,053][59242] Avg episode reward: [(0, '34.480'), (1, '34.670')] +[2023-10-09 07:45:31,092][60144] Updated weights for policy 1, policy_version 87962 (0.0009) +[2023-10-09 07:45:31,311][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000087968_90079232.pth... +[2023-10-09 07:45:31,349][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000086336_88408064.pth +[2023-10-09 07:45:31,386][60143] Updated weights for policy 0, policy_version 86962 (0.0007) +[2023-10-09 07:45:31,756][60143] Updated weights for policy 0, policy_version 86972 (0.0010) +[2023-10-09 07:45:31,902][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000086976_89063424.pth... +[2023-10-09 07:45:31,931][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000085376_87425024.pth +[2023-10-09 07:45:35,147][60144] Updated weights for policy 1, policy_version 87972 (0.0008) +[2023-10-09 07:45:35,518][60144] Updated weights for policy 1, policy_version 87982 (0.0008) +[2023-10-09 07:45:35,871][60144] Updated weights for policy 1, policy_version 87992 (0.0009) +[2023-10-09 07:45:35,964][60143] Updated weights for policy 0, policy_version 86982 (0.0009) +[2023-10-09 07:45:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 179142656. Throughput: 0: 1713.0, 1: 1719.5. Samples: 44797128. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:36,052][59242] Avg episode reward: [(0, '33.150'), (1, '36.350')] +[2023-10-09 07:45:36,346][60143] Updated weights for policy 0, policy_version 86992 (0.0007) +[2023-10-09 07:45:36,714][60143] Updated weights for policy 0, policy_version 87002 (0.0007) +[2023-10-09 07:45:39,928][60144] Updated weights for policy 1, policy_version 88002 (0.0008) +[2023-10-09 07:45:40,290][60144] Updated weights for policy 1, policy_version 88012 (0.0007) +[2023-10-09 07:45:40,655][60144] Updated weights for policy 1, policy_version 88022 (0.0007) +[2023-10-09 07:45:40,676][60143] Updated weights for policy 0, policy_version 87012 (0.0009) +[2023-10-09 07:45:41,013][60144] Updated weights for policy 1, policy_version 88032 (0.0008) +[2023-10-09 07:45:41,041][60143] Updated weights for policy 0, policy_version 87022 (0.0008) +[2023-10-09 07:45:41,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 179240960. Throughput: 0: 1730.5, 1: 1724.6. Samples: 44818328. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:41,052][59242] Avg episode reward: [(0, '33.330'), (1, '36.210')] +[2023-10-09 07:45:41,413][60143] Updated weights for policy 0, policy_version 87032 (0.0010) +[2023-10-09 07:45:44,989][60144] Updated weights for policy 1, policy_version 88042 (0.0007) +[2023-10-09 07:45:45,352][60144] Updated weights for policy 1, policy_version 88052 (0.0008) +[2023-10-09 07:45:45,424][60143] Updated weights for policy 0, policy_version 87042 (0.0009) +[2023-10-09 07:45:45,717][60144] Updated weights for policy 1, policy_version 88062 (0.0008) +[2023-10-09 07:45:45,782][60143] Updated weights for policy 0, policy_version 87052 (0.0008) +[2023-10-09 07:45:46,052][59242] Fps is (10 sec: 16383.3, 60 sec: 13653.3, 300 sec: 13773.6). Total num frames: 179306496. Throughput: 0: 1733.7, 1: 1701.5. Samples: 44838674. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:46,053][59242] Avg episode reward: [(0, '33.390'), (1, '37.450')] +[2023-10-09 07:45:46,158][60143] Updated weights for policy 0, policy_version 87062 (0.0009) +[2023-10-09 07:45:46,536][60143] Updated weights for policy 0, policy_version 87072 (0.0009) +[2023-10-09 07:45:49,698][60144] Updated weights for policy 1, policy_version 88072 (0.0008) +[2023-10-09 07:45:50,059][60144] Updated weights for policy 1, policy_version 88082 (0.0009) +[2023-10-09 07:45:50,412][60143] Updated weights for policy 0, policy_version 87082 (0.0008) +[2023-10-09 07:45:50,427][60144] Updated weights for policy 1, policy_version 88092 (0.0008) +[2023-10-09 07:45:50,788][60143] Updated weights for policy 0, policy_version 87092 (0.0009) +[2023-10-09 07:45:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 179372032. Throughput: 0: 1727.4, 1: 1723.1. Samples: 44849086. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:51,053][59242] Avg episode reward: [(0, '33.400'), (1, '34.490')] +[2023-10-09 07:45:51,152][60143] Updated weights for policy 0, policy_version 87102 (0.0009) +[2023-10-09 07:45:54,308][60144] Updated weights for policy 1, policy_version 88102 (0.0007) +[2023-10-09 07:45:54,685][60144] Updated weights for policy 1, policy_version 88112 (0.0007) +[2023-10-09 07:45:55,050][60144] Updated weights for policy 1, policy_version 88122 (0.0007) +[2023-10-09 07:45:55,059][60143] Updated weights for policy 0, policy_version 87112 (0.0008) +[2023-10-09 07:45:55,418][60143] Updated weights for policy 0, policy_version 87122 (0.0009) +[2023-10-09 07:45:55,799][60143] Updated weights for policy 0, policy_version 87132 (0.0010) +[2023-10-09 07:45:56,052][59242] Fps is (10 sec: 16384.5, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 179470336. Throughput: 0: 1728.2, 1: 1714.9. Samples: 44869920. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:45:56,053][59242] Avg episode reward: [(0, '33.400'), (1, '33.780')] +[2023-10-09 07:45:58,835][60144] Updated weights for policy 1, policy_version 88132 (0.0008) +[2023-10-09 07:45:59,195][60144] Updated weights for policy 1, policy_version 88142 (0.0007) +[2023-10-09 07:45:59,564][60144] Updated weights for policy 1, policy_version 88152 (0.0009) +[2023-10-09 07:45:59,608][60143] Updated weights for policy 0, policy_version 87142 (0.0008) +[2023-10-09 07:45:59,968][60143] Updated weights for policy 0, policy_version 87152 (0.0009) +[2023-10-09 07:46:00,335][60143] Updated weights for policy 0, policy_version 87162 (0.0009) +[2023-10-09 07:46:01,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179535872. Throughput: 0: 1698.0, 1: 1705.1. Samples: 44889394. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-09 07:46:01,053][59242] Avg episode reward: [(0, '34.990'), (1, '32.420')] +[2023-10-09 07:46:03,539][60144] Updated weights for policy 1, policy_version 88162 (0.0007) +[2023-10-09 07:46:03,948][60144] Updated weights for policy 1, policy_version 88172 (0.0008) +[2023-10-09 07:46:04,317][60144] Updated weights for policy 1, policy_version 88182 (0.0009) +[2023-10-09 07:46:04,437][60143] Updated weights for policy 0, policy_version 87172 (0.0008) +[2023-10-09 07:46:04,674][60144] Updated weights for policy 1, policy_version 88192 (0.0008) +[2023-10-09 07:46:04,805][60143] Updated weights for policy 0, policy_version 87182 (0.0007) +[2023-10-09 07:46:05,162][60143] Updated weights for policy 0, policy_version 87192 (0.0010) +[2023-10-09 07:46:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179601408. Throughput: 0: 1718.5, 1: 1734.8. Samples: 44900942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:06,052][59242] Avg episode reward: [(0, '34.230'), (1, '31.370')] +[2023-10-09 07:46:08,611][60144] Updated weights for policy 1, policy_version 88202 (0.0007) +[2023-10-09 07:46:08,976][60144] Updated weights for policy 1, policy_version 88212 (0.0007) +[2023-10-09 07:46:09,291][60143] Updated weights for policy 0, policy_version 87202 (0.0009) +[2023-10-09 07:46:09,347][60144] Updated weights for policy 1, policy_version 88222 (0.0007) +[2023-10-09 07:46:09,662][60143] Updated weights for policy 0, policy_version 87212 (0.0007) +[2023-10-09 07:46:10,027][60143] Updated weights for policy 0, policy_version 87222 (0.0009) +[2023-10-09 07:46:10,396][60143] Updated weights for policy 0, policy_version 87232 (0.0009) +[2023-10-09 07:46:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 179666944. Throughput: 0: 1711.8, 1: 1709.9. Samples: 44920602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:11,053][59242] Avg episode reward: [(0, '34.400'), (1, '31.400')] +[2023-10-09 07:46:13,275][60144] Updated weights for policy 1, policy_version 88232 (0.0007) +[2023-10-09 07:46:13,644][60144] Updated weights for policy 1, policy_version 88242 (0.0009) +[2023-10-09 07:46:14,004][60144] Updated weights for policy 1, policy_version 88252 (0.0010) +[2023-10-09 07:46:14,248][60143] Updated weights for policy 0, policy_version 87242 (0.0008) +[2023-10-09 07:46:14,618][60143] Updated weights for policy 0, policy_version 87252 (0.0008) +[2023-10-09 07:46:14,978][60143] Updated weights for policy 0, policy_version 87262 (0.0008) +[2023-10-09 07:46:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179732480. Throughput: 0: 1687.2, 1: 1723.7. Samples: 44940814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:16,053][59242] Avg episode reward: [(0, '35.260'), (1, '31.300')] +[2023-10-09 07:46:17,811][60144] Updated weights for policy 1, policy_version 88262 (0.0010) +[2023-10-09 07:46:18,177][60144] Updated weights for policy 1, policy_version 88272 (0.0008) +[2023-10-09 07:46:18,550][60144] Updated weights for policy 1, policy_version 88282 (0.0011) +[2023-10-09 07:46:19,142][60143] Updated weights for policy 0, policy_version 87272 (0.0011) +[2023-10-09 07:46:19,509][60143] Updated weights for policy 0, policy_version 87282 (0.0010) +[2023-10-09 07:46:19,882][60143] Updated weights for policy 0, policy_version 87292 (0.0009) +[2023-10-09 07:46:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179798016. Throughput: 0: 1714.8, 1: 1721.3. Samples: 44951752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:21,052][59242] Avg episode reward: [(0, '34.950'), (1, '31.660')] +[2023-10-09 07:46:22,390][60144] Updated weights for policy 1, policy_version 88292 (0.0009) +[2023-10-09 07:46:22,753][60144] Updated weights for policy 1, policy_version 88302 (0.0009) +[2023-10-09 07:46:23,127][60144] Updated weights for policy 1, policy_version 88312 (0.0009) +[2023-10-09 07:46:23,839][60143] Updated weights for policy 0, policy_version 87302 (0.0009) +[2023-10-09 07:46:24,213][60143] Updated weights for policy 0, policy_version 87312 (0.0008) +[2023-10-09 07:46:24,582][60143] Updated weights for policy 0, policy_version 87322 (0.0008) +[2023-10-09 07:46:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179863552. Throughput: 0: 1692.3, 1: 1717.4. Samples: 44971762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:26,053][59242] Avg episode reward: [(0, '35.330'), (1, '31.090')] +[2023-10-09 07:46:27,114][60144] Updated weights for policy 1, policy_version 88322 (0.0009) +[2023-10-09 07:46:27,481][60144] Updated weights for policy 1, policy_version 88332 (0.0007) +[2023-10-09 07:46:27,854][60144] Updated weights for policy 1, policy_version 88342 (0.0007) +[2023-10-09 07:46:28,218][60144] Updated weights for policy 1, policy_version 88352 (0.0010) +[2023-10-09 07:46:28,688][60143] Updated weights for policy 0, policy_version 87332 (0.0009) +[2023-10-09 07:46:29,055][60143] Updated weights for policy 0, policy_version 87342 (0.0010) +[2023-10-09 07:46:29,417][60143] Updated weights for policy 0, policy_version 87352 (0.0009) +[2023-10-09 07:46:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 179929088. Throughput: 0: 1678.9, 1: 1738.5. Samples: 44992460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:31,053][59242] Avg episode reward: [(0, '35.540'), (1, '30.510')] +[2023-10-09 07:46:32,012][60144] Updated weights for policy 1, policy_version 88362 (0.0010) +[2023-10-09 07:46:32,374][60144] Updated weights for policy 1, policy_version 88372 (0.0008) +[2023-10-09 07:46:32,748][60144] Updated weights for policy 1, policy_version 88382 (0.0008) +[2023-10-09 07:46:33,579][60143] Updated weights for policy 0, policy_version 87362 (0.0007) +[2023-10-09 07:46:33,944][60143] Updated weights for policy 0, policy_version 87372 (0.0010) +[2023-10-09 07:46:34,314][60143] Updated weights for policy 0, policy_version 87382 (0.0009) +[2023-10-09 07:46:34,691][60143] Updated weights for policy 0, policy_version 87392 (0.0011) +[2023-10-09 07:46:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 179994624. Throughput: 0: 1704.7, 1: 1716.5. Samples: 45003038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:36,053][59242] Avg episode reward: [(0, '34.030'), (1, '30.970')] +[2023-10-09 07:46:36,701][60144] Updated weights for policy 1, policy_version 88392 (0.0008) +[2023-10-09 07:46:37,071][60144] Updated weights for policy 1, policy_version 88402 (0.0008) +[2023-10-09 07:46:37,446][60144] Updated weights for policy 1, policy_version 88412 (0.0009) +[2023-10-09 07:46:38,632][60143] Updated weights for policy 0, policy_version 87402 (0.0010) +[2023-10-09 07:46:39,006][60143] Updated weights for policy 0, policy_version 87412 (0.0010) +[2023-10-09 07:46:39,379][60143] Updated weights for policy 0, policy_version 87422 (0.0008) +[2023-10-09 07:46:41,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 180060160. Throughput: 0: 1677.0, 1: 1725.1. Samples: 45023016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:41,053][59242] Avg episode reward: [(0, '33.970'), (1, '33.030')] +[2023-10-09 07:46:41,488][60144] Updated weights for policy 1, policy_version 88422 (0.0008) +[2023-10-09 07:46:41,855][60144] Updated weights for policy 1, policy_version 88432 (0.0008) +[2023-10-09 07:46:42,229][60144] Updated weights for policy 1, policy_version 88442 (0.0007) +[2023-10-09 07:46:43,663][60143] Updated weights for policy 0, policy_version 87432 (0.0009) +[2023-10-09 07:46:44,024][60143] Updated weights for policy 0, policy_version 87442 (0.0009) +[2023-10-09 07:46:44,397][60143] Updated weights for policy 0, policy_version 87452 (0.0009) +[2023-10-09 07:46:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 180125696. Throughput: 0: 1698.2, 1: 1740.7. Samples: 45044142. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:46,053][59242] Avg episode reward: [(0, '34.180'), (1, '32.550')] +[2023-10-09 07:46:46,157][60144] Updated weights for policy 1, policy_version 88452 (0.0008) +[2023-10-09 07:46:46,536][60144] Updated weights for policy 1, policy_version 88462 (0.0009) +[2023-10-09 07:46:46,901][60144] Updated weights for policy 1, policy_version 88472 (0.0008) +[2023-10-09 07:46:48,222][60143] Updated weights for policy 0, policy_version 87462 (0.0009) +[2023-10-09 07:46:48,594][60143] Updated weights for policy 0, policy_version 87472 (0.0008) +[2023-10-09 07:46:48,967][60143] Updated weights for policy 0, policy_version 87482 (0.0008) +[2023-10-09 07:46:50,841][60144] Updated weights for policy 1, policy_version 88482 (0.0008) +[2023-10-09 07:46:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 180191232. Throughput: 0: 1695.2, 1: 1714.2. Samples: 45054366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:51,053][59242] Avg episode reward: [(0, '34.200'), (1, '32.140')] +[2023-10-09 07:46:51,261][60144] Updated weights for policy 1, policy_version 88492 (0.0011) +[2023-10-09 07:46:51,622][60144] Updated weights for policy 1, policy_version 88502 (0.0008) +[2023-10-09 07:46:51,993][60144] Updated weights for policy 1, policy_version 88512 (0.0008) +[2023-10-09 07:46:52,915][60143] Updated weights for policy 0, policy_version 87492 (0.0008) +[2023-10-09 07:46:53,287][60143] Updated weights for policy 0, policy_version 87502 (0.0010) +[2023-10-09 07:46:53,651][60143] Updated weights for policy 0, policy_version 87512 (0.0007) +[2023-10-09 07:46:55,868][60144] Updated weights for policy 1, policy_version 88522 (0.0010) +[2023-10-09 07:46:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 180256768. Throughput: 0: 1684.3, 1: 1739.6. Samples: 45074678. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:46:56,053][59242] Avg episode reward: [(0, '38.090'), (1, '33.910')] +[2023-10-09 07:46:56,229][60144] Updated weights for policy 1, policy_version 88532 (0.0008) +[2023-10-09 07:46:56,591][60144] Updated weights for policy 1, policy_version 88542 (0.0008) +[2023-10-09 07:46:57,513][60143] Updated weights for policy 0, policy_version 87522 (0.0009) +[2023-10-09 07:46:57,878][60143] Updated weights for policy 0, policy_version 87532 (0.0008) +[2023-10-09 07:46:58,252][60143] Updated weights for policy 0, policy_version 87542 (0.0009) +[2023-10-09 07:46:58,619][60143] Updated weights for policy 0, policy_version 87552 (0.0008) +[2023-10-09 07:47:00,497][60144] Updated weights for policy 1, policy_version 88552 (0.0009) +[2023-10-09 07:47:00,854][60144] Updated weights for policy 1, policy_version 88562 (0.0011) +[2023-10-09 07:47:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 180322304. Throughput: 0: 1710.9, 1: 1726.8. Samples: 45095512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:01,053][59242] Avg episode reward: [(0, '38.070'), (1, '32.610')] +[2023-10-09 07:47:01,229][60144] Updated weights for policy 1, policy_version 88572 (0.0011) +[2023-10-09 07:47:02,624][60143] Updated weights for policy 0, policy_version 87562 (0.0009) +[2023-10-09 07:47:03,009][60143] Updated weights for policy 0, policy_version 87572 (0.0011) +[2023-10-09 07:47:03,369][60143] Updated weights for policy 0, policy_version 87582 (0.0007) +[2023-10-09 07:47:05,221][60144] Updated weights for policy 1, policy_version 88582 (0.0008) +[2023-10-09 07:47:05,587][60144] Updated weights for policy 1, policy_version 88592 (0.0009) +[2023-10-09 07:47:05,964][60144] Updated weights for policy 1, policy_version 88602 (0.0007) +[2023-10-09 07:47:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 180387840. 
Throughput: 0: 1682.7, 1: 1727.9. Samples: 45105230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:06,052][59242] Avg episode reward: [(0, '37.740'), (1, '32.510')] +[2023-10-09 07:47:07,416][60143] Updated weights for policy 0, policy_version 87592 (0.0007) +[2023-10-09 07:47:07,782][60143] Updated weights for policy 0, policy_version 87602 (0.0007) +[2023-10-09 07:47:08,149][60143] Updated weights for policy 0, policy_version 87612 (0.0008) +[2023-10-09 07:47:09,922][60144] Updated weights for policy 1, policy_version 88612 (0.0008) +[2023-10-09 07:47:10,278][60144] Updated weights for policy 1, policy_version 88622 (0.0009) +[2023-10-09 07:47:10,649][60144] Updated weights for policy 1, policy_version 88632 (0.0007) +[2023-10-09 07:47:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 180486144. Throughput: 0: 1706.0, 1: 1729.3. Samples: 45126354. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:11,053][59242] Avg episode reward: [(0, '37.130'), (1, '31.770')] +[2023-10-09 07:47:12,334][60143] Updated weights for policy 0, policy_version 87622 (0.0007) +[2023-10-09 07:47:12,716][60143] Updated weights for policy 0, policy_version 87632 (0.0008) +[2023-10-09 07:47:13,072][60143] Updated weights for policy 0, policy_version 87642 (0.0008) +[2023-10-09 07:47:14,774][60144] Updated weights for policy 1, policy_version 88642 (0.0008) +[2023-10-09 07:47:15,135][60144] Updated weights for policy 1, policy_version 88652 (0.0008) +[2023-10-09 07:47:15,508][60144] Updated weights for policy 1, policy_version 88662 (0.0007) +[2023-10-09 07:47:15,869][60144] Updated weights for policy 1, policy_version 88672 (0.0007) +[2023-10-09 07:47:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 180551680. Throughput: 0: 1714.1, 1: 1709.2. Samples: 45146506. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:16,053][59242] Avg episode reward: [(0, '38.000'), (1, '31.700')] +[2023-10-09 07:47:17,085][60143] Updated weights for policy 0, policy_version 87652 (0.0007) +[2023-10-09 07:47:17,454][60143] Updated weights for policy 0, policy_version 87662 (0.0008) +[2023-10-09 07:47:17,818][60143] Updated weights for policy 0, policy_version 87672 (0.0007) +[2023-10-09 07:47:19,823][60144] Updated weights for policy 1, policy_version 88682 (0.0009) +[2023-10-09 07:47:20,195][60144] Updated weights for policy 1, policy_version 88692 (0.0010) +[2023-10-09 07:47:20,562][60144] Updated weights for policy 1, policy_version 88702 (0.0008) +[2023-10-09 07:47:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 180617216. Throughput: 0: 1686.0, 1: 1729.1. Samples: 45156716. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:21,053][59242] Avg episode reward: [(0, '38.300'), (1, '31.440')] +[2023-10-09 07:47:21,740][60143] Updated weights for policy 0, policy_version 87682 (0.0009) +[2023-10-09 07:47:22,111][60143] Updated weights for policy 0, policy_version 87692 (0.0010) +[2023-10-09 07:47:22,486][60143] Updated weights for policy 0, policy_version 87702 (0.0008) +[2023-10-09 07:47:22,853][60143] Updated weights for policy 0, policy_version 87712 (0.0008) +[2023-10-09 07:47:24,745][60144] Updated weights for policy 1, policy_version 88712 (0.0008) +[2023-10-09 07:47:25,114][60144] Updated weights for policy 1, policy_version 88722 (0.0008) +[2023-10-09 07:47:25,483][60144] Updated weights for policy 1, policy_version 88732 (0.0007) +[2023-10-09 07:47:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 180682752. Throughput: 0: 1711.7, 1: 1724.4. Samples: 45177640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:26,053][59242] Avg episode reward: [(0, '38.240'), (1, '32.050')] +[2023-10-09 07:47:26,647][60143] Updated weights for policy 0, policy_version 87722 (0.0009) +[2023-10-09 07:47:27,008][60143] Updated weights for policy 0, policy_version 87732 (0.0008) +[2023-10-09 07:47:27,387][60143] Updated weights for policy 0, policy_version 87742 (0.0008) +[2023-10-09 07:47:29,379][60144] Updated weights for policy 1, policy_version 88742 (0.0010) +[2023-10-09 07:47:29,756][60144] Updated weights for policy 1, policy_version 88752 (0.0009) +[2023-10-09 07:47:30,126][60144] Updated weights for policy 1, policy_version 88762 (0.0008) +[2023-10-09 07:47:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 180748288. Throughput: 0: 1718.7, 1: 1697.6. Samples: 45197878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:31,053][59242] Avg episode reward: [(0, '38.250'), (1, '31.610')] +[2023-10-09 07:47:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000088768_90898432.pth... +[2023-10-09 07:47:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000087136_89227264.pth +[2023-10-09 07:47:31,107][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000088768_90898432.pth +[2023-10-09 07:47:31,329][60143] Updated weights for policy 0, policy_version 87752 (0.0008) +[2023-10-09 07:47:31,699][60143] Updated weights for policy 0, policy_version 87762 (0.0009) +[2023-10-09 07:47:32,055][60143] Updated weights for policy 0, policy_version 87772 (0.0008) +[2023-10-09 07:47:32,199][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000087776_89882624.pth... 
+[2023-10-09 07:47:32,227][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000086176_88244224.pth +[2023-10-09 07:47:32,231][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000087776_89882624.pth +[2023-10-09 07:47:34,185][60144] Updated weights for policy 1, policy_version 88772 (0.0008) +[2023-10-09 07:47:34,553][60144] Updated weights for policy 1, policy_version 88782 (0.0010) +[2023-10-09 07:47:34,920][60144] Updated weights for policy 1, policy_version 88792 (0.0008) +[2023-10-09 07:47:35,970][60143] Updated weights for policy 0, policy_version 87782 (0.0008) +[2023-10-09 07:47:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 180813824. Throughput: 0: 1697.9, 1: 1726.7. Samples: 45208470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:36,053][59242] Avg episode reward: [(0, '38.010'), (1, '34.620')] +[2023-10-09 07:47:36,339][60143] Updated weights for policy 0, policy_version 87792 (0.0007) +[2023-10-09 07:47:36,706][60143] Updated weights for policy 0, policy_version 87802 (0.0009) +[2023-10-09 07:47:38,887][60144] Updated weights for policy 1, policy_version 88802 (0.0008) +[2023-10-09 07:47:39,308][60144] Updated weights for policy 1, policy_version 88812 (0.0011) +[2023-10-09 07:47:39,674][60144] Updated weights for policy 1, policy_version 88822 (0.0011) +[2023-10-09 07:47:40,041][60144] Updated weights for policy 1, policy_version 88832 (0.0011) +[2023-10-09 07:47:40,676][60143] Updated weights for policy 0, policy_version 87812 (0.0008) +[2023-10-09 07:47:41,036][60143] Updated weights for policy 0, policy_version 87822 (0.0011) +[2023-10-09 07:47:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 180879360. Throughput: 0: 1718.8, 1: 1706.0. Samples: 45228792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:41,052][59242] Avg episode reward: [(0, '37.210'), (1, '35.310')] +[2023-10-09 07:47:41,406][60143] Updated weights for policy 0, policy_version 87832 (0.0009) +[2023-10-09 07:47:43,910][60144] Updated weights for policy 1, policy_version 88842 (0.0008) +[2023-10-09 07:47:44,268][60144] Updated weights for policy 1, policy_version 88852 (0.0010) +[2023-10-09 07:47:44,631][60144] Updated weights for policy 1, policy_version 88862 (0.0009) +[2023-10-09 07:47:45,593][60143] Updated weights for policy 0, policy_version 87842 (0.0009) +[2023-10-09 07:47:45,972][60143] Updated weights for policy 0, policy_version 87852 (0.0007) +[2023-10-09 07:47:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 180944896. Throughput: 0: 1711.1, 1: 1708.0. Samples: 45249370. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:46,053][59242] Avg episode reward: [(0, '36.330'), (1, '36.240')] +[2023-10-09 07:47:46,338][60143] Updated weights for policy 0, policy_version 87862 (0.0007) +[2023-10-09 07:47:46,703][60143] Updated weights for policy 0, policy_version 87872 (0.0007) +[2023-10-09 07:47:48,571][60144] Updated weights for policy 1, policy_version 88872 (0.0007) +[2023-10-09 07:47:48,944][60144] Updated weights for policy 1, policy_version 88882 (0.0009) +[2023-10-09 07:47:49,312][60144] Updated weights for policy 1, policy_version 88892 (0.0007) +[2023-10-09 07:47:50,755][60143] Updated weights for policy 0, policy_version 87882 (0.0008) +[2023-10-09 07:47:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 181010432. Throughput: 0: 1712.1, 1: 1723.3. Samples: 45259822. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:51,053][59242] Avg episode reward: [(0, '35.960'), (1, '36.550')] +[2023-10-09 07:47:51,123][60143] Updated weights for policy 0, policy_version 87892 (0.0007) +[2023-10-09 07:47:51,499][60143] Updated weights for policy 0, policy_version 87902 (0.0008) +[2023-10-09 07:47:53,224][60144] Updated weights for policy 1, policy_version 88902 (0.0008) +[2023-10-09 07:47:53,600][60144] Updated weights for policy 1, policy_version 88912 (0.0009) +[2023-10-09 07:47:53,963][60144] Updated weights for policy 1, policy_version 88922 (0.0010) +[2023-10-09 07:47:55,420][60143] Updated weights for policy 0, policy_version 87912 (0.0008) +[2023-10-09 07:47:55,780][60143] Updated weights for policy 0, policy_version 87922 (0.0009) +[2023-10-09 07:47:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 181075968. Throughput: 0: 1716.4, 1: 1701.1. Samples: 45280144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:47:56,053][59242] Avg episode reward: [(0, '36.510'), (1, '35.480')] +[2023-10-09 07:47:56,155][60143] Updated weights for policy 0, policy_version 87932 (0.0008) +[2023-10-09 07:47:57,749][60144] Updated weights for policy 1, policy_version 88932 (0.0009) +[2023-10-09 07:47:58,122][60144] Updated weights for policy 1, policy_version 88942 (0.0007) +[2023-10-09 07:47:58,489][60144] Updated weights for policy 1, policy_version 88952 (0.0009) +[2023-10-09 07:47:59,999][60143] Updated weights for policy 0, policy_version 87942 (0.0009) +[2023-10-09 07:48:00,374][60143] Updated weights for policy 0, policy_version 87952 (0.0008) +[2023-10-09 07:48:00,741][60143] Updated weights for policy 0, policy_version 87962 (0.0008) +[2023-10-09 07:48:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 181174272. Throughput: 0: 1702.8, 1: 1727.6. Samples: 45300870. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:01,053][59242] Avg episode reward: [(0, '36.040'), (1, '36.580')] +[2023-10-09 07:48:02,447][60144] Updated weights for policy 1, policy_version 88962 (0.0008) +[2023-10-09 07:48:02,821][60144] Updated weights for policy 1, policy_version 88972 (0.0007) +[2023-10-09 07:48:03,185][60144] Updated weights for policy 1, policy_version 88982 (0.0007) +[2023-10-09 07:48:03,546][60144] Updated weights for policy 1, policy_version 88992 (0.0008) +[2023-10-09 07:48:04,894][60143] Updated weights for policy 0, policy_version 87972 (0.0008) +[2023-10-09 07:48:05,257][60143] Updated weights for policy 0, policy_version 87982 (0.0008) +[2023-10-09 07:48:05,622][60143] Updated weights for policy 0, policy_version 87992 (0.0011) +[2023-10-09 07:48:06,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 181239808. Throughput: 0: 1722.0, 1: 1710.0. Samples: 45311160. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:06,053][59242] Avg episode reward: [(0, '35.970'), (1, '35.520')] +[2023-10-09 07:48:07,482][60144] Updated weights for policy 1, policy_version 89002 (0.0007) +[2023-10-09 07:48:07,843][60144] Updated weights for policy 1, policy_version 89012 (0.0007) +[2023-10-09 07:48:08,220][60144] Updated weights for policy 1, policy_version 89022 (0.0008) +[2023-10-09 07:48:09,365][60143] Updated weights for policy 0, policy_version 88002 (0.0009) +[2023-10-09 07:48:09,738][60143] Updated weights for policy 0, policy_version 88012 (0.0007) +[2023-10-09 07:48:10,112][60143] Updated weights for policy 0, policy_version 88022 (0.0007) +[2023-10-09 07:48:10,484][60143] Updated weights for policy 0, policy_version 88032 (0.0008) +[2023-10-09 07:48:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181305344. Throughput: 0: 1723.1, 1: 1712.7. Samples: 45332250. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:11,053][59242] Avg episode reward: [(0, '36.680'), (1, '34.450')] +[2023-10-09 07:48:12,058][60144] Updated weights for policy 1, policy_version 89032 (0.0007) +[2023-10-09 07:48:12,429][60144] Updated weights for policy 1, policy_version 89042 (0.0007) +[2023-10-09 07:48:12,793][60144] Updated weights for policy 1, policy_version 89052 (0.0008) +[2023-10-09 07:48:14,406][60143] Updated weights for policy 0, policy_version 88042 (0.0007) +[2023-10-09 07:48:14,771][60143] Updated weights for policy 0, policy_version 88052 (0.0008) +[2023-10-09 07:48:15,140][60143] Updated weights for policy 0, policy_version 88062 (0.0009) +[2023-10-09 07:48:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181370880. Throughput: 0: 1696.8, 1: 1739.1. Samples: 45352496. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:16,053][59242] Avg episode reward: [(0, '37.070'), (1, '35.040')] +[2023-10-09 07:48:16,729][60144] Updated weights for policy 1, policy_version 89062 (0.0007) +[2023-10-09 07:48:17,094][60144] Updated weights for policy 1, policy_version 89072 (0.0007) +[2023-10-09 07:48:17,469][60144] Updated weights for policy 1, policy_version 89082 (0.0009) +[2023-10-09 07:48:19,195][60143] Updated weights for policy 0, policy_version 88072 (0.0009) +[2023-10-09 07:48:19,556][60143] Updated weights for policy 0, policy_version 88082 (0.0009) +[2023-10-09 07:48:19,933][60143] Updated weights for policy 0, policy_version 88092 (0.0009) +[2023-10-09 07:48:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 181436416. Throughput: 0: 1731.2, 1: 1708.5. Samples: 45363256. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:21,053][59242] Avg episode reward: [(0, '36.180'), (1, '33.980')] +[2023-10-09 07:48:21,300][60144] Updated weights for policy 1, policy_version 89092 (0.0008) +[2023-10-09 07:48:21,664][60144] Updated weights for policy 1, policy_version 89102 (0.0009) +[2023-10-09 07:48:22,033][60144] Updated weights for policy 1, policy_version 89112 (0.0008) +[2023-10-09 07:48:23,897][60143] Updated weights for policy 0, policy_version 88102 (0.0008) +[2023-10-09 07:48:24,271][60143] Updated weights for policy 0, policy_version 88112 (0.0009) +[2023-10-09 07:48:24,642][60143] Updated weights for policy 0, policy_version 88122 (0.0010) +[2023-10-09 07:48:26,037][60144] Updated weights for policy 1, policy_version 89122 (0.0008) +[2023-10-09 07:48:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181501952. Throughput: 0: 1708.2, 1: 1738.3. Samples: 45383884. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:26,053][59242] Avg episode reward: [(0, '35.370'), (1, '34.480')] +[2023-10-09 07:48:26,422][60144] Updated weights for policy 1, policy_version 89132 (0.0008) +[2023-10-09 07:48:26,788][60144] Updated weights for policy 1, policy_version 89142 (0.0007) +[2023-10-09 07:48:27,163][60144] Updated weights for policy 1, policy_version 89152 (0.0007) +[2023-10-09 07:48:28,593][60143] Updated weights for policy 0, policy_version 88132 (0.0008) +[2023-10-09 07:48:28,954][60143] Updated weights for policy 0, policy_version 88142 (0.0010) +[2023-10-09 07:48:29,321][60143] Updated weights for policy 0, policy_version 88152 (0.0011) +[2023-10-09 07:48:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181567488. Throughput: 0: 1707.1, 1: 1740.9. Samples: 45404530. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:31,053][59242] Avg episode reward: [(0, '35.280'), (1, '33.730')] +[2023-10-09 07:48:31,144][60144] Updated weights for policy 1, policy_version 89162 (0.0009) +[2023-10-09 07:48:31,504][60144] Updated weights for policy 1, policy_version 89172 (0.0008) +[2023-10-09 07:48:31,867][60144] Updated weights for policy 1, policy_version 89182 (0.0009) +[2023-10-09 07:48:33,304][60143] Updated weights for policy 0, policy_version 88162 (0.0011) +[2023-10-09 07:48:33,677][60143] Updated weights for policy 0, policy_version 88172 (0.0008) +[2023-10-09 07:48:34,040][60143] Updated weights for policy 0, policy_version 88182 (0.0010) +[2023-10-09 07:48:34,405][60143] Updated weights for policy 0, policy_version 88192 (0.0010) +[2023-10-09 07:48:35,732][60144] Updated weights for policy 1, policy_version 89192 (0.0008) +[2023-10-09 07:48:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181633024. Throughput: 0: 1730.3, 1: 1716.4. Samples: 45414922. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:36,053][59242] Avg episode reward: [(0, '33.240'), (1, '32.920')] +[2023-10-09 07:48:36,107][60144] Updated weights for policy 1, policy_version 89202 (0.0008) +[2023-10-09 07:48:36,477][60144] Updated weights for policy 1, policy_version 89212 (0.0007) +[2023-10-09 07:48:38,145][60143] Updated weights for policy 0, policy_version 88202 (0.0009) +[2023-10-09 07:48:38,508][60143] Updated weights for policy 0, policy_version 88212 (0.0009) +[2023-10-09 07:48:38,883][60143] Updated weights for policy 0, policy_version 88222 (0.0010) +[2023-10-09 07:48:40,374][60144] Updated weights for policy 1, policy_version 89222 (0.0007) +[2023-10-09 07:48:40,729][60144] Updated weights for policy 1, policy_version 89232 (0.0009) +[2023-10-09 07:48:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181698560. 
Throughput: 0: 1709.8, 1: 1743.8. Samples: 45435558. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:41,052][59242] Avg episode reward: [(0, '33.370'), (1, '34.160')] +[2023-10-09 07:48:41,103][60144] Updated weights for policy 1, policy_version 89242 (0.0011) +[2023-10-09 07:48:42,930][60143] Updated weights for policy 0, policy_version 88232 (0.0009) +[2023-10-09 07:48:43,313][60143] Updated weights for policy 0, policy_version 88242 (0.0009) +[2023-10-09 07:48:43,684][60143] Updated weights for policy 0, policy_version 88252 (0.0009) +[2023-10-09 07:48:45,240][60144] Updated weights for policy 1, policy_version 89252 (0.0009) +[2023-10-09 07:48:45,603][60144] Updated weights for policy 1, policy_version 89262 (0.0010) +[2023-10-09 07:48:45,970][60144] Updated weights for policy 1, policy_version 89272 (0.0008) +[2023-10-09 07:48:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 181764096. Throughput: 0: 1724.1, 1: 1726.1. Samples: 45456128. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:46,053][59242] Avg episode reward: [(0, '33.170'), (1, '35.600')] +[2023-10-09 07:48:47,816][60143] Updated weights for policy 0, policy_version 88262 (0.0008) +[2023-10-09 07:48:48,195][60143] Updated weights for policy 0, policy_version 88272 (0.0009) +[2023-10-09 07:48:48,565][60143] Updated weights for policy 0, policy_version 88282 (0.0008) +[2023-10-09 07:48:49,750][60144] Updated weights for policy 1, policy_version 89282 (0.0010) +[2023-10-09 07:48:50,128][60144] Updated weights for policy 1, policy_version 89292 (0.0010) +[2023-10-09 07:48:50,492][60144] Updated weights for policy 1, policy_version 89302 (0.0010) +[2023-10-09 07:48:50,852][60144] Updated weights for policy 1, policy_version 89312 (0.0009) +[2023-10-09 07:48:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 181862400. Throughput: 0: 1708.8, 1: 1741.1. Samples: 45466408. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:51,053][59242] Avg episode reward: [(0, '33.960'), (1, '36.000')] +[2023-10-09 07:48:52,544][60143] Updated weights for policy 0, policy_version 88292 (0.0008) +[2023-10-09 07:48:52,919][60143] Updated weights for policy 0, policy_version 88302 (0.0007) +[2023-10-09 07:48:53,297][60143] Updated weights for policy 0, policy_version 88312 (0.0008) +[2023-10-09 07:48:54,807][60144] Updated weights for policy 1, policy_version 89322 (0.0010) +[2023-10-09 07:48:55,168][60144] Updated weights for policy 1, policy_version 89332 (0.0010) +[2023-10-09 07:48:55,537][60144] Updated weights for policy 1, policy_version 89342 (0.0008) +[2023-10-09 07:48:56,052][59242] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 181927936. Throughput: 0: 1701.3, 1: 1742.9. Samples: 45487240. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-09 07:48:56,053][59242] Avg episode reward: [(0, '34.000'), (1, '35.960')] +[2023-10-09 07:48:57,386][60143] Updated weights for policy 0, policy_version 88322 (0.0008) +[2023-10-09 07:48:57,755][60143] Updated weights for policy 0, policy_version 88332 (0.0007) +[2023-10-09 07:48:58,123][60143] Updated weights for policy 0, policy_version 88342 (0.0009) +[2023-10-09 07:48:58,495][60143] Updated weights for policy 0, policy_version 88352 (0.0008) +[2023-10-09 07:48:59,367][60144] Updated weights for policy 1, policy_version 89352 (0.0007) +[2023-10-09 07:48:59,729][60144] Updated weights for policy 1, policy_version 89362 (0.0010) +[2023-10-09 07:49:00,094][60144] Updated weights for policy 1, policy_version 89372 (0.0008) +[2023-10-09 07:49:01,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 181993472. Throughput: 0: 1726.2, 1: 1719.6. Samples: 45507556. 
Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:01,052][59242] Avg episode reward: [(0, '33.520'), (1, '36.710')] +[2023-10-09 07:49:02,405][60143] Updated weights for policy 0, policy_version 88362 (0.0009) +[2023-10-09 07:49:02,775][60143] Updated weights for policy 0, policy_version 88372 (0.0007) +[2023-10-09 07:49:03,148][60143] Updated weights for policy 0, policy_version 88382 (0.0007) +[2023-10-09 07:49:04,081][60144] Updated weights for policy 1, policy_version 89382 (0.0008) +[2023-10-09 07:49:04,453][60144] Updated weights for policy 1, policy_version 89392 (0.0009) +[2023-10-09 07:49:04,829][60144] Updated weights for policy 1, policy_version 89402 (0.0008) +[2023-10-09 07:49:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 182059008. Throughput: 0: 1692.6, 1: 1749.3. Samples: 45518144. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:06,053][59242] Avg episode reward: [(0, '33.960'), (1, '35.340')] +[2023-10-09 07:49:07,004][60143] Updated weights for policy 0, policy_version 88392 (0.0007) +[2023-10-09 07:49:07,381][60143] Updated weights for policy 0, policy_version 88402 (0.0008) +[2023-10-09 07:49:07,750][60143] Updated weights for policy 0, policy_version 88412 (0.0007) +[2023-10-09 07:49:08,631][60144] Updated weights for policy 1, policy_version 89412 (0.0009) +[2023-10-09 07:49:09,001][60144] Updated weights for policy 1, policy_version 89422 (0.0009) +[2023-10-09 07:49:09,362][60144] Updated weights for policy 1, policy_version 89432 (0.0007) +[2023-10-09 07:49:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 182124544. Throughput: 0: 1712.7, 1: 1719.5. Samples: 45538330. 
Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:11,053][59242] Avg episode reward: [(0, '35.380'), (1, '34.740')] +[2023-10-09 07:49:11,690][60143] Updated weights for policy 0, policy_version 88422 (0.0008) +[2023-10-09 07:49:12,061][60143] Updated weights for policy 0, policy_version 88432 (0.0007) +[2023-10-09 07:49:12,422][60143] Updated weights for policy 0, policy_version 88442 (0.0008) +[2023-10-09 07:49:13,297][60144] Updated weights for policy 1, policy_version 89442 (0.0007) +[2023-10-09 07:49:13,728][60144] Updated weights for policy 1, policy_version 89452 (0.0008) +[2023-10-09 07:49:14,100][60144] Updated weights for policy 1, policy_version 89462 (0.0009) +[2023-10-09 07:49:14,469][60144] Updated weights for policy 1, policy_version 89472 (0.0008) +[2023-10-09 07:49:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 182190080. Throughput: 0: 1717.6, 1: 1723.1. Samples: 45559360. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:16,053][59242] Avg episode reward: [(0, '33.730'), (1, '34.640')] +[2023-10-09 07:49:16,458][60143] Updated weights for policy 0, policy_version 88452 (0.0008) +[2023-10-09 07:49:16,832][60143] Updated weights for policy 0, policy_version 88462 (0.0010) +[2023-10-09 07:49:17,203][60143] Updated weights for policy 0, policy_version 88472 (0.0011) +[2023-10-09 07:49:18,308][60144] Updated weights for policy 1, policy_version 89482 (0.0007) +[2023-10-09 07:49:18,680][60144] Updated weights for policy 1, policy_version 89492 (0.0010) +[2023-10-09 07:49:19,040][60144] Updated weights for policy 1, policy_version 89502 (0.0009) +[2023-10-09 07:49:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 182255616. Throughput: 0: 1691.7, 1: 1743.3. Samples: 45569496. 
Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:21,053][59242] Avg episode reward: [(0, '32.620'), (1, '32.660')] +[2023-10-09 07:49:21,205][60143] Updated weights for policy 0, policy_version 88482 (0.0010) +[2023-10-09 07:49:21,576][60143] Updated weights for policy 0, policy_version 88492 (0.0007) +[2023-10-09 07:49:21,946][60143] Updated weights for policy 0, policy_version 88502 (0.0007) +[2023-10-09 07:49:22,319][60143] Updated weights for policy 0, policy_version 88512 (0.0009) +[2023-10-09 07:49:23,225][60144] Updated weights for policy 1, policy_version 89512 (0.0009) +[2023-10-09 07:49:23,592][60144] Updated weights for policy 1, policy_version 89522 (0.0009) +[2023-10-09 07:49:23,956][60144] Updated weights for policy 1, policy_version 89532 (0.0009) +[2023-10-09 07:49:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 182321152. Throughput: 0: 1707.9, 1: 1717.9. Samples: 45589716. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:26,053][59242] Avg episode reward: [(0, '33.660'), (1, '32.830')] +[2023-10-09 07:49:26,220][60143] Updated weights for policy 0, policy_version 88522 (0.0009) +[2023-10-09 07:49:26,582][60143] Updated weights for policy 0, policy_version 88532 (0.0011) +[2023-10-09 07:49:26,950][60143] Updated weights for policy 0, policy_version 88542 (0.0008) +[2023-10-09 07:49:27,890][60144] Updated weights for policy 1, policy_version 89542 (0.0008) +[2023-10-09 07:49:28,256][60144] Updated weights for policy 1, policy_version 89552 (0.0010) +[2023-10-09 07:49:28,610][60144] Updated weights for policy 1, policy_version 89562 (0.0011) +[2023-10-09 07:49:30,930][60143] Updated weights for policy 0, policy_version 88552 (0.0010) +[2023-10-09 07:49:31,053][59242] Fps is (10 sec: 13106.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 182386688. Throughput: 0: 1711.2, 1: 1733.8. Samples: 45611154. 
Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:31,054][59242] Avg episode reward: [(0, '34.760'), (1, '32.830')] +[2023-10-09 07:49:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000089568_91717632.pth... +[2023-10-09 07:49:31,094][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000087968_90079232.pth +[2023-10-09 07:49:31,306][60143] Updated weights for policy 0, policy_version 88562 (0.0009) +[2023-10-09 07:49:31,673][60143] Updated weights for policy 0, policy_version 88572 (0.0009) +[2023-10-09 07:49:31,817][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000088576_90701824.pth... +[2023-10-09 07:49:31,847][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000086976_89063424.pth +[2023-10-09 07:49:32,512][60144] Updated weights for policy 1, policy_version 89572 (0.0009) +[2023-10-09 07:49:32,883][60144] Updated weights for policy 1, policy_version 89582 (0.0007) +[2023-10-09 07:49:33,247][60144] Updated weights for policy 1, policy_version 89592 (0.0008) +[2023-10-09 07:49:35,736][60143] Updated weights for policy 0, policy_version 88582 (0.0009) +[2023-10-09 07:49:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 182452224. Throughput: 0: 1712.5, 1: 1718.8. Samples: 45620818. 
Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:36,053][59242] Avg episode reward: [(0, '34.040'), (1, '32.480')] +[2023-10-09 07:49:36,108][60143] Updated weights for policy 0, policy_version 88592 (0.0010) +[2023-10-09 07:49:36,475][60143] Updated weights for policy 0, policy_version 88602 (0.0011) +[2023-10-09 07:49:37,010][60144] Updated weights for policy 1, policy_version 89602 (0.0007) +[2023-10-09 07:49:37,381][60144] Updated weights for policy 1, policy_version 89612 (0.0008) +[2023-10-09 07:49:37,748][60144] Updated weights for policy 1, policy_version 89622 (0.0009) +[2023-10-09 07:49:38,119][60144] Updated weights for policy 1, policy_version 89632 (0.0010) +[2023-10-09 07:49:40,361][60143] Updated weights for policy 0, policy_version 88612 (0.0009) +[2023-10-09 07:49:40,740][60143] Updated weights for policy 0, policy_version 88622 (0.0007) +[2023-10-09 07:49:41,052][59242] Fps is (10 sec: 13108.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 182517760. Throughput: 0: 1716.2, 1: 1722.5. Samples: 45641982. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:41,052][59242] Avg episode reward: [(0, '33.070'), (1, '33.160')] +[2023-10-09 07:49:41,109][60143] Updated weights for policy 0, policy_version 88632 (0.0008) +[2023-10-09 07:49:42,033][60144] Updated weights for policy 1, policy_version 89642 (0.0007) +[2023-10-09 07:49:42,403][60144] Updated weights for policy 1, policy_version 89652 (0.0008) +[2023-10-09 07:49:42,763][60144] Updated weights for policy 1, policy_version 89662 (0.0009) +[2023-10-09 07:49:45,061][60143] Updated weights for policy 0, policy_version 88642 (0.0008) +[2023-10-09 07:49:45,432][60143] Updated weights for policy 0, policy_version 88652 (0.0009) +[2023-10-09 07:49:45,797][60143] Updated weights for policy 0, policy_version 88662 (0.0007) +[2023-10-09 07:49:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 182583296. 
Throughput: 0: 1702.7, 1: 1745.5. Samples: 45662724. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:46,053][59242] Avg episode reward: [(0, '32.090'), (1, '34.540')] +[2023-10-09 07:49:46,167][60143] Updated weights for policy 0, policy_version 88672 (0.0008) +[2023-10-09 07:49:46,830][60144] Updated weights for policy 1, policy_version 89672 (0.0008) +[2023-10-09 07:49:47,205][60144] Updated weights for policy 1, policy_version 89682 (0.0007) +[2023-10-09 07:49:47,572][60144] Updated weights for policy 1, policy_version 89692 (0.0008) +[2023-10-09 07:49:50,111][60143] Updated weights for policy 0, policy_version 88682 (0.0007) +[2023-10-09 07:49:50,480][60143] Updated weights for policy 0, policy_version 88692 (0.0010) +[2023-10-09 07:49:50,849][60143] Updated weights for policy 0, policy_version 88702 (0.0011) +[2023-10-09 07:49:51,052][59242] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 182681600. Throughput: 0: 1717.2, 1: 1719.1. Samples: 45672774. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) +[2023-10-09 07:49:51,053][59242] Avg episode reward: [(0, '32.640'), (1, '34.650')] +[2023-10-09 07:49:51,240][60144] Updated weights for policy 1, policy_version 89702 (0.0007) +[2023-10-09 07:49:51,602][60144] Updated weights for policy 1, policy_version 89712 (0.0008) +[2023-10-09 07:49:51,971][60144] Updated weights for policy 1, policy_version 89722 (0.0008) +[2023-10-09 07:49:54,781][60143] Updated weights for policy 0, policy_version 88712 (0.0009) +[2023-10-09 07:49:55,155][60143] Updated weights for policy 0, policy_version 88722 (0.0008) +[2023-10-09 07:49:55,514][60143] Updated weights for policy 0, policy_version 88732 (0.0008) +[2023-10-09 07:49:55,846][60144] Updated weights for policy 1, policy_version 89732 (0.0008) +[2023-10-09 07:49:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 182747136. Throughput: 0: 1718.6, 1: 1746.7. Samples: 45694266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:49:56,053][59242] Avg episode reward: [(0, '33.310'), (1, '33.220')] +[2023-10-09 07:49:56,220][60144] Updated weights for policy 1, policy_version 89742 (0.0007) +[2023-10-09 07:49:56,578][60144] Updated weights for policy 1, policy_version 89752 (0.0007) +[2023-10-09 07:49:59,678][60143] Updated weights for policy 0, policy_version 88742 (0.0010) +[2023-10-09 07:50:00,058][60143] Updated weights for policy 0, policy_version 88752 (0.0010) +[2023-10-09 07:50:00,424][60143] Updated weights for policy 0, policy_version 88762 (0.0009) +[2023-10-09 07:50:00,441][60144] Updated weights for policy 1, policy_version 89762 (0.0007) +[2023-10-09 07:50:00,844][60144] Updated weights for policy 1, policy_version 89772 (0.0008) +[2023-10-09 07:50:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 182812672. Throughput: 0: 1692.8, 1: 1745.1. Samples: 45714064. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:01,053][59242] Avg episode reward: [(0, '32.940'), (1, '32.170')] +[2023-10-09 07:50:01,205][60144] Updated weights for policy 1, policy_version 89782 (0.0008) +[2023-10-09 07:50:01,569][60144] Updated weights for policy 1, policy_version 89792 (0.0010) +[2023-10-09 07:50:04,308][60143] Updated weights for policy 0, policy_version 88772 (0.0008) +[2023-10-09 07:50:04,677][60143] Updated weights for policy 0, policy_version 88782 (0.0007) +[2023-10-09 07:50:05,043][60143] Updated weights for policy 0, policy_version 88792 (0.0008) +[2023-10-09 07:50:05,465][60144] Updated weights for policy 1, policy_version 89802 (0.0008) +[2023-10-09 07:50:05,827][60144] Updated weights for policy 1, policy_version 89812 (0.0011) +[2023-10-09 07:50:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 182878208. Throughput: 0: 1723.1, 1: 1730.0. Samples: 45724886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:06,053][59242] Avg episode reward: [(0, '32.690'), (1, '32.950')] +[2023-10-09 07:50:06,194][60144] Updated weights for policy 1, policy_version 89822 (0.0009) +[2023-10-09 07:50:08,970][60143] Updated weights for policy 0, policy_version 88802 (0.0007) +[2023-10-09 07:50:09,343][60143] Updated weights for policy 0, policy_version 88812 (0.0007) +[2023-10-09 07:50:09,706][60143] Updated weights for policy 0, policy_version 88822 (0.0007) +[2023-10-09 07:50:10,077][60143] Updated weights for policy 0, policy_version 88832 (0.0008) +[2023-10-09 07:50:10,281][60144] Updated weights for policy 1, policy_version 89832 (0.0008) +[2023-10-09 07:50:10,658][60144] Updated weights for policy 1, policy_version 89842 (0.0010) +[2023-10-09 07:50:11,011][60144] Updated weights for policy 1, policy_version 89852 (0.0008) +[2023-10-09 07:50:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 182943744. Throughput: 0: 1713.1, 1: 1751.8. Samples: 45745636. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:11,053][59242] Avg episode reward: [(0, '33.740'), (1, '34.600')] +[2023-10-09 07:50:14,057][60143] Updated weights for policy 0, policy_version 88842 (0.0011) +[2023-10-09 07:50:14,425][60143] Updated weights for policy 0, policy_version 88852 (0.0009) +[2023-10-09 07:50:14,792][60143] Updated weights for policy 0, policy_version 88862 (0.0007) +[2023-10-09 07:50:15,213][60144] Updated weights for policy 1, policy_version 89862 (0.0010) +[2023-10-09 07:50:15,582][60144] Updated weights for policy 1, policy_version 89872 (0.0011) +[2023-10-09 07:50:15,946][60144] Updated weights for policy 1, policy_version 89882 (0.0011) +[2023-10-09 07:50:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 183009280. Throughput: 0: 1700.7, 1: 1735.0. Samples: 45765762. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:16,052][59242] Avg episode reward: [(0, '32.870'), (1, '34.830')] +[2023-10-09 07:50:18,681][60143] Updated weights for policy 0, policy_version 88872 (0.0009) +[2023-10-09 07:50:19,048][60143] Updated weights for policy 0, policy_version 88882 (0.0008) +[2023-10-09 07:50:19,422][60143] Updated weights for policy 0, policy_version 88892 (0.0009) +[2023-10-09 07:50:19,824][60144] Updated weights for policy 1, policy_version 89892 (0.0009) +[2023-10-09 07:50:20,192][60144] Updated weights for policy 1, policy_version 89902 (0.0008) +[2023-10-09 07:50:20,553][60144] Updated weights for policy 1, policy_version 89912 (0.0010) +[2023-10-09 07:50:21,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 183107584. Throughput: 0: 1725.7, 1: 1746.7. Samples: 45777076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:21,052][59242] Avg episode reward: [(0, '32.350'), (1, '34.820')] +[2023-10-09 07:50:23,414][60143] Updated weights for policy 0, policy_version 88902 (0.0010) +[2023-10-09 07:50:23,798][60143] Updated weights for policy 0, policy_version 88912 (0.0011) +[2023-10-09 07:50:24,171][60143] Updated weights for policy 0, policy_version 88922 (0.0010) +[2023-10-09 07:50:24,432][60144] Updated weights for policy 1, policy_version 89922 (0.0011) +[2023-10-09 07:50:24,798][60144] Updated weights for policy 1, policy_version 89932 (0.0008) +[2023-10-09 07:50:25,173][60144] Updated weights for policy 1, policy_version 89942 (0.0007) +[2023-10-09 07:50:25,532][60144] Updated weights for policy 1, policy_version 89952 (0.0007) +[2023-10-09 07:50:26,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 183173120. Throughput: 0: 1699.6, 1: 1741.4. Samples: 45796830. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:26,053][59242] Avg episode reward: [(0, '33.350'), (1, '33.430')] +[2023-10-09 07:50:28,098][60143] Updated weights for policy 0, policy_version 88932 (0.0007) +[2023-10-09 07:50:28,473][60143] Updated weights for policy 0, policy_version 88942 (0.0010) +[2023-10-09 07:50:28,842][60143] Updated weights for policy 0, policy_version 88952 (0.0010) +[2023-10-09 07:50:29,402][60144] Updated weights for policy 1, policy_version 89962 (0.0008) +[2023-10-09 07:50:29,761][60144] Updated weights for policy 1, policy_version 89972 (0.0007) +[2023-10-09 07:50:30,135][60144] Updated weights for policy 1, policy_version 89982 (0.0007) +[2023-10-09 07:50:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.6, 300 sec: 13884.7). Total num frames: 183238656. Throughput: 0: 1708.7, 1: 1721.1. Samples: 45817066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:31,053][59242] Avg episode reward: [(0, '30.430'), (1, '33.340')] +[2023-10-09 07:50:32,961][60143] Updated weights for policy 0, policy_version 88962 (0.0010) +[2023-10-09 07:50:33,331][60143] Updated weights for policy 0, policy_version 88972 (0.0009) +[2023-10-09 07:50:33,703][60143] Updated weights for policy 0, policy_version 88982 (0.0009) +[2023-10-09 07:50:33,827][60144] Updated weights for policy 1, policy_version 89992 (0.0008) +[2023-10-09 07:50:34,073][60143] Updated weights for policy 0, policy_version 88992 (0.0007) +[2023-10-09 07:50:34,199][60144] Updated weights for policy 1, policy_version 90002 (0.0008) +[2023-10-09 07:50:34,561][60144] Updated weights for policy 1, policy_version 90012 (0.0008) +[2023-10-09 07:50:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 183304192. Throughput: 0: 1705.7, 1: 1752.5. Samples: 45828394. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:36,053][59242] Avg episode reward: [(0, '30.700'), (1, '34.560')] +[2023-10-09 07:50:38,063][60143] Updated weights for policy 0, policy_version 89002 (0.0008) +[2023-10-09 07:50:38,417][60144] Updated weights for policy 1, policy_version 90022 (0.0008) +[2023-10-09 07:50:38,432][60143] Updated weights for policy 0, policy_version 89012 (0.0007) +[2023-10-09 07:50:38,779][60144] Updated weights for policy 1, policy_version 90032 (0.0008) +[2023-10-09 07:50:38,796][60143] Updated weights for policy 0, policy_version 89022 (0.0007) +[2023-10-09 07:50:39,141][60144] Updated weights for policy 1, policy_version 90042 (0.0008) +[2023-10-09 07:50:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 183369728. Throughput: 0: 1687.8, 1: 1721.8. Samples: 45847696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:41,053][59242] Avg episode reward: [(0, '31.680'), (1, '34.260')] +[2023-10-09 07:50:42,900][60143] Updated weights for policy 0, policy_version 89032 (0.0008) +[2023-10-09 07:50:43,060][60144] Updated weights for policy 1, policy_version 90052 (0.0009) +[2023-10-09 07:50:43,274][60143] Updated weights for policy 0, policy_version 89042 (0.0008) +[2023-10-09 07:50:43,431][60144] Updated weights for policy 1, policy_version 90062 (0.0008) +[2023-10-09 07:50:43,644][60143] Updated weights for policy 0, policy_version 89052 (0.0010) +[2023-10-09 07:50:43,796][60144] Updated weights for policy 1, policy_version 90072 (0.0008) +[2023-10-09 07:50:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 183435264. Throughput: 0: 1714.3, 1: 1727.4. Samples: 45868942. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:46,053][59242] Avg episode reward: [(0, '31.760'), (1, '34.580')] +[2023-10-09 07:50:47,628][60143] Updated weights for policy 0, policy_version 89062 (0.0009) +[2023-10-09 07:50:47,812][60144] Updated weights for policy 1, policy_version 90082 (0.0007) +[2023-10-09 07:50:47,999][60143] Updated weights for policy 0, policy_version 89072 (0.0009) +[2023-10-09 07:50:48,213][60144] Updated weights for policy 1, policy_version 90092 (0.0008) +[2023-10-09 07:50:48,367][60143] Updated weights for policy 0, policy_version 89082 (0.0010) +[2023-10-09 07:50:48,572][60144] Updated weights for policy 1, policy_version 90102 (0.0007) +[2023-10-09 07:50:48,931][60144] Updated weights for policy 1, policy_version 90112 (0.0010) +[2023-10-09 07:50:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 183500800. Throughput: 0: 1688.7, 1: 1731.4. Samples: 45878790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:50:51,053][59242] Avg episode reward: [(0, '31.890'), (1, '35.200')] +[2023-10-09 07:50:52,444][60143] Updated weights for policy 0, policy_version 89092 (0.0007) +[2023-10-09 07:50:52,810][60143] Updated weights for policy 0, policy_version 89102 (0.0008) +[2023-10-09 07:50:52,847][60144] Updated weights for policy 1, policy_version 90122 (0.0008) +[2023-10-09 07:50:53,181][60143] Updated weights for policy 0, policy_version 89112 (0.0007) +[2023-10-09 07:50:53,212][60144] Updated weights for policy 1, policy_version 90132 (0.0009) +[2023-10-09 07:50:53,585][60144] Updated weights for policy 1, policy_version 90142 (0.0009) +[2023-10-09 07:50:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 183566336. Throughput: 0: 1694.6, 1: 1721.2. Samples: 45899348. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:50:56,052][59242] Avg episode reward: [(0, '32.760'), (1, '34.720')] +[2023-10-09 07:50:57,129][60143] Updated weights for policy 0, policy_version 89122 (0.0007) +[2023-10-09 07:50:57,456][60144] Updated weights for policy 1, policy_version 90152 (0.0008) +[2023-10-09 07:50:57,507][60143] Updated weights for policy 0, policy_version 89132 (0.0008) +[2023-10-09 07:50:57,821][60144] Updated weights for policy 1, policy_version 90162 (0.0008) +[2023-10-09 07:50:57,871][60143] Updated weights for policy 0, policy_version 89142 (0.0008) +[2023-10-09 07:50:58,200][60144] Updated weights for policy 1, policy_version 90172 (0.0008) +[2023-10-09 07:50:58,244][60143] Updated weights for policy 0, policy_version 89152 (0.0008) +[2023-10-09 07:51:01,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 183631872. Throughput: 0: 1707.4, 1: 1732.0. Samples: 45920534. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:01,053][59242] Avg episode reward: [(0, '31.310'), (1, '36.170')] +[2023-10-09 07:51:02,227][60144] Updated weights for policy 1, policy_version 90182 (0.0010) +[2023-10-09 07:51:02,262][60143] Updated weights for policy 0, policy_version 89162 (0.0008) +[2023-10-09 07:51:02,591][60144] Updated weights for policy 1, policy_version 90192 (0.0009) +[2023-10-09 07:51:02,628][60143] Updated weights for policy 0, policy_version 89172 (0.0008) +[2023-10-09 07:51:02,956][60144] Updated weights for policy 1, policy_version 90202 (0.0007) +[2023-10-09 07:51:02,988][60143] Updated weights for policy 0, policy_version 89182 (0.0007) +[2023-10-09 07:51:06,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 183697408. Throughput: 0: 1678.7, 1: 1718.3. Samples: 45929942. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:06,053][59242] Avg episode reward: [(0, '32.550'), (1, '35.510')] +[2023-10-09 07:51:06,935][60144] Updated weights for policy 1, policy_version 90212 (0.0007) +[2023-10-09 07:51:07,023][60143] Updated weights for policy 0, policy_version 89192 (0.0008) +[2023-10-09 07:51:07,301][60144] Updated weights for policy 1, policy_version 90222 (0.0007) +[2023-10-09 07:51:07,395][60143] Updated weights for policy 0, policy_version 89202 (0.0007) +[2023-10-09 07:51:07,660][60144] Updated weights for policy 1, policy_version 90232 (0.0008) +[2023-10-09 07:51:07,767][60143] Updated weights for policy 0, policy_version 89212 (0.0007) +[2023-10-09 07:51:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 183762944. Throughput: 0: 1708.9, 1: 1720.5. Samples: 45951156. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:11,052][59242] Avg episode reward: [(0, '32.520'), (1, '35.690')] +[2023-10-09 07:51:11,539][60144] Updated weights for policy 1, policy_version 90242 (0.0008) +[2023-10-09 07:51:11,822][60143] Updated weights for policy 0, policy_version 89222 (0.0007) +[2023-10-09 07:51:11,904][60144] Updated weights for policy 1, policy_version 90252 (0.0010) +[2023-10-09 07:51:12,201][60143] Updated weights for policy 0, policy_version 89232 (0.0007) +[2023-10-09 07:51:12,281][60144] Updated weights for policy 1, policy_version 90262 (0.0008) +[2023-10-09 07:51:12,568][60143] Updated weights for policy 0, policy_version 89242 (0.0007) +[2023-10-09 07:51:12,648][60144] Updated weights for policy 1, policy_version 90272 (0.0007) +[2023-10-09 07:51:16,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 183828480. Throughput: 0: 1710.8, 1: 1739.8. Samples: 45972342. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:16,052][59242] Avg episode reward: [(0, '33.180'), (1, '38.000')] +[2023-10-09 07:51:16,471][60143] Updated weights for policy 0, policy_version 89252 (0.0008) +[2023-10-09 07:51:16,604][60144] Updated weights for policy 1, policy_version 90282 (0.0010) +[2023-10-09 07:51:16,842][60143] Updated weights for policy 0, policy_version 89262 (0.0007) +[2023-10-09 07:51:16,968][60144] Updated weights for policy 1, policy_version 90292 (0.0008) +[2023-10-09 07:51:17,205][60143] Updated weights for policy 0, policy_version 89272 (0.0008) +[2023-10-09 07:51:17,329][60144] Updated weights for policy 1, policy_version 90302 (0.0008) +[2023-10-09 07:51:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 183894016. Throughput: 0: 1696.9, 1: 1706.1. Samples: 45981526. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:21,053][59242] Avg episode reward: [(0, '33.980'), (1, '37.130')] +[2023-10-09 07:51:21,249][60144] Updated weights for policy 1, policy_version 90312 (0.0009) +[2023-10-09 07:51:21,410][60143] Updated weights for policy 0, policy_version 89282 (0.0007) +[2023-10-09 07:51:21,616][60144] Updated weights for policy 1, policy_version 90322 (0.0008) +[2023-10-09 07:51:21,773][60143] Updated weights for policy 0, policy_version 89292 (0.0008) +[2023-10-09 07:51:21,984][60144] Updated weights for policy 1, policy_version 90332 (0.0007) +[2023-10-09 07:51:22,134][60143] Updated weights for policy 0, policy_version 89302 (0.0009) +[2023-10-09 07:51:22,514][60143] Updated weights for policy 0, policy_version 89312 (0.0010) +[2023-10-09 07:51:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 183959552. Throughput: 0: 1715.4, 1: 1733.6. Samples: 46002898. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:26,052][59242] Avg episode reward: [(0, '34.700'), (1, '37.790')] +[2023-10-09 07:51:26,140][60144] Updated weights for policy 1, policy_version 90342 (0.0008) +[2023-10-09 07:51:26,327][60143] Updated weights for policy 0, policy_version 89322 (0.0007) +[2023-10-09 07:51:26,498][60144] Updated weights for policy 1, policy_version 90352 (0.0007) +[2023-10-09 07:51:26,696][60143] Updated weights for policy 0, policy_version 89332 (0.0009) +[2023-10-09 07:51:26,864][60144] Updated weights for policy 1, policy_version 90362 (0.0007) +[2023-10-09 07:51:27,065][60143] Updated weights for policy 0, policy_version 89342 (0.0008) +[2023-10-09 07:51:30,993][60143] Updated weights for policy 0, policy_version 89352 (0.0009) +[2023-10-09 07:51:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 184025088. Throughput: 0: 1719.2, 1: 1730.9. Samples: 46024196. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:31,053][59242] Avg episode reward: [(0, '34.500'), (1, '38.980')] +[2023-10-09 07:51:31,084][60144] Updated weights for policy 1, policy_version 90372 (0.0009) +[2023-10-09 07:51:31,371][60143] Updated weights for policy 0, policy_version 89362 (0.0008) +[2023-10-09 07:51:31,450][60144] Updated weights for policy 1, policy_version 90382 (0.0009) +[2023-10-09 07:51:31,739][60143] Updated weights for policy 0, policy_version 89372 (0.0008) +[2023-10-09 07:51:31,815][60144] Updated weights for policy 1, policy_version 90392 (0.0007) +[2023-10-09 07:51:31,880][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000089376_91521024.pth... +[2023-10-09 07:51:31,909][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000087776_89882624.pth +[2023-10-09 07:51:32,106][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000090400_92569600.pth... 
+[2023-10-09 07:51:32,135][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000088768_90898432.pth +[2023-10-09 07:51:32,138][60003] Saving new best policy, reward=38.980! +[2023-10-09 07:51:35,799][60144] Updated weights for policy 1, policy_version 90402 (0.0007) +[2023-10-09 07:51:35,858][60143] Updated weights for policy 0, policy_version 89382 (0.0008) +[2023-10-09 07:51:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 184090624. Throughput: 0: 1718.0, 1: 1717.4. Samples: 46033384. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:36,053][59242] Avg episode reward: [(0, '34.010'), (1, '39.550')] +[2023-10-09 07:51:36,184][60144] Updated weights for policy 1, policy_version 90412 (0.0010) +[2023-10-09 07:51:36,227][60143] Updated weights for policy 0, policy_version 89392 (0.0009) +[2023-10-09 07:51:36,548][60144] Updated weights for policy 1, policy_version 90422 (0.0009) +[2023-10-09 07:51:36,597][60143] Updated weights for policy 0, policy_version 89402 (0.0009) +[2023-10-09 07:51:36,910][60003] Saving new best policy, reward=39.550! +[2023-10-09 07:51:36,913][60144] Updated weights for policy 1, policy_version 90432 (0.0008) +[2023-10-09 07:51:40,602][60143] Updated weights for policy 0, policy_version 89412 (0.0009) +[2023-10-09 07:51:40,846][60144] Updated weights for policy 1, policy_version 90442 (0.0010) +[2023-10-09 07:51:40,974][60143] Updated weights for policy 0, policy_version 89422 (0.0008) +[2023-10-09 07:51:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 184156160. Throughput: 0: 1721.3, 1: 1723.3. Samples: 46054356. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:41,053][59242] Avg episode reward: [(0, '34.630'), (1, '37.800')] +[2023-10-09 07:51:41,230][60144] Updated weights for policy 1, policy_version 90452 (0.0008) +[2023-10-09 07:51:41,341][60143] Updated weights for policy 0, policy_version 89432 (0.0008) +[2023-10-09 07:51:41,596][60144] Updated weights for policy 1, policy_version 90462 (0.0008) +[2023-10-09 07:51:45,237][60143] Updated weights for policy 0, policy_version 89442 (0.0009) +[2023-10-09 07:51:45,441][60144] Updated weights for policy 1, policy_version 90472 (0.0008) +[2023-10-09 07:51:45,606][60143] Updated weights for policy 0, policy_version 89452 (0.0009) +[2023-10-09 07:51:45,813][60144] Updated weights for policy 1, policy_version 90482 (0.0007) +[2023-10-09 07:51:45,989][60143] Updated weights for policy 0, policy_version 89462 (0.0008) +[2023-10-09 07:51:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 184221696. Throughput: 0: 1713.2, 1: 1717.3. Samples: 46074908. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:46,053][59242] Avg episode reward: [(0, '33.350'), (1, '37.940')] +[2023-10-09 07:51:46,184][60144] Updated weights for policy 1, policy_version 90492 (0.0009) +[2023-10-09 07:51:46,360][60143] Updated weights for policy 0, policy_version 89472 (0.0008) +[2023-10-09 07:51:50,212][60144] Updated weights for policy 1, policy_version 90502 (0.0008) +[2023-10-09 07:51:50,414][60143] Updated weights for policy 0, policy_version 89482 (0.0007) +[2023-10-09 07:51:50,584][60144] Updated weights for policy 1, policy_version 90512 (0.0007) +[2023-10-09 07:51:50,785][60143] Updated weights for policy 0, policy_version 89492 (0.0009) +[2023-10-09 07:51:50,948][60144] Updated weights for policy 1, policy_version 90522 (0.0008) +[2023-10-09 07:51:51,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 184287232. 
Throughput: 0: 1720.8, 1: 1725.5. Samples: 46085028. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-09 07:51:51,053][59242] Avg episode reward: [(0, '33.770'), (1, '36.450')] +[2023-10-09 07:51:51,160][60143] Updated weights for policy 0, policy_version 89502 (0.0009) +[2023-10-09 07:51:54,905][60144] Updated weights for policy 1, policy_version 90532 (0.0008) +[2023-10-09 07:51:55,025][60143] Updated weights for policy 0, policy_version 89512 (0.0009) +[2023-10-09 07:51:55,263][60144] Updated weights for policy 1, policy_version 90542 (0.0007) +[2023-10-09 07:51:55,389][60143] Updated weights for policy 0, policy_version 89522 (0.0010) +[2023-10-09 07:51:55,625][60144] Updated weights for policy 1, policy_version 90552 (0.0007) +[2023-10-09 07:51:55,750][60143] Updated weights for policy 0, policy_version 89532 (0.0009) +[2023-10-09 07:51:56,052][59242] Fps is (10 sec: 19660.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 184418304. Throughput: 0: 1721.1, 1: 1725.3. Samples: 46106244. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:51:56,053][59242] Avg episode reward: [(0, '32.530'), (1, '36.930')] +[2023-10-09 07:51:59,590][60144] Updated weights for policy 1, policy_version 90562 (0.0008) +[2023-10-09 07:51:59,830][60143] Updated weights for policy 0, policy_version 89542 (0.0009) +[2023-10-09 07:51:59,946][60144] Updated weights for policy 1, policy_version 90572 (0.0007) +[2023-10-09 07:52:00,193][60143] Updated weights for policy 0, policy_version 89552 (0.0008) +[2023-10-09 07:52:00,321][60144] Updated weights for policy 1, policy_version 90582 (0.0009) +[2023-10-09 07:52:00,567][60143] Updated weights for policy 0, policy_version 89562 (0.0008) +[2023-10-09 07:52:00,684][60144] Updated weights for policy 1, policy_version 90592 (0.0007) +[2023-10-09 07:52:01,052][59242] Fps is (10 sec: 19661.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 184483840. Throughput: 0: 1696.6, 1: 1699.6. Samples: 46125170. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:01,052][59242] Avg episode reward: [(0, '32.520'), (1, '37.700')] +[2023-10-09 07:52:04,360][60143] Updated weights for policy 0, policy_version 89572 (0.0008) +[2023-10-09 07:52:04,680][60144] Updated weights for policy 1, policy_version 90602 (0.0007) +[2023-10-09 07:52:04,727][60143] Updated weights for policy 0, policy_version 89582 (0.0009) +[2023-10-09 07:52:05,051][60144] Updated weights for policy 1, policy_version 90612 (0.0007) +[2023-10-09 07:52:05,089][60143] Updated weights for policy 0, policy_version 89592 (0.0009) +[2023-10-09 07:52:05,419][60144] Updated weights for policy 1, policy_version 90622 (0.0009) +[2023-10-09 07:52:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 184549376. Throughput: 0: 1721.1, 1: 1723.8. Samples: 46136544. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:06,053][59242] Avg episode reward: [(0, '33.600'), (1, '38.300')] +[2023-10-09 07:52:09,120][60143] Updated weights for policy 0, policy_version 89602 (0.0009) +[2023-10-09 07:52:09,343][60144] Updated weights for policy 1, policy_version 90632 (0.0008) +[2023-10-09 07:52:09,496][60143] Updated weights for policy 0, policy_version 89612 (0.0008) +[2023-10-09 07:52:09,707][60144] Updated weights for policy 1, policy_version 90642 (0.0007) +[2023-10-09 07:52:09,860][60143] Updated weights for policy 0, policy_version 89622 (0.0010) +[2023-10-09 07:52:10,077][60144] Updated weights for policy 1, policy_version 90652 (0.0007) +[2023-10-09 07:52:10,232][60143] Updated weights for policy 0, policy_version 89632 (0.0010) +[2023-10-09 07:52:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 184614912. Throughput: 0: 1707.6, 1: 1707.5. Samples: 46156578. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:11,053][59242] Avg episode reward: [(0, '36.300'), (1, '35.960')] +[2023-10-09 07:52:14,129][60144] Updated weights for policy 1, policy_version 90662 (0.0007) +[2023-10-09 07:52:14,162][60143] Updated weights for policy 0, policy_version 89642 (0.0009) +[2023-10-09 07:52:14,495][60144] Updated weights for policy 1, policy_version 90672 (0.0008) +[2023-10-09 07:52:14,531][60143] Updated weights for policy 0, policy_version 89652 (0.0010) +[2023-10-09 07:52:14,863][60144] Updated weights for policy 1, policy_version 90682 (0.0009) +[2023-10-09 07:52:14,892][60143] Updated weights for policy 0, policy_version 89662 (0.0010) +[2023-10-09 07:52:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 184680448. Throughput: 0: 1688.8, 1: 1691.4. Samples: 46176302. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:16,053][59242] Avg episode reward: [(0, '37.090'), (1, '35.900')] +[2023-10-09 07:52:18,807][60144] Updated weights for policy 1, policy_version 90692 (0.0007) +[2023-10-09 07:52:18,905][60143] Updated weights for policy 0, policy_version 89672 (0.0009) +[2023-10-09 07:52:19,175][60144] Updated weights for policy 1, policy_version 90702 (0.0008) +[2023-10-09 07:52:19,263][60143] Updated weights for policy 0, policy_version 89682 (0.0009) +[2023-10-09 07:52:19,542][60144] Updated weights for policy 1, policy_version 90712 (0.0008) +[2023-10-09 07:52:19,640][60143] Updated weights for policy 0, policy_version 89692 (0.0007) +[2023-10-09 07:52:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 184745984. Throughput: 0: 1714.9, 1: 1723.7. Samples: 46188120. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:21,053][59242] Avg episode reward: [(0, '37.140'), (1, '36.540')] +[2023-10-09 07:52:23,505][60144] Updated weights for policy 1, policy_version 90722 (0.0008) +[2023-10-09 07:52:23,529][60143] Updated weights for policy 0, policy_version 89702 (0.0007) +[2023-10-09 07:52:23,905][60143] Updated weights for policy 0, policy_version 89712 (0.0008) +[2023-10-09 07:52:23,914][60144] Updated weights for policy 1, policy_version 90732 (0.0010) +[2023-10-09 07:52:24,277][60143] Updated weights for policy 0, policy_version 89722 (0.0009) +[2023-10-09 07:52:24,290][60144] Updated weights for policy 1, policy_version 90742 (0.0009) +[2023-10-09 07:52:24,656][60144] Updated weights for policy 1, policy_version 90752 (0.0008) +[2023-10-09 07:52:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 184811520. Throughput: 0: 1691.1, 1: 1696.5. Samples: 46206798. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:26,053][59242] Avg episode reward: [(0, '37.410'), (1, '36.240')] +[2023-10-09 07:52:28,153][60143] Updated weights for policy 0, policy_version 89732 (0.0009) +[2023-10-09 07:52:28,515][60143] Updated weights for policy 0, policy_version 89742 (0.0009) +[2023-10-09 07:52:28,771][60144] Updated weights for policy 1, policy_version 90762 (0.0008) +[2023-10-09 07:52:28,877][60143] Updated weights for policy 0, policy_version 89752 (0.0008) +[2023-10-09 07:52:29,131][60144] Updated weights for policy 1, policy_version 90772 (0.0009) +[2023-10-09 07:52:29,501][60144] Updated weights for policy 1, policy_version 90782 (0.0008) +[2023-10-09 07:52:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 184877056. Throughput: 0: 1702.9, 1: 1700.8. Samples: 46228076. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:31,053][59242] Avg episode reward: [(0, '37.310'), (1, '35.210')] +[2023-10-09 07:52:33,029][60143] Updated weights for policy 0, policy_version 89762 (0.0008) +[2023-10-09 07:52:33,334][60144] Updated weights for policy 1, policy_version 90792 (0.0008) +[2023-10-09 07:52:33,402][60143] Updated weights for policy 0, policy_version 89772 (0.0008) +[2023-10-09 07:52:33,700][60144] Updated weights for policy 1, policy_version 90802 (0.0007) +[2023-10-09 07:52:33,771][60143] Updated weights for policy 0, policy_version 89782 (0.0007) +[2023-10-09 07:52:34,061][60144] Updated weights for policy 1, policy_version 90812 (0.0009) +[2023-10-09 07:52:34,141][60143] Updated weights for policy 0, policy_version 89792 (0.0008) +[2023-10-09 07:52:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 184942592. Throughput: 0: 1709.2, 1: 1708.2. Samples: 46238810. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:36,053][59242] Avg episode reward: [(0, '36.110'), (1, '36.050')] +[2023-10-09 07:52:37,872][60144] Updated weights for policy 1, policy_version 90822 (0.0008) +[2023-10-09 07:52:38,142][60143] Updated weights for policy 0, policy_version 89802 (0.0007) +[2023-10-09 07:52:38,238][60144] Updated weights for policy 1, policy_version 90832 (0.0009) +[2023-10-09 07:52:38,509][60143] Updated weights for policy 0, policy_version 89812 (0.0009) +[2023-10-09 07:52:38,605][60144] Updated weights for policy 1, policy_version 90842 (0.0008) +[2023-10-09 07:52:38,881][60143] Updated weights for policy 0, policy_version 89822 (0.0009) +[2023-10-09 07:52:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 185008128. Throughput: 0: 1687.3, 1: 1693.3. Samples: 46258372. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:41,053][59242] Avg episode reward: [(0, '35.670'), (1, '35.680')] +[2023-10-09 07:52:42,506][60144] Updated weights for policy 1, policy_version 90852 (0.0008) +[2023-10-09 07:52:42,875][60144] Updated weights for policy 1, policy_version 90862 (0.0008) +[2023-10-09 07:52:43,112][60143] Updated weights for policy 0, policy_version 89832 (0.0007) +[2023-10-09 07:52:43,230][60144] Updated weights for policy 1, policy_version 90872 (0.0008) +[2023-10-09 07:52:43,475][60143] Updated weights for policy 0, policy_version 89842 (0.0008) +[2023-10-09 07:52:43,847][60143] Updated weights for policy 0, policy_version 89852 (0.0010) +[2023-10-09 07:52:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 185073664. Throughput: 0: 1709.7, 1: 1721.4. Samples: 46279568. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-09 07:52:46,053][59242] Avg episode reward: [(0, '35.730'), (1, '37.290')] +[2023-10-09 07:52:47,176][60144] Updated weights for policy 1, policy_version 90882 (0.0008) +[2023-10-09 07:52:47,543][60144] Updated weights for policy 1, policy_version 90892 (0.0009) +[2023-10-09 07:52:47,698][60143] Updated weights for policy 0, policy_version 89862 (0.0009) +[2023-10-09 07:52:47,906][60144] Updated weights for policy 1, policy_version 90902 (0.0008) +[2023-10-09 07:52:48,084][60143] Updated weights for policy 0, policy_version 89872 (0.0009) +[2023-10-09 07:52:48,269][60144] Updated weights for policy 1, policy_version 90912 (0.0008) +[2023-10-09 07:52:48,449][60143] Updated weights for policy 0, policy_version 89882 (0.0010) +[2023-10-09 07:52:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 185139200. Throughput: 0: 1693.4, 1: 1697.2. Samples: 46289120. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:52:51,052][59242] Avg episode reward: [(0, '35.770'), (1, '34.950')] +[2023-10-09 07:52:52,337][60144] Updated weights for policy 1, policy_version 90922 (0.0007) +[2023-10-09 07:52:52,344][60143] Updated weights for policy 0, policy_version 89892 (0.0010) +[2023-10-09 07:52:52,696][60144] Updated weights for policy 1, policy_version 90932 (0.0007) +[2023-10-09 07:52:52,703][60143] Updated weights for policy 0, policy_version 89902 (0.0009) +[2023-10-09 07:52:53,067][60144] Updated weights for policy 1, policy_version 90942 (0.0007) +[2023-10-09 07:52:53,072][60143] Updated weights for policy 0, policy_version 89912 (0.0007) +[2023-10-09 07:52:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185204736. Throughput: 0: 1699.1, 1: 1712.7. Samples: 46310108. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:52:56,053][59242] Avg episode reward: [(0, '34.560'), (1, '36.030')] +[2023-10-09 07:52:56,996][60144] Updated weights for policy 1, policy_version 90952 (0.0008) +[2023-10-09 07:52:57,043][60143] Updated weights for policy 0, policy_version 89922 (0.0008) +[2023-10-09 07:52:57,370][60144] Updated weights for policy 1, policy_version 90962 (0.0007) +[2023-10-09 07:52:57,410][60143] Updated weights for policy 0, policy_version 89932 (0.0007) +[2023-10-09 07:52:57,731][60144] Updated weights for policy 1, policy_version 90972 (0.0007) +[2023-10-09 07:52:57,776][60143] Updated weights for policy 0, policy_version 89942 (0.0010) +[2023-10-09 07:52:58,152][60143] Updated weights for policy 0, policy_version 89952 (0.0010) +[2023-10-09 07:53:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185270272. Throughput: 0: 1716.1, 1: 1728.2. Samples: 46331296. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:01,053][59242] Avg episode reward: [(0, '33.240'), (1, '34.360')] +[2023-10-09 07:53:01,777][60144] Updated weights for policy 1, policy_version 90982 (0.0008) +[2023-10-09 07:53:02,146][60144] Updated weights for policy 1, policy_version 90992 (0.0008) +[2023-10-09 07:53:02,157][60143] Updated weights for policy 0, policy_version 89962 (0.0007) +[2023-10-09 07:53:02,514][60144] Updated weights for policy 1, policy_version 91002 (0.0007) +[2023-10-09 07:53:02,531][60143] Updated weights for policy 0, policy_version 89972 (0.0007) +[2023-10-09 07:53:02,906][60143] Updated weights for policy 0, policy_version 89982 (0.0009) +[2023-10-09 07:53:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185335808. Throughput: 0: 1688.8, 1: 1700.0. Samples: 46340612. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:06,053][59242] Avg episode reward: [(0, '35.040'), (1, '34.940')] +[2023-10-09 07:53:06,402][60144] Updated weights for policy 1, policy_version 91012 (0.0008) +[2023-10-09 07:53:06,780][60144] Updated weights for policy 1, policy_version 91022 (0.0009) +[2023-10-09 07:53:07,065][60143] Updated weights for policy 0, policy_version 89992 (0.0008) +[2023-10-09 07:53:07,150][60144] Updated weights for policy 1, policy_version 91032 (0.0007) +[2023-10-09 07:53:07,427][60143] Updated weights for policy 0, policy_version 90002 (0.0007) +[2023-10-09 07:53:07,812][60143] Updated weights for policy 0, policy_version 90012 (0.0007) +[2023-10-09 07:53:11,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185401344. Throughput: 0: 1711.3, 1: 1731.0. Samples: 46361704. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:11,053][59242] Avg episode reward: [(0, '35.900'), (1, '34.300')] +[2023-10-09 07:53:11,074][60144] Updated weights for policy 1, policy_version 91042 (0.0008) +[2023-10-09 07:53:11,488][60144] Updated weights for policy 1, policy_version 91052 (0.0009) +[2023-10-09 07:53:11,782][60143] Updated weights for policy 0, policy_version 90022 (0.0009) +[2023-10-09 07:53:11,855][60144] Updated weights for policy 1, policy_version 91062 (0.0010) +[2023-10-09 07:53:12,153][60143] Updated weights for policy 0, policy_version 90032 (0.0007) +[2023-10-09 07:53:12,216][60144] Updated weights for policy 1, policy_version 91072 (0.0008) +[2023-10-09 07:53:12,522][60143] Updated weights for policy 0, policy_version 90042 (0.0007) +[2023-10-09 07:53:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185466880. Throughput: 0: 1703.8, 1: 1732.1. Samples: 46382690. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:16,053][59242] Avg episode reward: [(0, '35.940'), (1, '34.330')] +[2023-10-09 07:53:16,224][60144] Updated weights for policy 1, policy_version 91082 (0.0007) +[2023-10-09 07:53:16,500][60143] Updated weights for policy 0, policy_version 90052 (0.0008) +[2023-10-09 07:53:16,591][60144] Updated weights for policy 1, policy_version 91092 (0.0007) +[2023-10-09 07:53:16,867][60143] Updated weights for policy 0, policy_version 90062 (0.0007) +[2023-10-09 07:53:16,955][60144] Updated weights for policy 1, policy_version 91102 (0.0010) +[2023-10-09 07:53:17,240][60143] Updated weights for policy 0, policy_version 90072 (0.0009) +[2023-10-09 07:53:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185532416. Throughput: 0: 1688.3, 1: 1716.6. Samples: 46392030. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:21,053][59242] Avg episode reward: [(0, '35.330'), (1, '34.660')] +[2023-10-09 07:53:21,061][60144] Updated weights for policy 1, policy_version 91112 (0.0009) +[2023-10-09 07:53:21,166][60143] Updated weights for policy 0, policy_version 90082 (0.0008) +[2023-10-09 07:53:21,414][60144] Updated weights for policy 1, policy_version 91122 (0.0009) +[2023-10-09 07:53:21,530][60143] Updated weights for policy 0, policy_version 90092 (0.0008) +[2023-10-09 07:53:21,775][60144] Updated weights for policy 1, policy_version 91132 (0.0009) +[2023-10-09 07:53:21,904][60143] Updated weights for policy 0, policy_version 90102 (0.0008) +[2023-10-09 07:53:22,266][60143] Updated weights for policy 0, policy_version 90112 (0.0007) +[2023-10-09 07:53:25,594][60144] Updated weights for policy 1, policy_version 91142 (0.0010) +[2023-10-09 07:53:25,958][60144] Updated weights for policy 1, policy_version 91152 (0.0010) +[2023-10-09 07:53:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185597952. Throughput: 0: 1709.1, 1: 1733.5. Samples: 46413290. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:26,053][59242] Avg episode reward: [(0, '35.920'), (1, '35.230')] +[2023-10-09 07:53:26,280][60143] Updated weights for policy 0, policy_version 90122 (0.0008) +[2023-10-09 07:53:26,320][60144] Updated weights for policy 1, policy_version 91162 (0.0009) +[2023-10-09 07:53:26,653][60143] Updated weights for policy 0, policy_version 90132 (0.0009) +[2023-10-09 07:53:27,021][60143] Updated weights for policy 0, policy_version 90142 (0.0007) +[2023-10-09 07:53:30,456][60144] Updated weights for policy 1, policy_version 91172 (0.0009) +[2023-10-09 07:53:30,833][60144] Updated weights for policy 1, policy_version 91182 (0.0009) +[2023-10-09 07:53:30,943][60143] Updated weights for policy 0, policy_version 90152 (0.0007) +[2023-10-09 07:53:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185663488. Throughput: 0: 1712.4, 1: 1721.6. Samples: 46434102. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:31,053][59242] Avg episode reward: [(0, '35.170'), (1, '35.940')] +[2023-10-09 07:53:31,197][60144] Updated weights for policy 1, policy_version 91192 (0.0008) +[2023-10-09 07:53:31,314][60143] Updated weights for policy 0, policy_version 90162 (0.0010) +[2023-10-09 07:53:31,481][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000091200_93388800.pth... +[2023-10-09 07:53:31,519][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000089568_91717632.pth +[2023-10-09 07:53:31,689][60143] Updated weights for policy 0, policy_version 90172 (0.0009) +[2023-10-09 07:53:31,835][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000090176_92340224.pth... 
+[2023-10-09 07:53:31,864][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000088576_90701824.pth +[2023-10-09 07:53:35,153][60144] Updated weights for policy 1, policy_version 91202 (0.0009) +[2023-10-09 07:53:35,508][60144] Updated weights for policy 1, policy_version 91212 (0.0009) +[2023-10-09 07:53:35,874][60144] Updated weights for policy 1, policy_version 91222 (0.0008) +[2023-10-09 07:53:35,997][60143] Updated weights for policy 0, policy_version 90182 (0.0009) +[2023-10-09 07:53:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185729024. Throughput: 0: 1702.5, 1: 1727.3. Samples: 46443462. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:36,053][59242] Avg episode reward: [(0, '34.840'), (1, '34.990')] +[2023-10-09 07:53:36,231][60144] Updated weights for policy 1, policy_version 91232 (0.0008) +[2023-10-09 07:53:36,380][60143] Updated weights for policy 0, policy_version 90192 (0.0008) +[2023-10-09 07:53:36,742][60143] Updated weights for policy 0, policy_version 90202 (0.0008) +[2023-10-09 07:53:40,348][60144] Updated weights for policy 1, policy_version 91242 (0.0009) +[2023-10-09 07:53:40,719][60144] Updated weights for policy 1, policy_version 91252 (0.0008) +[2023-10-09 07:53:40,887][60143] Updated weights for policy 0, policy_version 90212 (0.0010) +[2023-10-09 07:53:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 185794560. Throughput: 0: 1700.3, 1: 1728.0. Samples: 46464382. 
Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:41,053][59242] Avg episode reward: [(0, '34.160'), (1, '35.540')] +[2023-10-09 07:53:41,075][60144] Updated weights for policy 1, policy_version 91262 (0.0007) +[2023-10-09 07:53:41,253][60143] Updated weights for policy 0, policy_version 90222 (0.0010) +[2023-10-09 07:53:41,630][60143] Updated weights for policy 0, policy_version 90232 (0.0011) +[2023-10-09 07:53:44,807][60144] Updated weights for policy 1, policy_version 91272 (0.0008) +[2023-10-09 07:53:45,182][60144] Updated weights for policy 1, policy_version 91282 (0.0009) +[2023-10-09 07:53:45,450][60143] Updated weights for policy 0, policy_version 90242 (0.0009) +[2023-10-09 07:53:45,546][60144] Updated weights for policy 1, policy_version 91292 (0.0007) +[2023-10-09 07:53:45,817][60143] Updated weights for policy 0, policy_version 90252 (0.0008) +[2023-10-09 07:53:46,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 185892864. Throughput: 0: 1698.5, 1: 1711.2. Samples: 46484732. Policy #0 lag: (min: 31.0, avg: 36.7, max: 63.0) +[2023-10-09 07:53:46,053][59242] Avg episode reward: [(0, '34.790'), (1, '35.210')] +[2023-10-09 07:53:46,192][60143] Updated weights for policy 0, policy_version 90262 (0.0010) +[2023-10-09 07:53:46,568][60143] Updated weights for policy 0, policy_version 90272 (0.0007) +[2023-10-09 07:53:49,599][60144] Updated weights for policy 1, policy_version 91302 (0.0009) +[2023-10-09 07:53:49,972][60144] Updated weights for policy 1, policy_version 91312 (0.0009) +[2023-10-09 07:53:50,338][60144] Updated weights for policy 1, policy_version 91322 (0.0009) +[2023-10-09 07:53:50,731][60143] Updated weights for policy 0, policy_version 90282 (0.0007) +[2023-10-09 07:53:51,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 185958400. Throughput: 0: 1696.7, 1: 1734.1. Samples: 46495000. 
Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:53:51,053][59242] Avg episode reward: [(0, '34.340'), (1, '35.620')] +[2023-10-09 07:53:51,099][60143] Updated weights for policy 0, policy_version 90292 (0.0007) +[2023-10-09 07:53:51,463][60143] Updated weights for policy 0, policy_version 90302 (0.0007) +[2023-10-09 07:53:54,186][60144] Updated weights for policy 1, policy_version 91332 (0.0009) +[2023-10-09 07:53:54,554][60144] Updated weights for policy 1, policy_version 91342 (0.0008) +[2023-10-09 07:53:54,916][60144] Updated weights for policy 1, policy_version 91352 (0.0008) +[2023-10-09 07:53:55,410][60143] Updated weights for policy 0, policy_version 90312 (0.0008) +[2023-10-09 07:53:55,786][60143] Updated weights for policy 0, policy_version 90322 (0.0007) +[2023-10-09 07:53:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 186023936. Throughput: 0: 1701.4, 1: 1720.2. Samples: 46515674. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:53:56,053][59242] Avg episode reward: [(0, '34.100'), (1, '36.740')] +[2023-10-09 07:53:56,151][60143] Updated weights for policy 0, policy_version 90332 (0.0008) +[2023-10-09 07:53:58,787][60144] Updated weights for policy 1, policy_version 91362 (0.0009) +[2023-10-09 07:53:59,190][60144] Updated weights for policy 1, policy_version 91372 (0.0009) +[2023-10-09 07:53:59,551][60144] Updated weights for policy 1, policy_version 91382 (0.0008) +[2023-10-09 07:53:59,923][60144] Updated weights for policy 1, policy_version 91392 (0.0008) +[2023-10-09 07:54:00,264][60143] Updated weights for policy 0, policy_version 90342 (0.0009) +[2023-10-09 07:54:00,629][60143] Updated weights for policy 0, policy_version 90352 (0.0009) +[2023-10-09 07:54:01,010][60143] Updated weights for policy 0, policy_version 90362 (0.0012) +[2023-10-09 07:54:01,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 186089472. 
Throughput: 0: 1689.0, 1: 1705.4. Samples: 46535438. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:01,054][59242] Avg episode reward: [(0, '34.620'), (1, '36.530')] +[2023-10-09 07:54:03,938][60144] Updated weights for policy 1, policy_version 91402 (0.0008) +[2023-10-09 07:54:04,305][60144] Updated weights for policy 1, policy_version 91412 (0.0008) +[2023-10-09 07:54:04,684][60144] Updated weights for policy 1, policy_version 91422 (0.0010) +[2023-10-09 07:54:05,049][60143] Updated weights for policy 0, policy_version 90372 (0.0009) +[2023-10-09 07:54:05,416][60143] Updated weights for policy 0, policy_version 90382 (0.0009) +[2023-10-09 07:54:05,786][60143] Updated weights for policy 0, policy_version 90392 (0.0009) +[2023-10-09 07:54:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 186155008. Throughput: 0: 1699.2, 1: 1735.1. Samples: 46546572. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:06,053][59242] Avg episode reward: [(0, '35.180'), (1, '36.910')] +[2023-10-09 07:54:08,698][60144] Updated weights for policy 1, policy_version 91432 (0.0009) +[2023-10-09 07:54:09,063][60144] Updated weights for policy 1, policy_version 91442 (0.0007) +[2023-10-09 07:54:09,437][60144] Updated weights for policy 1, policy_version 91452 (0.0009) +[2023-10-09 07:54:09,936][60143] Updated weights for policy 0, policy_version 90402 (0.0009) +[2023-10-09 07:54:10,306][60143] Updated weights for policy 0, policy_version 90412 (0.0009) +[2023-10-09 07:54:10,672][60143] Updated weights for policy 0, policy_version 90422 (0.0009) +[2023-10-09 07:54:11,036][60143] Updated weights for policy 0, policy_version 90432 (0.0009) +[2023-10-09 07:54:11,052][59242] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186253312. Throughput: 0: 1698.0, 1: 1705.6. Samples: 46566456. 
Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:11,052][59242] Avg episode reward: [(0, '35.710'), (1, '35.170')] +[2023-10-09 07:54:13,225][60144] Updated weights for policy 1, policy_version 91462 (0.0009) +[2023-10-09 07:54:13,595][60144] Updated weights for policy 1, policy_version 91472 (0.0010) +[2023-10-09 07:54:13,960][60144] Updated weights for policy 1, policy_version 91482 (0.0010) +[2023-10-09 07:54:15,046][60143] Updated weights for policy 0, policy_version 90442 (0.0007) +[2023-10-09 07:54:15,414][60143] Updated weights for policy 0, policy_version 90452 (0.0008) +[2023-10-09 07:54:15,780][60143] Updated weights for policy 0, policy_version 90462 (0.0008) +[2023-10-09 07:54:16,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186318848. Throughput: 0: 1677.2, 1: 1717.3. Samples: 46586854. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:16,053][59242] Avg episode reward: [(0, '35.630'), (1, '36.080')] +[2023-10-09 07:54:17,795][60144] Updated weights for policy 1, policy_version 91492 (0.0010) +[2023-10-09 07:54:18,165][60144] Updated weights for policy 1, policy_version 91502 (0.0008) +[2023-10-09 07:54:18,539][60144] Updated weights for policy 1, policy_version 91512 (0.0007) +[2023-10-09 07:54:19,624][60143] Updated weights for policy 0, policy_version 90472 (0.0008) +[2023-10-09 07:54:19,993][60143] Updated weights for policy 0, policy_version 90482 (0.0009) +[2023-10-09 07:54:20,372][60143] Updated weights for policy 0, policy_version 90492 (0.0010) +[2023-10-09 07:54:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186384384. Throughput: 0: 1705.2, 1: 1720.0. Samples: 46597598. 
Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:21,052][59242] Avg episode reward: [(0, '36.500'), (1, '34.370')] +[2023-10-09 07:54:22,578][60144] Updated weights for policy 1, policy_version 91522 (0.0008) +[2023-10-09 07:54:22,944][60144] Updated weights for policy 1, policy_version 91532 (0.0007) +[2023-10-09 07:54:23,320][60144] Updated weights for policy 1, policy_version 91542 (0.0007) +[2023-10-09 07:54:23,680][60144] Updated weights for policy 1, policy_version 91552 (0.0008) +[2023-10-09 07:54:24,346][60143] Updated weights for policy 0, policy_version 90502 (0.0008) +[2023-10-09 07:54:24,719][60143] Updated weights for policy 0, policy_version 90512 (0.0008) +[2023-10-09 07:54:25,086][60143] Updated weights for policy 0, policy_version 90522 (0.0009) +[2023-10-09 07:54:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186449920. Throughput: 0: 1699.2, 1: 1718.7. Samples: 46618186. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:26,052][59242] Avg episode reward: [(0, '35.740'), (1, '34.000')] +[2023-10-09 07:54:27,541][60144] Updated weights for policy 1, policy_version 91562 (0.0007) +[2023-10-09 07:54:27,896][60144] Updated weights for policy 1, policy_version 91572 (0.0007) +[2023-10-09 07:54:28,268][60144] Updated weights for policy 1, policy_version 91582 (0.0009) +[2023-10-09 07:54:28,962][60143] Updated weights for policy 0, policy_version 90532 (0.0009) +[2023-10-09 07:54:29,337][60143] Updated weights for policy 0, policy_version 90542 (0.0010) +[2023-10-09 07:54:29,719][60143] Updated weights for policy 0, policy_version 90552 (0.0008) +[2023-10-09 07:54:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186515456. Throughput: 0: 1682.2, 1: 1743.9. Samples: 46638908. 
Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:31,053][59242] Avg episode reward: [(0, '34.470'), (1, '34.800')] +[2023-10-09 07:54:32,031][60144] Updated weights for policy 1, policy_version 91592 (0.0010) +[2023-10-09 07:54:32,396][60144] Updated weights for policy 1, policy_version 91602 (0.0008) +[2023-10-09 07:54:32,776][60144] Updated weights for policy 1, policy_version 91612 (0.0008) +[2023-10-09 07:54:33,711][60143] Updated weights for policy 0, policy_version 90562 (0.0009) +[2023-10-09 07:54:34,080][60143] Updated weights for policy 0, policy_version 90572 (0.0008) +[2023-10-09 07:54:34,448][60143] Updated weights for policy 0, policy_version 90582 (0.0008) +[2023-10-09 07:54:34,822][60143] Updated weights for policy 0, policy_version 90592 (0.0007) +[2023-10-09 07:54:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186580992. Throughput: 0: 1714.6, 1: 1720.4. Samples: 46649572. Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:36,052][59242] Avg episode reward: [(0, '34.550'), (1, '34.200')] +[2023-10-09 07:54:36,765][60144] Updated weights for policy 1, policy_version 91622 (0.0007) +[2023-10-09 07:54:37,138][60144] Updated weights for policy 1, policy_version 91632 (0.0009) +[2023-10-09 07:54:37,512][60144] Updated weights for policy 1, policy_version 91642 (0.0008) +[2023-10-09 07:54:38,906][60143] Updated weights for policy 0, policy_version 90602 (0.0008) +[2023-10-09 07:54:39,282][60143] Updated weights for policy 0, policy_version 90612 (0.0007) +[2023-10-09 07:54:39,650][60143] Updated weights for policy 0, policy_version 90622 (0.0008) +[2023-10-09 07:54:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 186646528. Throughput: 0: 1692.3, 1: 1733.2. Samples: 46669820. 
Policy #0 lag: (min: 26.0, avg: 36.6, max: 58.0) +[2023-10-09 07:54:41,053][59242] Avg episode reward: [(0, '35.710'), (1, '33.690')] +[2023-10-09 07:54:41,358][60144] Updated weights for policy 1, policy_version 91652 (0.0009) +[2023-10-09 07:54:41,722][60144] Updated weights for policy 1, policy_version 91662 (0.0008) +[2023-10-09 07:54:42,093][60144] Updated weights for policy 1, policy_version 91672 (0.0010) +[2023-10-09 07:54:43,473][60143] Updated weights for policy 0, policy_version 90632 (0.0008) +[2023-10-09 07:54:43,845][60143] Updated weights for policy 0, policy_version 90642 (0.0008) +[2023-10-09 07:54:44,220][60143] Updated weights for policy 0, policy_version 90652 (0.0009) +[2023-10-09 07:54:45,992][60144] Updated weights for policy 1, policy_version 91682 (0.0007) +[2023-10-09 07:54:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 186712064. Throughput: 0: 1700.0, 1: 1752.9. Samples: 46690816. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:54:46,052][59242] Avg episode reward: [(0, '34.500'), (1, '33.790')] +[2023-10-09 07:54:46,366][60144] Updated weights for policy 1, policy_version 91692 (0.0007) +[2023-10-09 07:54:46,730][60144] Updated weights for policy 1, policy_version 91702 (0.0007) +[2023-10-09 07:54:47,096][60144] Updated weights for policy 1, policy_version 91712 (0.0007) +[2023-10-09 07:54:48,162][60143] Updated weights for policy 0, policy_version 90662 (0.0009) +[2023-10-09 07:54:48,526][60143] Updated weights for policy 0, policy_version 90672 (0.0008) +[2023-10-09 07:54:48,903][60143] Updated weights for policy 0, policy_version 90682 (0.0007) +[2023-10-09 07:54:51,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 186777600. Throughput: 0: 1708.3, 1: 1719.1. Samples: 46700806. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:54:51,054][59242] Avg episode reward: [(0, '34.740'), (1, '33.700')] +[2023-10-09 07:54:51,121][60144] Updated weights for policy 1, policy_version 91722 (0.0009) +[2023-10-09 07:54:51,503][60144] Updated weights for policy 1, policy_version 91732 (0.0008) +[2023-10-09 07:54:51,865][60144] Updated weights for policy 1, policy_version 91742 (0.0008) +[2023-10-09 07:54:52,826][60143] Updated weights for policy 0, policy_version 90692 (0.0010) +[2023-10-09 07:54:53,198][60143] Updated weights for policy 0, policy_version 90702 (0.0009) +[2023-10-09 07:54:53,567][60143] Updated weights for policy 0, policy_version 90712 (0.0007) +[2023-10-09 07:54:55,809][60144] Updated weights for policy 1, policy_version 91752 (0.0009) +[2023-10-09 07:54:56,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 186843136. Throughput: 0: 1695.1, 1: 1747.5. Samples: 46721370. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:54:56,053][59242] Avg episode reward: [(0, '35.650'), (1, '33.570')] +[2023-10-09 07:54:56,176][60144] Updated weights for policy 1, policy_version 91762 (0.0008) +[2023-10-09 07:54:56,537][60144] Updated weights for policy 1, policy_version 91772 (0.0008) +[2023-10-09 07:54:57,575][60143] Updated weights for policy 0, policy_version 90722 (0.0008) +[2023-10-09 07:54:57,950][60143] Updated weights for policy 0, policy_version 90732 (0.0009) +[2023-10-09 07:54:58,310][60143] Updated weights for policy 0, policy_version 90742 (0.0010) +[2023-10-09 07:54:58,687][60143] Updated weights for policy 0, policy_version 90752 (0.0009) +[2023-10-09 07:55:00,470][60144] Updated weights for policy 1, policy_version 91782 (0.0009) +[2023-10-09 07:55:00,840][60144] Updated weights for policy 1, policy_version 91792 (0.0010) +[2023-10-09 07:55:01,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 186908672. 
Throughput: 0: 1709.7, 1: 1739.0. Samples: 46742048. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:01,052][59242] Avg episode reward: [(0, '35.210'), (1, '33.490')] +[2023-10-09 07:55:01,200][60144] Updated weights for policy 1, policy_version 91802 (0.0007) +[2023-10-09 07:55:02,863][60143] Updated weights for policy 0, policy_version 90762 (0.0008) +[2023-10-09 07:55:03,228][60143] Updated weights for policy 0, policy_version 90772 (0.0009) +[2023-10-09 07:55:03,601][60143] Updated weights for policy 0, policy_version 90782 (0.0009) +[2023-10-09 07:55:05,183][60144] Updated weights for policy 1, policy_version 91812 (0.0007) +[2023-10-09 07:55:05,545][60144] Updated weights for policy 1, policy_version 91822 (0.0008) +[2023-10-09 07:55:05,917][60144] Updated weights for policy 1, policy_version 91832 (0.0009) +[2023-10-09 07:55:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 186974208. Throughput: 0: 1692.2, 1: 1739.5. Samples: 46752022. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:06,052][59242] Avg episode reward: [(0, '35.160'), (1, '34.570')] +[2023-10-09 07:55:07,828][60143] Updated weights for policy 0, policy_version 90792 (0.0008) +[2023-10-09 07:55:08,201][60143] Updated weights for policy 0, policy_version 90802 (0.0009) +[2023-10-09 07:55:08,574][60143] Updated weights for policy 0, policy_version 90812 (0.0008) +[2023-10-09 07:55:09,829][60144] Updated weights for policy 1, policy_version 91842 (0.0008) +[2023-10-09 07:55:10,199][60144] Updated weights for policy 1, policy_version 91852 (0.0008) +[2023-10-09 07:55:10,574][60144] Updated weights for policy 1, policy_version 91862 (0.0008) +[2023-10-09 07:55:10,943][60144] Updated weights for policy 1, policy_version 91872 (0.0009) +[2023-10-09 07:55:11,052][59242] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 187072512. Throughput: 0: 1700.4, 1: 1740.7. Samples: 46773040. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:11,053][59242] Avg episode reward: [(0, '36.380'), (1, '35.010')] +[2023-10-09 07:55:12,379][60143] Updated weights for policy 0, policy_version 90822 (0.0008) +[2023-10-09 07:55:12,761][60143] Updated weights for policy 0, policy_version 90832 (0.0007) +[2023-10-09 07:55:13,123][60143] Updated weights for policy 0, policy_version 90842 (0.0007) +[2023-10-09 07:55:14,874][60144] Updated weights for policy 1, policy_version 91882 (0.0010) +[2023-10-09 07:55:15,245][60144] Updated weights for policy 1, policy_version 91892 (0.0009) +[2023-10-09 07:55:15,603][60144] Updated weights for policy 1, policy_version 91902 (0.0009) +[2023-10-09 07:55:16,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 187138048. Throughput: 0: 1719.0, 1: 1708.1. Samples: 46793128. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:16,053][59242] Avg episode reward: [(0, '35.620'), (1, '34.970')] +[2023-10-09 07:55:17,027][60143] Updated weights for policy 0, policy_version 90852 (0.0008) +[2023-10-09 07:55:17,395][60143] Updated weights for policy 0, policy_version 90862 (0.0011) +[2023-10-09 07:55:17,756][60143] Updated weights for policy 0, policy_version 90872 (0.0010) +[2023-10-09 07:55:19,462][60144] Updated weights for policy 1, policy_version 91912 (0.0007) +[2023-10-09 07:55:19,833][60144] Updated weights for policy 1, policy_version 91922 (0.0007) +[2023-10-09 07:55:20,199][60144] Updated weights for policy 1, policy_version 91932 (0.0007) +[2023-10-09 07:55:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187203584. Throughput: 0: 1687.7, 1: 1737.1. Samples: 46803690. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:21,053][59242] Avg episode reward: [(0, '34.270'), (1, '34.940')] +[2023-10-09 07:55:21,598][60143] Updated weights for policy 0, policy_version 90882 (0.0011) +[2023-10-09 07:55:21,962][60143] Updated weights for policy 0, policy_version 90892 (0.0010) +[2023-10-09 07:55:22,329][60143] Updated weights for policy 0, policy_version 90902 (0.0011) +[2023-10-09 07:55:22,700][60143] Updated weights for policy 0, policy_version 90912 (0.0009) +[2023-10-09 07:55:24,218][60144] Updated weights for policy 1, policy_version 91942 (0.0008) +[2023-10-09 07:55:24,583][60144] Updated weights for policy 1, policy_version 91952 (0.0007) +[2023-10-09 07:55:24,954][60144] Updated weights for policy 1, policy_version 91962 (0.0008) +[2023-10-09 07:55:26,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187269120. Throughput: 0: 1714.4, 1: 1725.2. Samples: 46824602. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:26,053][59242] Avg episode reward: [(0, '34.390'), (1, '34.720')] +[2023-10-09 07:55:26,612][60143] Updated weights for policy 0, policy_version 90922 (0.0008) +[2023-10-09 07:55:26,983][60143] Updated weights for policy 0, policy_version 90932 (0.0009) +[2023-10-09 07:55:27,348][60143] Updated weights for policy 0, policy_version 90942 (0.0009) +[2023-10-09 07:55:28,904][60144] Updated weights for policy 1, policy_version 91972 (0.0008) +[2023-10-09 07:55:29,267][60144] Updated weights for policy 1, policy_version 91982 (0.0007) +[2023-10-09 07:55:29,634][60144] Updated weights for policy 1, policy_version 91992 (0.0008) +[2023-10-09 07:55:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187334656. Throughput: 0: 1724.7, 1: 1707.3. Samples: 46845258. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:31,053][59242] Avg episode reward: [(0, '33.690'), (1, '35.910')] +[2023-10-09 07:55:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000092000_94208000.pth... +[2023-10-09 07:55:31,095][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000090400_92569600.pth +[2023-10-09 07:55:31,400][60143] Updated weights for policy 0, policy_version 90952 (0.0009) +[2023-10-09 07:55:31,769][60143] Updated weights for policy 0, policy_version 90962 (0.0009) +[2023-10-09 07:55:32,154][60143] Updated weights for policy 0, policy_version 90972 (0.0008) +[2023-10-09 07:55:32,299][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000090976_93159424.pth... +[2023-10-09 07:55:32,334][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000089376_91521024.pth +[2023-10-09 07:55:33,637][60144] Updated weights for policy 1, policy_version 92002 (0.0008) +[2023-10-09 07:55:34,034][60144] Updated weights for policy 1, policy_version 92012 (0.0007) +[2023-10-09 07:55:34,412][60144] Updated weights for policy 1, policy_version 92022 (0.0009) +[2023-10-09 07:55:34,779][60144] Updated weights for policy 1, policy_version 92032 (0.0009) +[2023-10-09 07:55:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187400192. Throughput: 0: 1705.8, 1: 1741.8. Samples: 46855950. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:36,052][59242] Avg episode reward: [(0, '33.390'), (1, '36.400')] +[2023-10-09 07:55:36,073][60143] Updated weights for policy 0, policy_version 90982 (0.0007) +[2023-10-09 07:55:36,436][60143] Updated weights for policy 0, policy_version 90992 (0.0008) +[2023-10-09 07:55:36,812][60143] Updated weights for policy 0, policy_version 91002 (0.0007) +[2023-10-09 07:55:38,600][60144] Updated weights for policy 1, policy_version 92042 (0.0007) +[2023-10-09 07:55:38,968][60144] Updated weights for policy 1, policy_version 92052 (0.0009) +[2023-10-09 07:55:39,332][60144] Updated weights for policy 1, policy_version 92062 (0.0009) +[2023-10-09 07:55:40,752][60143] Updated weights for policy 0, policy_version 91012 (0.0008) +[2023-10-09 07:55:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187465728. Throughput: 0: 1724.8, 1: 1714.6. Samples: 46876142. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-09 07:55:41,053][59242] Avg episode reward: [(0, '34.500'), (1, '35.170')] +[2023-10-09 07:55:41,125][60143] Updated weights for policy 0, policy_version 91022 (0.0010) +[2023-10-09 07:55:41,488][60143] Updated weights for policy 0, policy_version 91032 (0.0010) +[2023-10-09 07:55:43,285][60144] Updated weights for policy 1, policy_version 92072 (0.0007) +[2023-10-09 07:55:43,649][60144] Updated weights for policy 1, policy_version 92082 (0.0008) +[2023-10-09 07:55:44,017][60144] Updated weights for policy 1, policy_version 92092 (0.0008) +[2023-10-09 07:55:45,431][60143] Updated weights for policy 0, policy_version 91042 (0.0010) +[2023-10-09 07:55:45,797][60143] Updated weights for policy 0, policy_version 91052 (0.0009) +[2023-10-09 07:55:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187531264. Throughput: 0: 1728.3, 1: 1723.6. Samples: 46897386. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:55:46,053][59242] Avg episode reward: [(0, '34.280'), (1, '34.610')] +[2023-10-09 07:55:46,172][60143] Updated weights for policy 0, policy_version 91062 (0.0008) +[2023-10-09 07:55:46,537][60143] Updated weights for policy 0, policy_version 91072 (0.0008) +[2023-10-09 07:55:47,667][60144] Updated weights for policy 1, policy_version 92102 (0.0008) +[2023-10-09 07:55:48,043][60144] Updated weights for policy 1, policy_version 92112 (0.0011) +[2023-10-09 07:55:48,410][60144] Updated weights for policy 1, policy_version 92122 (0.0009) +[2023-10-09 07:55:50,586][60143] Updated weights for policy 0, policy_version 91082 (0.0009) +[2023-10-09 07:55:50,950][60143] Updated weights for policy 0, policy_version 91092 (0.0008) +[2023-10-09 07:55:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 187596800. Throughput: 0: 1728.3, 1: 1723.3. Samples: 46907344. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:55:51,053][59242] Avg episode reward: [(0, '31.940'), (1, '35.200')] +[2023-10-09 07:55:51,324][60143] Updated weights for policy 0, policy_version 91102 (0.0009) +[2023-10-09 07:55:52,294][60144] Updated weights for policy 1, policy_version 92132 (0.0009) +[2023-10-09 07:55:52,664][60144] Updated weights for policy 1, policy_version 92142 (0.0009) +[2023-10-09 07:55:53,031][60144] Updated weights for policy 1, policy_version 92152 (0.0009) +[2023-10-09 07:55:55,379][60143] Updated weights for policy 0, policy_version 91112 (0.0009) +[2023-10-09 07:55:55,743][60143] Updated weights for policy 0, policy_version 91122 (0.0008) +[2023-10-09 07:55:56,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 187662336. Throughput: 0: 1728.2, 1: 1721.7. Samples: 46928284. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:55:56,052][59242] Avg episode reward: [(0, '32.960'), (1, '35.420')] +[2023-10-09 07:55:56,117][60143] Updated weights for policy 0, policy_version 91132 (0.0007) +[2023-10-09 07:55:57,105][60144] Updated weights for policy 1, policy_version 92162 (0.0007) +[2023-10-09 07:55:57,482][60144] Updated weights for policy 1, policy_version 92172 (0.0009) +[2023-10-09 07:55:57,841][60144] Updated weights for policy 1, policy_version 92182 (0.0008) +[2023-10-09 07:55:58,199][60144] Updated weights for policy 1, policy_version 92192 (0.0008) +[2023-10-09 07:55:59,988][60143] Updated weights for policy 0, policy_version 91142 (0.0007) +[2023-10-09 07:56:00,367][60143] Updated weights for policy 0, policy_version 91152 (0.0008) +[2023-10-09 07:56:00,739][60143] Updated weights for policy 0, policy_version 91162 (0.0007) +[2023-10-09 07:56:01,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 187760640. Throughput: 0: 1716.1, 1: 1748.4. Samples: 46949034. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:01,053][59242] Avg episode reward: [(0, '33.180'), (1, '35.440')] +[2023-10-09 07:56:02,329][60144] Updated weights for policy 1, policy_version 92202 (0.0007) +[2023-10-09 07:56:02,693][60144] Updated weights for policy 1, policy_version 92212 (0.0009) +[2023-10-09 07:56:03,066][60144] Updated weights for policy 1, policy_version 92222 (0.0009) +[2023-10-09 07:56:04,717][60143] Updated weights for policy 0, policy_version 91172 (0.0008) +[2023-10-09 07:56:05,096][60143] Updated weights for policy 0, policy_version 91182 (0.0008) +[2023-10-09 07:56:05,470][60143] Updated weights for policy 0, policy_version 91192 (0.0007) +[2023-10-09 07:56:06,052][59242] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 187826176. Throughput: 0: 1734.8, 1: 1717.6. Samples: 46959048. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:06,053][59242] Avg episode reward: [(0, '33.120'), (1, '34.840')] +[2023-10-09 07:56:06,824][60144] Updated weights for policy 1, policy_version 92232 (0.0010) +[2023-10-09 07:56:07,200][60144] Updated weights for policy 1, policy_version 92242 (0.0008) +[2023-10-09 07:56:07,567][60144] Updated weights for policy 1, policy_version 92252 (0.0007) +[2023-10-09 07:56:09,455][60143] Updated weights for policy 0, policy_version 91202 (0.0008) +[2023-10-09 07:56:09,822][60143] Updated weights for policy 0, policy_version 91212 (0.0007) +[2023-10-09 07:56:10,190][60143] Updated weights for policy 0, policy_version 91222 (0.0009) +[2023-10-09 07:56:10,566][60143] Updated weights for policy 0, policy_version 91232 (0.0009) +[2023-10-09 07:56:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 187891712. Throughput: 0: 1723.0, 1: 1733.1. Samples: 46980126. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:11,052][59242] Avg episode reward: [(0, '33.710'), (1, '34.160')] +[2023-10-09 07:56:11,417][60144] Updated weights for policy 1, policy_version 92262 (0.0009) +[2023-10-09 07:56:11,788][60144] Updated weights for policy 1, policy_version 92272 (0.0009) +[2023-10-09 07:56:12,159][60144] Updated weights for policy 1, policy_version 92282 (0.0007) +[2023-10-09 07:56:14,346][60143] Updated weights for policy 0, policy_version 91242 (0.0008) +[2023-10-09 07:56:14,721][60143] Updated weights for policy 0, policy_version 91252 (0.0008) +[2023-10-09 07:56:15,097][60143] Updated weights for policy 0, policy_version 91262 (0.0008) +[2023-10-09 07:56:16,046][60144] Updated weights for policy 1, policy_version 92292 (0.0007) +[2023-10-09 07:56:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 187957248. Throughput: 0: 1693.8, 1: 1756.4. Samples: 47000518. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:16,053][59242] Avg episode reward: [(0, '32.510'), (1, '34.220')] +[2023-10-09 07:56:16,418][60144] Updated weights for policy 1, policy_version 92302 (0.0007) +[2023-10-09 07:56:16,785][60144] Updated weights for policy 1, policy_version 92312 (0.0007) +[2023-10-09 07:56:19,093][60143] Updated weights for policy 0, policy_version 91272 (0.0010) +[2023-10-09 07:56:19,457][60143] Updated weights for policy 0, policy_version 91282 (0.0008) +[2023-10-09 07:56:19,835][60143] Updated weights for policy 0, policy_version 91292 (0.0007) +[2023-10-09 07:56:20,739][60144] Updated weights for policy 1, policy_version 92322 (0.0009) +[2023-10-09 07:56:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 188022784. Throughput: 0: 1725.1, 1: 1722.9. Samples: 47011110. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:21,053][59242] Avg episode reward: [(0, '32.560'), (1, '35.160')] +[2023-10-09 07:56:21,157][60144] Updated weights for policy 1, policy_version 92332 (0.0009) +[2023-10-09 07:56:21,514][60144] Updated weights for policy 1, policy_version 92342 (0.0008) +[2023-10-09 07:56:21,882][60144] Updated weights for policy 1, policy_version 92352 (0.0011) +[2023-10-09 07:56:23,778][60143] Updated weights for policy 0, policy_version 91302 (0.0008) +[2023-10-09 07:56:24,145][60143] Updated weights for policy 0, policy_version 91312 (0.0008) +[2023-10-09 07:56:24,524][60143] Updated weights for policy 0, policy_version 91322 (0.0010) +[2023-10-09 07:56:25,558][60144] Updated weights for policy 1, policy_version 92362 (0.0007) +[2023-10-09 07:56:25,927][60144] Updated weights for policy 1, policy_version 92372 (0.0007) +[2023-10-09 07:56:26,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 188088320. Throughput: 0: 1700.3, 1: 1755.2. Samples: 47031642. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:26,052][59242] Avg episode reward: [(0, '33.750'), (1, '36.030')] +[2023-10-09 07:56:26,289][60144] Updated weights for policy 1, policy_version 92382 (0.0008) +[2023-10-09 07:56:28,522][60143] Updated weights for policy 0, policy_version 91332 (0.0008) +[2023-10-09 07:56:28,888][60143] Updated weights for policy 0, policy_version 91342 (0.0009) +[2023-10-09 07:56:29,256][60143] Updated weights for policy 0, policy_version 91352 (0.0007) +[2023-10-09 07:56:30,282][60144] Updated weights for policy 1, policy_version 92392 (0.0008) +[2023-10-09 07:56:30,650][60144] Updated weights for policy 1, policy_version 92402 (0.0008) +[2023-10-09 07:56:31,015][60144] Updated weights for policy 1, policy_version 92412 (0.0009) +[2023-10-09 07:56:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 188153856. Throughput: 0: 1696.0, 1: 1743.4. Samples: 47052158. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:31,052][59242] Avg episode reward: [(0, '35.350'), (1, '35.620')] +[2023-10-09 07:56:33,183][60143] Updated weights for policy 0, policy_version 91362 (0.0009) +[2023-10-09 07:56:33,546][60143] Updated weights for policy 0, policy_version 91372 (0.0007) +[2023-10-09 07:56:33,915][60143] Updated weights for policy 0, policy_version 91382 (0.0007) +[2023-10-09 07:56:34,279][60143] Updated weights for policy 0, policy_version 91392 (0.0008) +[2023-10-09 07:56:34,740][60144] Updated weights for policy 1, policy_version 92422 (0.0008) +[2023-10-09 07:56:35,109][60144] Updated weights for policy 1, policy_version 92432 (0.0007) +[2023-10-09 07:56:35,471][60144] Updated weights for policy 1, policy_version 92442 (0.0009) +[2023-10-09 07:56:36,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 188252160. Throughput: 0: 1712.8, 1: 1753.1. Samples: 47063312. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-09 07:56:36,053][59242] Avg episode reward: [(0, '35.280'), (1, '34.140')] +[2023-10-09 07:56:38,340][60143] Updated weights for policy 0, policy_version 91402 (0.0008) +[2023-10-09 07:56:38,710][60143] Updated weights for policy 0, policy_version 91412 (0.0007) +[2023-10-09 07:56:39,087][60143] Updated weights for policy 0, policy_version 91422 (0.0008) +[2023-10-09 07:56:39,537][60144] Updated weights for policy 1, policy_version 92452 (0.0011) +[2023-10-09 07:56:39,901][60144] Updated weights for policy 1, policy_version 92462 (0.0009) +[2023-10-09 07:56:40,269][60144] Updated weights for policy 1, policy_version 92472 (0.0009) +[2023-10-09 07:56:41,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 188317696. Throughput: 0: 1698.3, 1: 1747.5. Samples: 47083344. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:56:41,053][59242] Avg episode reward: [(0, '36.360'), (1, '33.200')] +[2023-10-09 07:56:43,079][60143] Updated weights for policy 0, policy_version 91432 (0.0008) +[2023-10-09 07:56:43,455][60143] Updated weights for policy 0, policy_version 91442 (0.0007) +[2023-10-09 07:56:43,820][60143] Updated weights for policy 0, policy_version 91452 (0.0011) +[2023-10-09 07:56:44,451][60144] Updated weights for policy 1, policy_version 92482 (0.0008) +[2023-10-09 07:56:44,816][60144] Updated weights for policy 1, policy_version 92492 (0.0007) +[2023-10-09 07:56:45,198][60144] Updated weights for policy 1, policy_version 92502 (0.0008) +[2023-10-09 07:56:45,556][60144] Updated weights for policy 1, policy_version 92512 (0.0010) +[2023-10-09 07:56:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 188383232. Throughput: 0: 1710.5, 1: 1720.2. Samples: 47103414. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:56:46,053][59242] Avg episode reward: [(0, '35.550'), (1, '33.450')] +[2023-10-09 07:56:47,806][60143] Updated weights for policy 0, policy_version 91462 (0.0010) +[2023-10-09 07:56:48,176][60143] Updated weights for policy 0, policy_version 91472 (0.0010) +[2023-10-09 07:56:48,550][60143] Updated weights for policy 0, policy_version 91482 (0.0011) +[2023-10-09 07:56:49,431][60144] Updated weights for policy 1, policy_version 92522 (0.0007) +[2023-10-09 07:56:49,805][60144] Updated weights for policy 1, policy_version 92532 (0.0009) +[2023-10-09 07:56:50,175][60144] Updated weights for policy 1, policy_version 92542 (0.0009) +[2023-10-09 07:56:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 188448768. Throughput: 0: 1699.5, 1: 1752.7. Samples: 47114398. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:56:51,052][59242] Avg episode reward: [(0, '36.170'), (1, '32.420')] +[2023-10-09 07:56:52,415][60143] Updated weights for policy 0, policy_version 91492 (0.0009) +[2023-10-09 07:56:52,795][60143] Updated weights for policy 0, policy_version 91502 (0.0009) +[2023-10-09 07:56:53,160][60143] Updated weights for policy 0, policy_version 91512 (0.0009) +[2023-10-09 07:56:54,117][60144] Updated weights for policy 1, policy_version 92552 (0.0011) +[2023-10-09 07:56:54,481][60144] Updated weights for policy 1, policy_version 92562 (0.0009) +[2023-10-09 07:56:54,849][60144] Updated weights for policy 1, policy_version 92572 (0.0008) +[2023-10-09 07:56:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 188514304. Throughput: 0: 1699.5, 1: 1733.4. Samples: 47134604. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:56:56,053][59242] Avg episode reward: [(0, '36.320'), (1, '31.400')] +[2023-10-09 07:56:57,155][60143] Updated weights for policy 0, policy_version 91522 (0.0008) +[2023-10-09 07:56:57,528][60143] Updated weights for policy 0, policy_version 91532 (0.0007) +[2023-10-09 07:56:57,897][60143] Updated weights for policy 0, policy_version 91542 (0.0007) +[2023-10-09 07:56:58,262][60143] Updated weights for policy 0, policy_version 91552 (0.0009) +[2023-10-09 07:56:58,651][60144] Updated weights for policy 1, policy_version 92582 (0.0009) +[2023-10-09 07:56:59,014][60144] Updated weights for policy 1, policy_version 92592 (0.0009) +[2023-10-09 07:56:59,378][60144] Updated weights for policy 1, policy_version 92602 (0.0008) +[2023-10-09 07:57:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 188579840. Throughput: 0: 1726.4, 1: 1718.1. Samples: 47155516. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:01,053][59242] Avg episode reward: [(0, '35.690'), (1, '30.120')] +[2023-10-09 07:57:02,283][60143] Updated weights for policy 0, policy_version 91562 (0.0010) +[2023-10-09 07:57:02,647][60143] Updated weights for policy 0, policy_version 91572 (0.0008) +[2023-10-09 07:57:03,023][60143] Updated weights for policy 0, policy_version 91582 (0.0008) +[2023-10-09 07:57:03,358][60144] Updated weights for policy 1, policy_version 92612 (0.0008) +[2023-10-09 07:57:03,717][60144] Updated weights for policy 1, policy_version 92622 (0.0009) +[2023-10-09 07:57:04,085][60144] Updated weights for policy 1, policy_version 92632 (0.0010) +[2023-10-09 07:57:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 188645376. Throughput: 0: 1695.1, 1: 1741.1. Samples: 47165738. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:06,053][59242] Avg episode reward: [(0, '37.810'), (1, '30.740')] +[2023-10-09 07:57:07,070][60143] Updated weights for policy 0, policy_version 91592 (0.0009) +[2023-10-09 07:57:07,446][60143] Updated weights for policy 0, policy_version 91602 (0.0011) +[2023-10-09 07:57:07,812][60143] Updated weights for policy 0, policy_version 91612 (0.0011) +[2023-10-09 07:57:08,186][60144] Updated weights for policy 1, policy_version 92642 (0.0009) +[2023-10-09 07:57:08,551][60144] Updated weights for policy 1, policy_version 92652 (0.0007) +[2023-10-09 07:57:08,914][60144] Updated weights for policy 1, policy_version 92662 (0.0007) +[2023-10-09 07:57:09,284][60144] Updated weights for policy 1, policy_version 92672 (0.0008) +[2023-10-09 07:57:11,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 188710912. Throughput: 0: 1720.3, 1: 1712.4. Samples: 47186112. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:11,053][59242] Avg episode reward: [(0, '37.310'), (1, '32.100')] +[2023-10-09 07:57:11,760][60143] Updated weights for policy 0, policy_version 91622 (0.0009) +[2023-10-09 07:57:12,131][60143] Updated weights for policy 0, policy_version 91632 (0.0010) +[2023-10-09 07:57:12,505][60143] Updated weights for policy 0, policy_version 91642 (0.0010) +[2023-10-09 07:57:13,200][60144] Updated weights for policy 1, policy_version 92682 (0.0011) +[2023-10-09 07:57:13,570][60144] Updated weights for policy 1, policy_version 92692 (0.0010) +[2023-10-09 07:57:13,940][60144] Updated weights for policy 1, policy_version 92702 (0.0009) +[2023-10-09 07:57:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 188776448. Throughput: 0: 1728.3, 1: 1725.1. Samples: 47207560. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:16,052][59242] Avg episode reward: [(0, '37.670'), (1, '31.500')] +[2023-10-09 07:57:16,497][60143] Updated weights for policy 0, policy_version 91652 (0.0008) +[2023-10-09 07:57:16,861][60143] Updated weights for policy 0, policy_version 91662 (0.0009) +[2023-10-09 07:57:17,234][60143] Updated weights for policy 0, policy_version 91672 (0.0008) +[2023-10-09 07:57:17,623][60144] Updated weights for policy 1, policy_version 92712 (0.0009) +[2023-10-09 07:57:17,988][60144] Updated weights for policy 1, policy_version 92722 (0.0008) +[2023-10-09 07:57:18,352][60144] Updated weights for policy 1, policy_version 92732 (0.0009) +[2023-10-09 07:57:21,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 188841984. Throughput: 0: 1705.9, 1: 1712.0. Samples: 47217118. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:21,053][59242] Avg episode reward: [(0, '36.100'), (1, '31.350')] +[2023-10-09 07:57:21,152][60143] Updated weights for policy 0, policy_version 91682 (0.0008) +[2023-10-09 07:57:21,528][60143] Updated weights for policy 0, policy_version 91692 (0.0007) +[2023-10-09 07:57:21,899][60143] Updated weights for policy 0, policy_version 91702 (0.0009) +[2023-10-09 07:57:22,264][60143] Updated weights for policy 0, policy_version 91712 (0.0009) +[2023-10-09 07:57:22,354][60144] Updated weights for policy 1, policy_version 92742 (0.0009) +[2023-10-09 07:57:22,730][60144] Updated weights for policy 1, policy_version 92752 (0.0007) +[2023-10-09 07:57:23,098][60144] Updated weights for policy 1, policy_version 92762 (0.0008) +[2023-10-09 07:57:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 188907520. Throughput: 0: 1724.8, 1: 1719.0. Samples: 47238314. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:26,052][59242] Avg episode reward: [(0, '36.730'), (1, '31.760')] +[2023-10-09 07:57:26,210][60143] Updated weights for policy 0, policy_version 91722 (0.0010) +[2023-10-09 07:57:26,589][60143] Updated weights for policy 0, policy_version 91732 (0.0010) +[2023-10-09 07:57:26,956][60143] Updated weights for policy 0, policy_version 91742 (0.0008) +[2023-10-09 07:57:27,049][60144] Updated weights for policy 1, policy_version 92772 (0.0007) +[2023-10-09 07:57:27,419][60144] Updated weights for policy 1, policy_version 92782 (0.0008) +[2023-10-09 07:57:27,788][60144] Updated weights for policy 1, policy_version 92792 (0.0008) +[2023-10-09 07:57:30,966][60143] Updated weights for policy 0, policy_version 91752 (0.0007) +[2023-10-09 07:57:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 188973056. Throughput: 0: 1722.9, 1: 1745.3. Samples: 47259484. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:31,053][59242] Avg episode reward: [(0, '37.460'), (1, '31.160')] +[2023-10-09 07:57:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000092800_95027200.pth... +[2023-10-09 07:57:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000091200_93388800.pth +[2023-10-09 07:57:31,333][60143] Updated weights for policy 0, policy_version 91762 (0.0008) +[2023-10-09 07:57:31,714][60143] Updated weights for policy 0, policy_version 91772 (0.0008) +[2023-10-09 07:57:31,782][60144] Updated weights for policy 1, policy_version 92802 (0.0008) +[2023-10-09 07:57:31,853][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000091776_93978624.pth... 
+[2023-10-09 07:57:31,882][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000090176_92340224.pth +[2023-10-09 07:57:32,156][60144] Updated weights for policy 1, policy_version 92812 (0.0010) +[2023-10-09 07:57:32,516][60144] Updated weights for policy 1, policy_version 92822 (0.0010) +[2023-10-09 07:57:32,876][60144] Updated weights for policy 1, policy_version 92832 (0.0007) +[2023-10-09 07:57:35,513][60143] Updated weights for policy 0, policy_version 91782 (0.0009) +[2023-10-09 07:57:35,892][60143] Updated weights for policy 0, policy_version 91792 (0.0010) +[2023-10-09 07:57:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 189038592. Throughput: 0: 1718.3, 1: 1712.8. Samples: 47268798. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) +[2023-10-09 07:57:36,053][59242] Avg episode reward: [(0, '39.470'), (1, '32.180')] +[2023-10-09 07:57:36,265][60143] Updated weights for policy 0, policy_version 91802 (0.0011) +[2023-10-09 07:57:36,743][60144] Updated weights for policy 1, policy_version 92842 (0.0008) +[2023-10-09 07:57:37,119][60144] Updated weights for policy 1, policy_version 92852 (0.0008) +[2023-10-09 07:57:37,485][60144] Updated weights for policy 1, policy_version 92862 (0.0008) +[2023-10-09 07:57:40,234][60143] Updated weights for policy 0, policy_version 91812 (0.0007) +[2023-10-09 07:57:40,606][60143] Updated weights for policy 0, policy_version 91822 (0.0011) +[2023-10-09 07:57:40,978][60143] Updated weights for policy 0, policy_version 91832 (0.0009) +[2023-10-09 07:57:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 189104128. Throughput: 0: 1728.1, 1: 1729.5. Samples: 47290194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:57:41,053][59242] Avg episode reward: [(0, '39.070'), (1, '32.760')] +[2023-10-09 07:57:41,461][60144] Updated weights for policy 1, policy_version 92872 (0.0010) +[2023-10-09 07:57:41,828][60144] Updated weights for policy 1, policy_version 92882 (0.0009) +[2023-10-09 07:57:42,201][60144] Updated weights for policy 1, policy_version 92892 (0.0009) +[2023-10-09 07:57:45,268][60143] Updated weights for policy 0, policy_version 91842 (0.0007) +[2023-10-09 07:57:45,642][60143] Updated weights for policy 0, policy_version 91852 (0.0007) +[2023-10-09 07:57:46,021][60143] Updated weights for policy 0, policy_version 91862 (0.0008) +[2023-10-09 07:57:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 189169664. Throughput: 0: 1714.5, 1: 1740.2. Samples: 47310978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:57:46,053][59242] Avg episode reward: [(0, '38.820'), (1, '33.040')] +[2023-10-09 07:57:46,077][60144] Updated weights for policy 1, policy_version 92902 (0.0008) +[2023-10-09 07:57:46,385][60143] Updated weights for policy 0, policy_version 91872 (0.0008) +[2023-10-09 07:57:46,453][60144] Updated weights for policy 1, policy_version 92912 (0.0007) +[2023-10-09 07:57:46,827][60144] Updated weights for policy 1, policy_version 92922 (0.0008) +[2023-10-09 07:57:50,483][60143] Updated weights for policy 0, policy_version 91882 (0.0009) +[2023-10-09 07:57:50,842][60143] Updated weights for policy 0, policy_version 91892 (0.0010) +[2023-10-09 07:57:50,864][60144] Updated weights for policy 1, policy_version 92932 (0.0009) +[2023-10-09 07:57:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 189235200. Throughput: 0: 1721.7, 1: 1718.0. Samples: 47320526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:57:51,052][59242] Avg episode reward: [(0, '39.670'), (1, '34.930')] +[2023-10-09 07:57:51,213][60143] Updated weights for policy 0, policy_version 91902 (0.0011) +[2023-10-09 07:57:51,235][60144] Updated weights for policy 1, policy_version 92942 (0.0007) +[2023-10-09 07:57:51,596][60144] Updated weights for policy 1, policy_version 92952 (0.0009) +[2023-10-09 07:57:55,183][60143] Updated weights for policy 0, policy_version 91912 (0.0009) +[2023-10-09 07:57:55,563][60143] Updated weights for policy 0, policy_version 91922 (0.0009) +[2023-10-09 07:57:55,589][60144] Updated weights for policy 1, policy_version 92962 (0.0010) +[2023-10-09 07:57:55,926][60143] Updated weights for policy 0, policy_version 91932 (0.0008) +[2023-10-09 07:57:55,954][60144] Updated weights for policy 1, policy_version 92972 (0.0009) +[2023-10-09 07:57:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 189300736. Throughput: 0: 1720.8, 1: 1737.2. Samples: 47341720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:57:56,052][59242] Avg episode reward: [(0, '39.120'), (1, '33.700')] +[2023-10-09 07:57:56,326][60144] Updated weights for policy 1, policy_version 92982 (0.0009) +[2023-10-09 07:57:56,687][60144] Updated weights for policy 1, policy_version 92992 (0.0008) +[2023-10-09 07:57:59,857][60143] Updated weights for policy 0, policy_version 91942 (0.0007) +[2023-10-09 07:58:00,233][60143] Updated weights for policy 0, policy_version 91952 (0.0007) +[2023-10-09 07:58:00,599][60143] Updated weights for policy 0, policy_version 91962 (0.0008) +[2023-10-09 07:58:00,604][60144] Updated weights for policy 1, policy_version 93002 (0.0009) +[2023-10-09 07:58:00,968][60144] Updated weights for policy 1, policy_version 93012 (0.0007) +[2023-10-09 07:58:01,052][59242] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 189399040. 
Throughput: 0: 1699.0, 1: 1725.1. Samples: 47361642. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:01,053][59242] Avg episode reward: [(0, '38.680'), (1, '33.310')] +[2023-10-09 07:58:01,339][60144] Updated weights for policy 1, policy_version 93022 (0.0007) +[2023-10-09 07:58:04,384][60143] Updated weights for policy 0, policy_version 91972 (0.0010) +[2023-10-09 07:58:04,758][60143] Updated weights for policy 0, policy_version 91982 (0.0010) +[2023-10-09 07:58:05,125][60143] Updated weights for policy 0, policy_version 91992 (0.0008) +[2023-10-09 07:58:05,306][60144] Updated weights for policy 1, policy_version 93032 (0.0008) +[2023-10-09 07:58:05,672][60144] Updated weights for policy 1, policy_version 93042 (0.0009) +[2023-10-09 07:58:06,046][60144] Updated weights for policy 1, policy_version 93052 (0.0009) +[2023-10-09 07:58:06,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 189464576. Throughput: 0: 1719.5, 1: 1727.1. Samples: 47372214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:06,053][59242] Avg episode reward: [(0, '37.080'), (1, '34.210')] +[2023-10-09 07:58:08,999][60143] Updated weights for policy 0, policy_version 92002 (0.0009) +[2023-10-09 07:58:09,369][60143] Updated weights for policy 0, policy_version 92012 (0.0007) +[2023-10-09 07:58:09,733][60143] Updated weights for policy 0, policy_version 92022 (0.0009) +[2023-10-09 07:58:10,048][60144] Updated weights for policy 1, policy_version 93062 (0.0009) +[2023-10-09 07:58:10,103][60143] Updated weights for policy 0, policy_version 92032 (0.0007) +[2023-10-09 07:58:10,419][60144] Updated weights for policy 1, policy_version 93072 (0.0009) +[2023-10-09 07:58:10,780][60144] Updated weights for policy 1, policy_version 93082 (0.0008) +[2023-10-09 07:58:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 189562880. Throughput: 0: 1706.3, 1: 1724.8. Samples: 47392718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:11,053][59242] Avg episode reward: [(0, '37.020'), (1, '33.660')] +[2023-10-09 07:58:13,966][60143] Updated weights for policy 0, policy_version 92042 (0.0009) +[2023-10-09 07:58:14,343][60143] Updated weights for policy 0, policy_version 92052 (0.0008) +[2023-10-09 07:58:14,710][60143] Updated weights for policy 0, policy_version 92062 (0.0007) +[2023-10-09 07:58:14,767][60144] Updated weights for policy 1, policy_version 93092 (0.0007) +[2023-10-09 07:58:15,133][60144] Updated weights for policy 1, policy_version 93102 (0.0010) +[2023-10-09 07:58:15,504][60144] Updated weights for policy 1, policy_version 93112 (0.0008) +[2023-10-09 07:58:16,052][59242] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 189628416. Throughput: 0: 1698.0, 1: 1702.9. Samples: 47412528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:16,053][59242] Avg episode reward: [(0, '36.440'), (1, '33.800')] +[2023-10-09 07:58:18,654][60143] Updated weights for policy 0, policy_version 92072 (0.0010) +[2023-10-09 07:58:19,030][60143] Updated weights for policy 0, policy_version 92082 (0.0009) +[2023-10-09 07:58:19,121][60144] Updated weights for policy 1, policy_version 93122 (0.0008) +[2023-10-09 07:58:19,387][60143] Updated weights for policy 0, policy_version 92092 (0.0010) +[2023-10-09 07:58:19,485][60144] Updated weights for policy 1, policy_version 93132 (0.0008) +[2023-10-09 07:58:19,858][60144] Updated weights for policy 1, policy_version 93142 (0.0008) +[2023-10-09 07:58:20,224][60144] Updated weights for policy 1, policy_version 93152 (0.0008) +[2023-10-09 07:58:21,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 189693952. Throughput: 0: 1722.2, 1: 1731.8. Samples: 47424228. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:21,052][59242] Avg episode reward: [(0, '37.340'), (1, '33.730')] +[2023-10-09 07:58:23,349][60143] Updated weights for policy 0, policy_version 92102 (0.0009) +[2023-10-09 07:58:23,729][60143] Updated weights for policy 0, policy_version 92112 (0.0007) +[2023-10-09 07:58:24,106][60143] Updated weights for policy 0, policy_version 92122 (0.0008) +[2023-10-09 07:58:24,117][60144] Updated weights for policy 1, policy_version 93162 (0.0009) +[2023-10-09 07:58:24,484][60144] Updated weights for policy 1, policy_version 93172 (0.0010) +[2023-10-09 07:58:24,843][60144] Updated weights for policy 1, policy_version 93182 (0.0007) +[2023-10-09 07:58:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13884.8). Total num frames: 189759488. Throughput: 0: 1689.7, 1: 1717.2. Samples: 47443506. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:26,053][59242] Avg episode reward: [(0, '37.660'), (1, '33.400')] +[2023-10-09 07:58:28,270][60143] Updated weights for policy 0, policy_version 92132 (0.0008) +[2023-10-09 07:58:28,641][60143] Updated weights for policy 0, policy_version 92142 (0.0008) +[2023-10-09 07:58:28,886][60144] Updated weights for policy 1, policy_version 93192 (0.0008) +[2023-10-09 07:58:28,999][60143] Updated weights for policy 0, policy_version 92152 (0.0008) +[2023-10-09 07:58:29,252][60144] Updated weights for policy 1, policy_version 93202 (0.0007) +[2023-10-09 07:58:29,610][60144] Updated weights for policy 1, policy_version 93212 (0.0007) +[2023-10-09 07:58:31,053][59242] Fps is (10 sec: 13106.5, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 189825024. Throughput: 0: 1697.4, 1: 1709.0. Samples: 47464270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:31,054][59242] Avg episode reward: [(0, '37.100'), (1, '33.210')] +[2023-10-09 07:58:33,029][60143] Updated weights for policy 0, policy_version 92162 (0.0008) +[2023-10-09 07:58:33,401][60143] Updated weights for policy 0, policy_version 92172 (0.0007) +[2023-10-09 07:58:33,510][60144] Updated weights for policy 1, policy_version 93222 (0.0007) +[2023-10-09 07:58:33,767][60143] Updated weights for policy 0, policy_version 92182 (0.0008) +[2023-10-09 07:58:33,877][60144] Updated weights for policy 1, policy_version 93232 (0.0008) +[2023-10-09 07:58:34,141][60143] Updated weights for policy 0, policy_version 92192 (0.0010) +[2023-10-09 07:58:34,240][60144] Updated weights for policy 1, policy_version 93242 (0.0007) +[2023-10-09 07:58:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 189890560. Throughput: 0: 1710.5, 1: 1733.1. Samples: 47475492. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:36,053][59242] Avg episode reward: [(0, '37.890'), (1, '33.620')] +[2023-10-09 07:58:38,178][60143] Updated weights for policy 0, policy_version 92202 (0.0007) +[2023-10-09 07:58:38,185][60144] Updated weights for policy 1, policy_version 93252 (0.0007) +[2023-10-09 07:58:38,549][60143] Updated weights for policy 0, policy_version 92212 (0.0007) +[2023-10-09 07:58:38,554][60144] Updated weights for policy 1, policy_version 93262 (0.0008) +[2023-10-09 07:58:38,924][60144] Updated weights for policy 1, policy_version 93272 (0.0008) +[2023-10-09 07:58:38,925][60143] Updated weights for policy 0, policy_version 92222 (0.0009) +[2023-10-09 07:58:41,052][59242] Fps is (10 sec: 13107.8, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 189956096. Throughput: 0: 1691.8, 1: 1717.0. Samples: 47495116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:41,053][59242] Avg episode reward: [(0, '36.840'), (1, '34.820')] +[2023-10-09 07:58:42,757][60143] Updated weights for policy 0, policy_version 92232 (0.0009) +[2023-10-09 07:58:43,097][60144] Updated weights for policy 1, policy_version 93282 (0.0009) +[2023-10-09 07:58:43,124][60143] Updated weights for policy 0, policy_version 92242 (0.0009) +[2023-10-09 07:58:43,499][60143] Updated weights for policy 0, policy_version 92252 (0.0008) +[2023-10-09 07:58:43,506][60144] Updated weights for policy 1, policy_version 93292 (0.0007) +[2023-10-09 07:58:43,875][60144] Updated weights for policy 1, policy_version 93302 (0.0007) +[2023-10-09 07:58:44,239][60144] Updated weights for policy 1, policy_version 93312 (0.0008) +[2023-10-09 07:58:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 190021632. Throughput: 0: 1718.8, 1: 1719.4. Samples: 47516362. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:46,053][59242] Avg episode reward: [(0, '37.380'), (1, '35.630')] +[2023-10-09 07:58:47,398][60143] Updated weights for policy 0, policy_version 92262 (0.0009) +[2023-10-09 07:58:47,775][60143] Updated weights for policy 0, policy_version 92272 (0.0010) +[2023-10-09 07:58:48,033][60144] Updated weights for policy 1, policy_version 93322 (0.0009) +[2023-10-09 07:58:48,137][60143] Updated weights for policy 0, policy_version 92282 (0.0009) +[2023-10-09 07:58:48,389][60144] Updated weights for policy 1, policy_version 93332 (0.0009) +[2023-10-09 07:58:48,760][60144] Updated weights for policy 1, policy_version 93342 (0.0011) +[2023-10-09 07:58:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 190087168. Throughput: 0: 1695.5, 1: 1722.3. Samples: 47526014. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:51,053][59242] Avg episode reward: [(0, '35.800'), (1, '36.100')] +[2023-10-09 07:58:52,231][60143] Updated weights for policy 0, policy_version 92292 (0.0007) +[2023-10-09 07:58:52,605][60143] Updated weights for policy 0, policy_version 92302 (0.0009) +[2023-10-09 07:58:52,743][60144] Updated weights for policy 1, policy_version 93352 (0.0009) +[2023-10-09 07:58:52,972][60143] Updated weights for policy 0, policy_version 92312 (0.0008) +[2023-10-09 07:58:53,102][60144] Updated weights for policy 1, policy_version 93362 (0.0008) +[2023-10-09 07:58:53,474][60144] Updated weights for policy 1, policy_version 93372 (0.0008) +[2023-10-09 07:58:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 190152704. Throughput: 0: 1705.3, 1: 1721.6. Samples: 47546932. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:58:56,053][59242] Avg episode reward: [(0, '35.900'), (1, '36.520')] +[2023-10-09 07:58:56,902][60143] Updated weights for policy 0, policy_version 92322 (0.0008) +[2023-10-09 07:58:57,278][60143] Updated weights for policy 0, policy_version 92332 (0.0009) +[2023-10-09 07:58:57,368][60144] Updated weights for policy 1, policy_version 93382 (0.0007) +[2023-10-09 07:58:57,640][60143] Updated weights for policy 0, policy_version 92342 (0.0009) +[2023-10-09 07:58:57,735][60144] Updated weights for policy 1, policy_version 93392 (0.0007) +[2023-10-09 07:58:58,011][60143] Updated weights for policy 0, policy_version 92352 (0.0008) +[2023-10-09 07:58:58,107][60144] Updated weights for policy 1, policy_version 93402 (0.0009) +[2023-10-09 07:59:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 190218240. Throughput: 0: 1711.2, 1: 1744.5. Samples: 47568038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:01,053][59242] Avg episode reward: [(0, '35.990'), (1, '36.350')] +[2023-10-09 07:59:02,079][60144] Updated weights for policy 1, policy_version 93412 (0.0009) +[2023-10-09 07:59:02,156][60143] Updated weights for policy 0, policy_version 92362 (0.0007) +[2023-10-09 07:59:02,447][60144] Updated weights for policy 1, policy_version 93422 (0.0007) +[2023-10-09 07:59:02,522][60143] Updated weights for policy 0, policy_version 92372 (0.0007) +[2023-10-09 07:59:02,814][60144] Updated weights for policy 1, policy_version 93432 (0.0008) +[2023-10-09 07:59:02,899][60143] Updated weights for policy 0, policy_version 92382 (0.0007) +[2023-10-09 07:59:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 190283776. Throughput: 0: 1683.6, 1: 1717.2. Samples: 47577266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:06,052][59242] Avg episode reward: [(0, '35.310'), (1, '34.840')] +[2023-10-09 07:59:06,655][60144] Updated weights for policy 1, policy_version 93442 (0.0008) +[2023-10-09 07:59:06,996][60143] Updated weights for policy 0, policy_version 92392 (0.0008) +[2023-10-09 07:59:07,016][60144] Updated weights for policy 1, policy_version 93452 (0.0007) +[2023-10-09 07:59:07,375][60143] Updated weights for policy 0, policy_version 92402 (0.0009) +[2023-10-09 07:59:07,379][60144] Updated weights for policy 1, policy_version 93462 (0.0007) +[2023-10-09 07:59:07,745][60144] Updated weights for policy 1, policy_version 93472 (0.0007) +[2023-10-09 07:59:07,749][60143] Updated weights for policy 0, policy_version 92412 (0.0008) +[2023-10-09 07:59:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190349312. Throughput: 0: 1709.5, 1: 1740.0. Samples: 47598732. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:11,053][59242] Avg episode reward: [(0, '35.050'), (1, '33.690')] +[2023-10-09 07:59:11,569][60144] Updated weights for policy 1, policy_version 93482 (0.0008) +[2023-10-09 07:59:11,587][60143] Updated weights for policy 0, policy_version 92422 (0.0008) +[2023-10-09 07:59:11,924][60144] Updated weights for policy 1, policy_version 93492 (0.0010) +[2023-10-09 07:59:11,960][60143] Updated weights for policy 0, policy_version 92432 (0.0009) +[2023-10-09 07:59:12,296][60144] Updated weights for policy 1, policy_version 93502 (0.0009) +[2023-10-09 07:59:12,331][60143] Updated weights for policy 0, policy_version 92442 (0.0007) +[2023-10-09 07:59:16,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190414848. Throughput: 0: 1714.5, 1: 1744.5. Samples: 47619928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:16,053][59242] Avg episode reward: [(0, '35.300'), (1, '34.590')] +[2023-10-09 07:59:16,130][60143] Updated weights for policy 0, policy_version 92452 (0.0008) +[2023-10-09 07:59:16,280][60144] Updated weights for policy 1, policy_version 93512 (0.0008) +[2023-10-09 07:59:16,497][60143] Updated weights for policy 0, policy_version 92462 (0.0009) +[2023-10-09 07:59:16,643][60144] Updated weights for policy 1, policy_version 93522 (0.0008) +[2023-10-09 07:59:16,874][60143] Updated weights for policy 0, policy_version 92472 (0.0008) +[2023-10-09 07:59:17,010][60144] Updated weights for policy 1, policy_version 93532 (0.0009) +[2023-10-09 07:59:20,963][60143] Updated weights for policy 0, policy_version 92482 (0.0009) +[2023-10-09 07:59:21,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190480384. Throughput: 0: 1692.9, 1: 1718.3. Samples: 47628996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:21,052][59242] Avg episode reward: [(0, '35.900'), (1, '35.880')] +[2023-10-09 07:59:21,105][60144] Updated weights for policy 1, policy_version 93542 (0.0008) +[2023-10-09 07:59:21,335][60143] Updated weights for policy 0, policy_version 92492 (0.0007) +[2023-10-09 07:59:21,469][60144] Updated weights for policy 1, policy_version 93552 (0.0008) +[2023-10-09 07:59:21,695][60143] Updated weights for policy 0, policy_version 92502 (0.0008) +[2023-10-09 07:59:21,836][60144] Updated weights for policy 1, policy_version 93562 (0.0007) +[2023-10-09 07:59:22,063][60143] Updated weights for policy 0, policy_version 92512 (0.0007) +[2023-10-09 07:59:25,724][60144] Updated weights for policy 1, policy_version 93572 (0.0010) +[2023-10-09 07:59:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190545920. Throughput: 0: 1712.4, 1: 1739.0. Samples: 47650428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:26,053][59242] Avg episode reward: [(0, '36.490'), (1, '34.790')] +[2023-10-09 07:59:26,092][60144] Updated weights for policy 1, policy_version 93582 (0.0007) +[2023-10-09 07:59:26,125][60143] Updated weights for policy 0, policy_version 92522 (0.0008) +[2023-10-09 07:59:26,467][60144] Updated weights for policy 1, policy_version 93592 (0.0007) +[2023-10-09 07:59:26,501][60143] Updated weights for policy 0, policy_version 92532 (0.0008) +[2023-10-09 07:59:26,868][60143] Updated weights for policy 0, policy_version 92542 (0.0009) +[2023-10-09 07:59:30,551][60144] Updated weights for policy 1, policy_version 93602 (0.0008) +[2023-10-09 07:59:30,888][60143] Updated weights for policy 0, policy_version 92552 (0.0008) +[2023-10-09 07:59:30,968][60144] Updated weights for policy 1, policy_version 93612 (0.0008) +[2023-10-09 07:59:31,052][59242] Fps is (10 sec: 13106.6, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190611456. 
Throughput: 0: 1706.5, 1: 1738.9. Samples: 47671404. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:31,054][59242] Avg episode reward: [(0, '36.870'), (1, '34.310')] +[2023-10-09 07:59:31,260][60143] Updated weights for policy 0, policy_version 92562 (0.0008) +[2023-10-09 07:59:31,340][60144] Updated weights for policy 1, policy_version 93622 (0.0008) +[2023-10-09 07:59:31,623][60143] Updated weights for policy 0, policy_version 92572 (0.0007) +[2023-10-09 07:59:31,700][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000093632_95879168.pth... +[2023-10-09 07:59:31,703][60144] Updated weights for policy 1, policy_version 93632 (0.0007) +[2023-10-09 07:59:31,732][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000092000_94208000.pth +[2023-10-09 07:59:31,775][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000092576_94797824.pth... +[2023-10-09 07:59:31,804][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000090976_93159424.pth +[2023-10-09 07:59:35,549][60144] Updated weights for policy 1, policy_version 93642 (0.0007) +[2023-10-09 07:59:35,650][60143] Updated weights for policy 0, policy_version 92582 (0.0009) +[2023-10-09 07:59:35,911][60144] Updated weights for policy 1, policy_version 93652 (0.0009) +[2023-10-09 07:59:36,006][60143] Updated weights for policy 0, policy_version 92592 (0.0008) +[2023-10-09 07:59:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 190676992. Throughput: 0: 1711.2, 1: 1728.2. Samples: 47680784. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:36,053][59242] Avg episode reward: [(0, '36.340'), (1, '34.440')] +[2023-10-09 07:59:36,281][60144] Updated weights for policy 1, policy_version 93662 (0.0009) +[2023-10-09 07:59:36,383][60143] Updated weights for policy 0, policy_version 92602 (0.0007) +[2023-10-09 07:59:40,077][60144] Updated weights for policy 1, policy_version 93672 (0.0009) +[2023-10-09 07:59:40,349][60143] Updated weights for policy 0, policy_version 92612 (0.0008) +[2023-10-09 07:59:40,441][60144] Updated weights for policy 1, policy_version 93682 (0.0009) +[2023-10-09 07:59:40,719][60143] Updated weights for policy 0, policy_version 92622 (0.0009) +[2023-10-09 07:59:40,814][60144] Updated weights for policy 1, policy_version 93692 (0.0008) +[2023-10-09 07:59:41,052][59242] Fps is (10 sec: 16384.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 190775296. Throughput: 0: 1713.1, 1: 1732.3. Samples: 47701972. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:41,053][59242] Avg episode reward: [(0, '37.770'), (1, '35.260')] +[2023-10-09 07:59:41,090][60143] Updated weights for policy 0, policy_version 92632 (0.0009) +[2023-10-09 07:59:44,878][60144] Updated weights for policy 1, policy_version 93702 (0.0007) +[2023-10-09 07:59:45,149][60143] Updated weights for policy 0, policy_version 92642 (0.0008) +[2023-10-09 07:59:45,253][60144] Updated weights for policy 1, policy_version 93712 (0.0007) +[2023-10-09 07:59:45,522][60143] Updated weights for policy 0, policy_version 92652 (0.0008) +[2023-10-09 07:59:45,620][60144] Updated weights for policy 1, policy_version 93722 (0.0008) +[2023-10-09 07:59:45,879][60143] Updated weights for policy 0, policy_version 92662 (0.0009) +[2023-10-09 07:59:46,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 190840832. Throughput: 0: 1707.7, 1: 1710.4. Samples: 47721854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:46,053][59242] Avg episode reward: [(0, '36.750'), (1, '34.570')] +[2023-10-09 07:59:46,258][60143] Updated weights for policy 0, policy_version 92672 (0.0007) +[2023-10-09 07:59:49,577][60144] Updated weights for policy 1, policy_version 93732 (0.0007) +[2023-10-09 07:59:49,946][60144] Updated weights for policy 1, policy_version 93742 (0.0008) +[2023-10-09 07:59:50,313][60144] Updated weights for policy 1, policy_version 93752 (0.0008) +[2023-10-09 07:59:50,323][60143] Updated weights for policy 0, policy_version 92682 (0.0008) +[2023-10-09 07:59:50,695][60143] Updated weights for policy 0, policy_version 92692 (0.0008) +[2023-10-09 07:59:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 190906368. Throughput: 0: 1717.2, 1: 1734.5. Samples: 47732592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:51,053][59242] Avg episode reward: [(0, '37.230'), (1, '33.290')] +[2023-10-09 07:59:51,059][60143] Updated weights for policy 0, policy_version 92702 (0.0007) +[2023-10-09 07:59:54,229][60144] Updated weights for policy 1, policy_version 93762 (0.0007) +[2023-10-09 07:59:54,600][60144] Updated weights for policy 1, policy_version 93772 (0.0009) +[2023-10-09 07:59:54,972][60144] Updated weights for policy 1, policy_version 93782 (0.0008) +[2023-10-09 07:59:55,073][60143] Updated weights for policy 0, policy_version 92712 (0.0008) +[2023-10-09 07:59:55,335][60144] Updated weights for policy 1, policy_version 93792 (0.0007) +[2023-10-09 07:59:55,446][60143] Updated weights for policy 0, policy_version 92722 (0.0008) +[2023-10-09 07:59:55,815][60143] Updated weights for policy 0, policy_version 92732 (0.0010) +[2023-10-09 07:59:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 191004672. Throughput: 0: 1716.5, 1: 1717.7. Samples: 47753272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 07:59:56,053][59242] Avg episode reward: [(0, '34.870'), (1, '33.910')] +[2023-10-09 07:59:59,178][60144] Updated weights for policy 1, policy_version 93802 (0.0007) +[2023-10-09 07:59:59,550][60144] Updated weights for policy 1, policy_version 93812 (0.0008) +[2023-10-09 07:59:59,730][60143] Updated weights for policy 0, policy_version 92742 (0.0007) +[2023-10-09 07:59:59,917][60144] Updated weights for policy 1, policy_version 93822 (0.0010) +[2023-10-09 08:00:00,107][60143] Updated weights for policy 0, policy_version 92752 (0.0009) +[2023-10-09 08:00:00,478][60143] Updated weights for policy 0, policy_version 92762 (0.0010) +[2023-10-09 08:00:01,052][59242] Fps is (10 sec: 16383.4, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 191070208. Throughput: 0: 1692.5, 1: 1700.4. Samples: 47772610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:01,053][59242] Avg episode reward: [(0, '34.870'), (1, '34.550')] +[2023-10-09 08:00:03,953][60144] Updated weights for policy 1, policy_version 93832 (0.0008) +[2023-10-09 08:00:04,315][60144] Updated weights for policy 1, policy_version 93842 (0.0009) +[2023-10-09 08:00:04,414][60143] Updated weights for policy 0, policy_version 92772 (0.0008) +[2023-10-09 08:00:04,677][60144] Updated weights for policy 1, policy_version 93852 (0.0008) +[2023-10-09 08:00:04,778][60143] Updated weights for policy 0, policy_version 92782 (0.0008) +[2023-10-09 08:00:05,141][60143] Updated weights for policy 0, policy_version 92792 (0.0008) +[2023-10-09 08:00:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 191135744. Throughput: 0: 1716.3, 1: 1730.4. Samples: 47784098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:06,053][59242] Avg episode reward: [(0, '36.440'), (1, '34.830')] +[2023-10-09 08:00:08,614][60144] Updated weights for policy 1, policy_version 93862 (0.0009) +[2023-10-09 08:00:08,985][60144] Updated weights for policy 1, policy_version 93872 (0.0007) +[2023-10-09 08:00:09,184][60143] Updated weights for policy 0, policy_version 92802 (0.0008) +[2023-10-09 08:00:09,356][60144] Updated weights for policy 1, policy_version 93882 (0.0008) +[2023-10-09 08:00:09,549][60143] Updated weights for policy 0, policy_version 92812 (0.0008) +[2023-10-09 08:00:09,922][60143] Updated weights for policy 0, policy_version 92822 (0.0007) +[2023-10-09 08:00:10,290][60143] Updated weights for policy 0, policy_version 92832 (0.0009) +[2023-10-09 08:00:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 191201280. Throughput: 0: 1700.5, 1: 1704.0. Samples: 47803628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:11,053][59242] Avg episode reward: [(0, '35.600'), (1, '33.960')] +[2023-10-09 08:00:13,362][60144] Updated weights for policy 1, policy_version 93892 (0.0007) +[2023-10-09 08:00:13,724][60144] Updated weights for policy 1, policy_version 93902 (0.0007) +[2023-10-09 08:00:14,090][60144] Updated weights for policy 1, policy_version 93912 (0.0009) +[2023-10-09 08:00:14,255][60143] Updated weights for policy 0, policy_version 92842 (0.0007) +[2023-10-09 08:00:14,630][60143] Updated weights for policy 0, policy_version 92852 (0.0007) +[2023-10-09 08:00:15,006][60143] Updated weights for policy 0, policy_version 92862 (0.0007) +[2023-10-09 08:00:16,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 191266816. Throughput: 0: 1680.5, 1: 1708.5. Samples: 47823912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:16,053][59242] Avg episode reward: [(0, '35.060'), (1, '33.090')] +[2023-10-09 08:00:18,024][60144] Updated weights for policy 1, policy_version 93922 (0.0008) +[2023-10-09 08:00:18,431][60144] Updated weights for policy 1, policy_version 93932 (0.0007) +[2023-10-09 08:00:18,786][60144] Updated weights for policy 1, policy_version 93942 (0.0007) +[2023-10-09 08:00:18,899][60143] Updated weights for policy 0, policy_version 92872 (0.0007) +[2023-10-09 08:00:19,150][60144] Updated weights for policy 1, policy_version 93952 (0.0008) +[2023-10-09 08:00:19,270][60143] Updated weights for policy 0, policy_version 92882 (0.0008) +[2023-10-09 08:00:19,641][60143] Updated weights for policy 0, policy_version 92892 (0.0009) +[2023-10-09 08:00:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 191332352. Throughput: 0: 1709.9, 1: 1719.9. Samples: 47835126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:21,053][59242] Avg episode reward: [(0, '35.050'), (1, '32.830')] +[2023-10-09 08:00:23,158][60144] Updated weights for policy 1, policy_version 93962 (0.0008) +[2023-10-09 08:00:23,532][60144] Updated weights for policy 1, policy_version 93972 (0.0009) +[2023-10-09 08:00:23,758][60143] Updated weights for policy 0, policy_version 92902 (0.0008) +[2023-10-09 08:00:23,887][60144] Updated weights for policy 1, policy_version 93982 (0.0008) +[2023-10-09 08:00:24,125][60143] Updated weights for policy 0, policy_version 92912 (0.0007) +[2023-10-09 08:00:24,498][60143] Updated weights for policy 0, policy_version 92922 (0.0007) +[2023-10-09 08:00:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 191397888. Throughput: 0: 1688.8, 1: 1704.5. Samples: 47854668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:00:26,052][59242] Avg episode reward: [(0, '34.680'), (1, '33.690')] +[2023-10-09 08:00:27,761][60144] Updated weights for policy 1, policy_version 93992 (0.0008) +[2023-10-09 08:00:28,123][60144] Updated weights for policy 1, policy_version 94002 (0.0008) +[2023-10-09 08:00:28,403][60143] Updated weights for policy 0, policy_version 92932 (0.0007) +[2023-10-09 08:00:28,495][60144] Updated weights for policy 1, policy_version 94012 (0.0008) +[2023-10-09 08:00:28,768][60143] Updated weights for policy 0, policy_version 92942 (0.0009) +[2023-10-09 08:00:29,143][60143] Updated weights for policy 0, policy_version 92952 (0.0009) +[2023-10-09 08:00:31,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 191463424. Throughput: 0: 1690.8, 1: 1729.8. Samples: 47875784. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:31,053][59242] Avg episode reward: [(0, '34.530'), (1, '32.820')] +[2023-10-09 08:00:32,372][60144] Updated weights for policy 1, policy_version 94022 (0.0008) +[2023-10-09 08:00:32,751][60144] Updated weights for policy 1, policy_version 94032 (0.0010) +[2023-10-09 08:00:33,117][60144] Updated weights for policy 1, policy_version 94042 (0.0007) +[2023-10-09 08:00:33,187][60143] Updated weights for policy 0, policy_version 92962 (0.0008) +[2023-10-09 08:00:33,554][60143] Updated weights for policy 0, policy_version 92972 (0.0010) +[2023-10-09 08:00:33,932][60143] Updated weights for policy 0, policy_version 92982 (0.0009) +[2023-10-09 08:00:34,311][60143] Updated weights for policy 0, policy_version 92992 (0.0009) +[2023-10-09 08:00:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 191528960. Throughput: 0: 1700.9, 1: 1708.0. Samples: 47885996. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:36,053][59242] Avg episode reward: [(0, '35.440'), (1, '33.040')] +[2023-10-09 08:00:37,165][60144] Updated weights for policy 1, policy_version 94052 (0.0009) +[2023-10-09 08:00:37,528][60144] Updated weights for policy 1, policy_version 94062 (0.0009) +[2023-10-09 08:00:37,895][60144] Updated weights for policy 1, policy_version 94072 (0.0008) +[2023-10-09 08:00:38,456][60143] Updated weights for policy 0, policy_version 93002 (0.0009) +[2023-10-09 08:00:38,839][60143] Updated weights for policy 0, policy_version 93012 (0.0009) +[2023-10-09 08:00:39,205][60143] Updated weights for policy 0, policy_version 93022 (0.0011) +[2023-10-09 08:00:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 191594496. Throughput: 0: 1680.6, 1: 1715.3. Samples: 47906088. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:41,053][59242] Avg episode reward: [(0, '34.730'), (1, '33.790')] +[2023-10-09 08:00:41,877][60144] Updated weights for policy 1, policy_version 94082 (0.0009) +[2023-10-09 08:00:42,249][60144] Updated weights for policy 1, policy_version 94092 (0.0009) +[2023-10-09 08:00:42,622][60144] Updated weights for policy 1, policy_version 94102 (0.0009) +[2023-10-09 08:00:42,995][60144] Updated weights for policy 1, policy_version 94112 (0.0009) +[2023-10-09 08:00:43,241][60143] Updated weights for policy 0, policy_version 93032 (0.0009) +[2023-10-09 08:00:43,608][60143] Updated weights for policy 0, policy_version 93042 (0.0010) +[2023-10-09 08:00:43,977][60143] Updated weights for policy 0, policy_version 93052 (0.0010) +[2023-10-09 08:00:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 191660032. Throughput: 0: 1702.0, 1: 1737.7. Samples: 47927394. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:46,053][59242] Avg episode reward: [(0, '34.060'), (1, '37.190')] +[2023-10-09 08:00:46,876][60144] Updated weights for policy 1, policy_version 94122 (0.0008) +[2023-10-09 08:00:47,235][60144] Updated weights for policy 1, policy_version 94132 (0.0007) +[2023-10-09 08:00:47,602][60144] Updated weights for policy 1, policy_version 94142 (0.0009) +[2023-10-09 08:00:48,021][60143] Updated weights for policy 0, policy_version 93062 (0.0010) +[2023-10-09 08:00:48,390][60143] Updated weights for policy 0, policy_version 93072 (0.0009) +[2023-10-09 08:00:48,768][60143] Updated weights for policy 0, policy_version 93082 (0.0008) +[2023-10-09 08:00:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 191725568. Throughput: 0: 1694.5, 1: 1715.2. Samples: 47937534. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:51,053][59242] Avg episode reward: [(0, '33.730'), (1, '37.650')] +[2023-10-09 08:00:51,421][60144] Updated weights for policy 1, policy_version 94152 (0.0008) +[2023-10-09 08:00:51,787][60144] Updated weights for policy 1, policy_version 94162 (0.0008) +[2023-10-09 08:00:52,153][60144] Updated weights for policy 1, policy_version 94172 (0.0010) +[2023-10-09 08:00:52,804][60143] Updated weights for policy 0, policy_version 93092 (0.0009) +[2023-10-09 08:00:53,178][60143] Updated weights for policy 0, policy_version 93102 (0.0007) +[2023-10-09 08:00:53,540][60143] Updated weights for policy 0, policy_version 93112 (0.0007) +[2023-10-09 08:00:56,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 191791104. Throughput: 0: 1693.0, 1: 1742.5. Samples: 47958224. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:00:56,053][59242] Avg episode reward: [(0, '34.290'), (1, '36.710')] +[2023-10-09 08:00:56,134][60144] Updated weights for policy 1, policy_version 94182 (0.0011) +[2023-10-09 08:00:56,502][60144] Updated weights for policy 1, policy_version 94192 (0.0009) +[2023-10-09 08:00:56,876][60144] Updated weights for policy 1, policy_version 94202 (0.0008) +[2023-10-09 08:00:57,446][60143] Updated weights for policy 0, policy_version 93122 (0.0008) +[2023-10-09 08:00:57,815][60143] Updated weights for policy 0, policy_version 93132 (0.0007) +[2023-10-09 08:00:58,188][60143] Updated weights for policy 0, policy_version 93142 (0.0010) +[2023-10-09 08:00:58,562][60143] Updated weights for policy 0, policy_version 93152 (0.0007) +[2023-10-09 08:01:00,811][60144] Updated weights for policy 1, policy_version 94212 (0.0008) +[2023-10-09 08:01:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 191856640. Throughput: 0: 1717.8, 1: 1743.3. Samples: 47979662. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:01,053][59242] Avg episode reward: [(0, '32.730'), (1, '35.030')] +[2023-10-09 08:01:01,179][60144] Updated weights for policy 1, policy_version 94222 (0.0008) +[2023-10-09 08:01:01,556][60144] Updated weights for policy 1, policy_version 94232 (0.0007) +[2023-10-09 08:01:02,318][60143] Updated weights for policy 0, policy_version 93162 (0.0007) +[2023-10-09 08:01:02,697][60143] Updated weights for policy 0, policy_version 93172 (0.0009) +[2023-10-09 08:01:03,071][60143] Updated weights for policy 0, policy_version 93182 (0.0009) +[2023-10-09 08:01:05,534][60144] Updated weights for policy 1, policy_version 94242 (0.0010) +[2023-10-09 08:01:05,945][60144] Updated weights for policy 1, policy_version 94252 (0.0008) +[2023-10-09 08:01:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 191922176. 
Throughput: 0: 1686.4, 1: 1733.4. Samples: 47989018. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:06,053][59242] Avg episode reward: [(0, '32.070'), (1, '37.300')] +[2023-10-09 08:01:06,305][60144] Updated weights for policy 1, policy_version 94262 (0.0008) +[2023-10-09 08:01:06,672][60144] Updated weights for policy 1, policy_version 94272 (0.0009) +[2023-10-09 08:01:07,134][60143] Updated weights for policy 0, policy_version 93192 (0.0009) +[2023-10-09 08:01:07,500][60143] Updated weights for policy 0, policy_version 93202 (0.0008) +[2023-10-09 08:01:07,882][60143] Updated weights for policy 0, policy_version 93212 (0.0008) +[2023-10-09 08:01:10,569][60144] Updated weights for policy 1, policy_version 94282 (0.0008) +[2023-10-09 08:01:10,944][60144] Updated weights for policy 1, policy_version 94292 (0.0010) +[2023-10-09 08:01:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 191987712. Throughput: 0: 1706.8, 1: 1746.8. Samples: 48010084. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:11,053][59242] Avg episode reward: [(0, '31.760'), (1, '36.230')] +[2023-10-09 08:01:11,312][60144] Updated weights for policy 1, policy_version 94302 (0.0008) +[2023-10-09 08:01:11,759][60143] Updated weights for policy 0, policy_version 93222 (0.0009) +[2023-10-09 08:01:12,129][60143] Updated weights for policy 0, policy_version 93232 (0.0009) +[2023-10-09 08:01:12,503][60143] Updated weights for policy 0, policy_version 93242 (0.0007) +[2023-10-09 08:01:15,203][60144] Updated weights for policy 1, policy_version 94312 (0.0007) +[2023-10-09 08:01:15,573][60144] Updated weights for policy 1, policy_version 94322 (0.0008) +[2023-10-09 08:01:15,935][60144] Updated weights for policy 1, policy_version 94332 (0.0008) +[2023-10-09 08:01:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 192053248. Throughput: 0: 1715.0, 1: 1732.1. Samples: 48030902. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:16,052][59242] Avg episode reward: [(0, '31.720'), (1, '34.780')] +[2023-10-09 08:01:16,465][60143] Updated weights for policy 0, policy_version 93252 (0.0009) +[2023-10-09 08:01:16,837][60143] Updated weights for policy 0, policy_version 93262 (0.0009) +[2023-10-09 08:01:17,209][60143] Updated weights for policy 0, policy_version 93272 (0.0010) +[2023-10-09 08:01:19,768][60144] Updated weights for policy 1, policy_version 94342 (0.0008) +[2023-10-09 08:01:20,129][60144] Updated weights for policy 1, policy_version 94352 (0.0010) +[2023-10-09 08:01:20,491][60144] Updated weights for policy 1, policy_version 94362 (0.0010) +[2023-10-09 08:01:21,052][59242] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 192151552. Throughput: 0: 1693.5, 1: 1750.0. Samples: 48040954. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:21,052][59242] Avg episode reward: [(0, '31.720'), (1, '35.240')] +[2023-10-09 08:01:21,335][60143] Updated weights for policy 0, policy_version 93282 (0.0007) +[2023-10-09 08:01:21,703][60143] Updated weights for policy 0, policy_version 93292 (0.0009) +[2023-10-09 08:01:22,070][60143] Updated weights for policy 0, policy_version 93302 (0.0007) +[2023-10-09 08:01:22,439][60143] Updated weights for policy 0, policy_version 93312 (0.0007) +[2023-10-09 08:01:24,335][60144] Updated weights for policy 1, policy_version 94372 (0.0010) +[2023-10-09 08:01:24,698][60144] Updated weights for policy 1, policy_version 94382 (0.0009) +[2023-10-09 08:01:25,059][60144] Updated weights for policy 1, policy_version 94392 (0.0008) +[2023-10-09 08:01:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 192217088. Throughput: 0: 1718.6, 1: 1747.2. Samples: 48062052. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:01:26,052][59242] Avg episode reward: [(0, '32.560'), (1, '35.930')] +[2023-10-09 08:01:26,364][60143] Updated weights for policy 0, policy_version 93322 (0.0008) +[2023-10-09 08:01:26,731][60143] Updated weights for policy 0, policy_version 93332 (0.0007) +[2023-10-09 08:01:27,099][60143] Updated weights for policy 0, policy_version 93342 (0.0011) +[2023-10-09 08:01:28,828][60144] Updated weights for policy 1, policy_version 94402 (0.0008) +[2023-10-09 08:01:29,190][60144] Updated weights for policy 1, policy_version 94412 (0.0008) +[2023-10-09 08:01:29,553][60144] Updated weights for policy 1, policy_version 94422 (0.0010) +[2023-10-09 08:01:29,925][60144] Updated weights for policy 1, policy_version 94432 (0.0009) +[2023-10-09 08:01:31,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 192282624. Throughput: 0: 1718.9, 1: 1722.2. Samples: 48082244. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:31,053][59242] Avg episode reward: [(0, '34.180'), (1, '36.260')] +[2023-10-09 08:01:31,063][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000094432_96698368.pth... +[2023-10-09 08:01:31,093][60143] Updated weights for policy 0, policy_version 93352 (0.0010) +[2023-10-09 08:01:31,094][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000092800_95027200.pth +[2023-10-09 08:01:31,461][60143] Updated weights for policy 0, policy_version 93362 (0.0007) +[2023-10-09 08:01:31,834][60143] Updated weights for policy 0, policy_version 93372 (0.0009) +[2023-10-09 08:01:31,978][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000093376_95617024.pth... 
+[2023-10-09 08:01:32,006][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000091776_93978624.pth +[2023-10-09 08:01:34,001][60144] Updated weights for policy 1, policy_version 94442 (0.0010) +[2023-10-09 08:01:34,369][60144] Updated weights for policy 1, policy_version 94452 (0.0009) +[2023-10-09 08:01:34,747][60144] Updated weights for policy 1, policy_version 94462 (0.0008) +[2023-10-09 08:01:35,959][60143] Updated weights for policy 0, policy_version 93382 (0.0011) +[2023-10-09 08:01:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 192348160. Throughput: 0: 1702.0, 1: 1748.0. Samples: 48092786. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:36,052][59242] Avg episode reward: [(0, '34.560'), (1, '35.590')] +[2023-10-09 08:01:36,332][60143] Updated weights for policy 0, policy_version 93392 (0.0010) +[2023-10-09 08:01:36,703][60143] Updated weights for policy 0, policy_version 93402 (0.0009) +[2023-10-09 08:01:38,781][60144] Updated weights for policy 1, policy_version 94472 (0.0009) +[2023-10-09 08:01:39,146][60144] Updated weights for policy 1, policy_version 94482 (0.0007) +[2023-10-09 08:01:39,524][60144] Updated weights for policy 1, policy_version 94492 (0.0008) +[2023-10-09 08:01:40,631][60143] Updated weights for policy 0, policy_version 93412 (0.0007) +[2023-10-09 08:01:41,010][60143] Updated weights for policy 0, policy_version 93422 (0.0010) +[2023-10-09 08:01:41,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 192413696. Throughput: 0: 1714.4, 1: 1720.9. Samples: 48112814. 
Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:41,053][59242] Avg episode reward: [(0, '36.840'), (1, '36.270')] +[2023-10-09 08:01:41,382][60143] Updated weights for policy 0, policy_version 93432 (0.0008) +[2023-10-09 08:01:43,365][60144] Updated weights for policy 1, policy_version 94502 (0.0010) +[2023-10-09 08:01:43,737][60144] Updated weights for policy 1, policy_version 94512 (0.0008) +[2023-10-09 08:01:44,102][60144] Updated weights for policy 1, policy_version 94522 (0.0010) +[2023-10-09 08:01:45,489][60143] Updated weights for policy 0, policy_version 93442 (0.0009) +[2023-10-09 08:01:45,851][60143] Updated weights for policy 0, policy_version 93452 (0.0009) +[2023-10-09 08:01:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 192479232. Throughput: 0: 1704.4, 1: 1723.1. Samples: 48133902. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:46,053][59242] Avg episode reward: [(0, '35.900'), (1, '37.250')] +[2023-10-09 08:01:46,227][60143] Updated weights for policy 0, policy_version 93462 (0.0009) +[2023-10-09 08:01:46,593][60143] Updated weights for policy 0, policy_version 93472 (0.0007) +[2023-10-09 08:01:47,985][60144] Updated weights for policy 1, policy_version 94532 (0.0008) +[2023-10-09 08:01:48,353][60144] Updated weights for policy 1, policy_version 94542 (0.0009) +[2023-10-09 08:01:48,720][60144] Updated weights for policy 1, policy_version 94552 (0.0009) +[2023-10-09 08:01:50,712][60143] Updated weights for policy 0, policy_version 93482 (0.0007) +[2023-10-09 08:01:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 192544768. Throughput: 0: 1702.5, 1: 1736.8. Samples: 48143784. 
Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:51,052][59242] Avg episode reward: [(0, '36.790'), (1, '37.060')] +[2023-10-09 08:01:51,090][60143] Updated weights for policy 0, policy_version 93492 (0.0010) +[2023-10-09 08:01:51,457][60143] Updated weights for policy 0, policy_version 93502 (0.0008) +[2023-10-09 08:01:52,605][60144] Updated weights for policy 1, policy_version 94562 (0.0009) +[2023-10-09 08:01:53,006][60144] Updated weights for policy 1, policy_version 94572 (0.0007) +[2023-10-09 08:01:53,370][60144] Updated weights for policy 1, policy_version 94582 (0.0007) +[2023-10-09 08:01:53,738][60144] Updated weights for policy 1, policy_version 94592 (0.0008) +[2023-10-09 08:01:55,492][60143] Updated weights for policy 0, policy_version 93512 (0.0010) +[2023-10-09 08:01:55,854][60143] Updated weights for policy 0, policy_version 93522 (0.0010) +[2023-10-09 08:01:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 192610304. Throughput: 0: 1704.4, 1: 1727.4. Samples: 48164514. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:01:56,053][59242] Avg episode reward: [(0, '37.520'), (1, '37.030')] +[2023-10-09 08:01:56,226][60143] Updated weights for policy 0, policy_version 93532 (0.0007) +[2023-10-09 08:01:57,504][60144] Updated weights for policy 1, policy_version 94602 (0.0008) +[2023-10-09 08:01:57,884][60144] Updated weights for policy 1, policy_version 94612 (0.0009) +[2023-10-09 08:01:58,246][60144] Updated weights for policy 1, policy_version 94622 (0.0009) +[2023-10-09 08:02:00,206][60143] Updated weights for policy 0, policy_version 93542 (0.0009) +[2023-10-09 08:02:00,563][60143] Updated weights for policy 0, policy_version 93552 (0.0012) +[2023-10-09 08:02:00,927][60143] Updated weights for policy 0, policy_version 93562 (0.0011) +[2023-10-09 08:02:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 192675840. 
Throughput: 0: 1688.7, 1: 1739.1. Samples: 48185150. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:02:01,053][59242] Avg episode reward: [(0, '37.470'), (1, '36.480')] +[2023-10-09 08:02:02,190][60144] Updated weights for policy 1, policy_version 94632 (0.0008) +[2023-10-09 08:02:02,552][60144] Updated weights for policy 1, policy_version 94642 (0.0011) +[2023-10-09 08:02:02,921][60144] Updated weights for policy 1, policy_version 94652 (0.0008) +[2023-10-09 08:02:05,029][60143] Updated weights for policy 0, policy_version 93572 (0.0010) +[2023-10-09 08:02:05,396][60143] Updated weights for policy 0, policy_version 93582 (0.0009) +[2023-10-09 08:02:05,763][60143] Updated weights for policy 0, policy_version 93592 (0.0011) +[2023-10-09 08:02:06,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 192741376. Throughput: 0: 1704.2, 1: 1719.2. Samples: 48195006. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:02:06,053][59242] Avg episode reward: [(0, '36.770'), (1, '35.110')] +[2023-10-09 08:02:07,074][60144] Updated weights for policy 1, policy_version 94662 (0.0007) +[2023-10-09 08:02:07,439][60144] Updated weights for policy 1, policy_version 94672 (0.0007) +[2023-10-09 08:02:07,804][60144] Updated weights for policy 1, policy_version 94682 (0.0007) +[2023-10-09 08:02:09,628][60143] Updated weights for policy 0, policy_version 93602 (0.0008) +[2023-10-09 08:02:09,995][60143] Updated weights for policy 0, policy_version 93612 (0.0009) +[2023-10-09 08:02:10,366][60143] Updated weights for policy 0, policy_version 93622 (0.0008) +[2023-10-09 08:02:10,746][60143] Updated weights for policy 0, policy_version 93632 (0.0009) +[2023-10-09 08:02:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 192839680. Throughput: 0: 1702.7, 1: 1720.5. Samples: 48216096. 
Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:02:11,053][59242] Avg episode reward: [(0, '36.350'), (1, '35.460')] +[2023-10-09 08:02:11,831][60144] Updated weights for policy 1, policy_version 94692 (0.0008) +[2023-10-09 08:02:12,194][60144] Updated weights for policy 1, policy_version 94702 (0.0010) +[2023-10-09 08:02:12,557][60144] Updated weights for policy 1, policy_version 94712 (0.0008) +[2023-10-09 08:02:14,528][60143] Updated weights for policy 0, policy_version 93642 (0.0007) +[2023-10-09 08:02:14,899][60143] Updated weights for policy 0, policy_version 93652 (0.0008) +[2023-10-09 08:02:15,270][60143] Updated weights for policy 0, policy_version 93662 (0.0009) +[2023-10-09 08:02:16,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 192905216. Throughput: 0: 1679.6, 1: 1746.5. Samples: 48236418. Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:02:16,053][59242] Avg episode reward: [(0, '38.950'), (1, '35.360')] +[2023-10-09 08:02:16,482][60144] Updated weights for policy 1, policy_version 94722 (0.0007) +[2023-10-09 08:02:16,844][60144] Updated weights for policy 1, policy_version 94732 (0.0010) +[2023-10-09 08:02:17,208][60144] Updated weights for policy 1, policy_version 94742 (0.0008) +[2023-10-09 08:02:17,578][60144] Updated weights for policy 1, policy_version 94752 (0.0011) +[2023-10-09 08:02:19,276][60143] Updated weights for policy 0, policy_version 93672 (0.0008) +[2023-10-09 08:02:19,656][60143] Updated weights for policy 0, policy_version 93682 (0.0007) +[2023-10-09 08:02:20,024][60143] Updated weights for policy 0, policy_version 93692 (0.0007) +[2023-10-09 08:02:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 192970752. Throughput: 0: 1711.6, 1: 1715.2. Samples: 48246994. 
Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) +[2023-10-09 08:02:21,053][59242] Avg episode reward: [(0, '38.950'), (1, '35.540')] +[2023-10-09 08:02:21,484][60144] Updated weights for policy 1, policy_version 94762 (0.0009) +[2023-10-09 08:02:21,857][60144] Updated weights for policy 1, policy_version 94772 (0.0009) +[2023-10-09 08:02:22,232][60144] Updated weights for policy 1, policy_version 94782 (0.0009) +[2023-10-09 08:02:23,937][60143] Updated weights for policy 0, policy_version 93702 (0.0009) +[2023-10-09 08:02:24,299][60143] Updated weights for policy 0, policy_version 93712 (0.0007) +[2023-10-09 08:02:24,661][60143] Updated weights for policy 0, policy_version 93722 (0.0010) +[2023-10-09 08:02:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 193036288. Throughput: 0: 1698.4, 1: 1741.2. Samples: 48267598. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:26,053][59242] Avg episode reward: [(0, '38.980'), (1, '34.510')] +[2023-10-09 08:02:26,298][60144] Updated weights for policy 1, policy_version 94792 (0.0007) +[2023-10-09 08:02:26,672][60144] Updated weights for policy 1, policy_version 94802 (0.0008) +[2023-10-09 08:02:27,041][60144] Updated weights for policy 1, policy_version 94812 (0.0009) +[2023-10-09 08:02:28,694][60143] Updated weights for policy 0, policy_version 93732 (0.0008) +[2023-10-09 08:02:29,065][60143] Updated weights for policy 0, policy_version 93742 (0.0009) +[2023-10-09 08:02:29,447][60143] Updated weights for policy 0, policy_version 93752 (0.0008) +[2023-10-09 08:02:31,001][60144] Updated weights for policy 1, policy_version 94822 (0.0009) +[2023-10-09 08:02:31,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 193101824. Throughput: 0: 1692.6, 1: 1739.4. Samples: 48288340. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:31,052][59242] Avg episode reward: [(0, '40.640'), (1, '35.400')] +[2023-10-09 08:02:31,060][59934] Saving new best policy, reward=40.640! +[2023-10-09 08:02:31,363][60144] Updated weights for policy 1, policy_version 94832 (0.0010) +[2023-10-09 08:02:31,735][60144] Updated weights for policy 1, policy_version 94842 (0.0012) +[2023-10-09 08:02:33,385][60143] Updated weights for policy 0, policy_version 93762 (0.0008) +[2023-10-09 08:02:33,756][60143] Updated weights for policy 0, policy_version 93772 (0.0007) +[2023-10-09 08:02:34,126][60143] Updated weights for policy 0, policy_version 93782 (0.0007) +[2023-10-09 08:02:34,490][60143] Updated weights for policy 0, policy_version 93792 (0.0007) +[2023-10-09 08:02:35,680][60144] Updated weights for policy 1, policy_version 94852 (0.0009) +[2023-10-09 08:02:36,052][60144] Updated weights for policy 1, policy_version 94862 (0.0008) +[2023-10-09 08:02:36,052][59242] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 193167360. Throughput: 0: 1719.6, 1: 1720.7. Samples: 48298598. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:36,053][59242] Avg episode reward: [(0, '41.570'), (1, '34.950')] +[2023-10-09 08:02:36,055][59934] Saving new best policy, reward=41.570! 
+[2023-10-09 08:02:36,426][60144] Updated weights for policy 1, policy_version 94872 (0.0008) +[2023-10-09 08:02:38,651][60143] Updated weights for policy 0, policy_version 93802 (0.0007) +[2023-10-09 08:02:39,007][60143] Updated weights for policy 0, policy_version 93812 (0.0009) +[2023-10-09 08:02:39,372][60143] Updated weights for policy 0, policy_version 93822 (0.0007) +[2023-10-09 08:02:40,409][60144] Updated weights for policy 1, policy_version 94882 (0.0008) +[2023-10-09 08:02:40,829][60144] Updated weights for policy 1, policy_version 94892 (0.0008) +[2023-10-09 08:02:41,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 193232896. Throughput: 0: 1694.4, 1: 1732.3. Samples: 48318716. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:41,053][59242] Avg episode reward: [(0, '39.530'), (1, '36.180')] +[2023-10-09 08:02:41,183][60144] Updated weights for policy 1, policy_version 94902 (0.0009) +[2023-10-09 08:02:41,549][60144] Updated weights for policy 1, policy_version 94912 (0.0007) +[2023-10-09 08:02:43,422][60143] Updated weights for policy 0, policy_version 93832 (0.0008) +[2023-10-09 08:02:43,791][60143] Updated weights for policy 0, policy_version 93842 (0.0009) +[2023-10-09 08:02:44,163][60143] Updated weights for policy 0, policy_version 93852 (0.0010) +[2023-10-09 08:02:45,398][60144] Updated weights for policy 1, policy_version 94922 (0.0008) +[2023-10-09 08:02:45,774][60144] Updated weights for policy 1, policy_version 94932 (0.0009) +[2023-10-09 08:02:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 193298432. Throughput: 0: 1710.6, 1: 1713.6. Samples: 48339238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:46,053][59242] Avg episode reward: [(0, '39.840'), (1, '36.500')] +[2023-10-09 08:02:46,140][60144] Updated weights for policy 1, policy_version 94942 (0.0007) +[2023-10-09 08:02:48,081][60143] Updated weights for policy 0, policy_version 93862 (0.0009) +[2023-10-09 08:02:48,449][60143] Updated weights for policy 0, policy_version 93872 (0.0009) +[2023-10-09 08:02:48,817][60143] Updated weights for policy 0, policy_version 93882 (0.0011) +[2023-10-09 08:02:50,133][60144] Updated weights for policy 1, policy_version 94952 (0.0007) +[2023-10-09 08:02:50,496][60144] Updated weights for policy 1, policy_version 94962 (0.0010) +[2023-10-09 08:02:50,866][60144] Updated weights for policy 1, policy_version 94972 (0.0009) +[2023-10-09 08:02:51,052][59242] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 193396736. Throughput: 0: 1710.5, 1: 1725.8. Samples: 48349638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:51,053][59242] Avg episode reward: [(0, '40.700'), (1, '36.660')] +[2023-10-09 08:02:52,726][60143] Updated weights for policy 0, policy_version 93892 (0.0010) +[2023-10-09 08:02:53,096][60143] Updated weights for policy 0, policy_version 93902 (0.0009) +[2023-10-09 08:02:53,473][60143] Updated weights for policy 0, policy_version 93912 (0.0008) +[2023-10-09 08:02:54,727][60144] Updated weights for policy 1, policy_version 94982 (0.0009) +[2023-10-09 08:02:55,087][60144] Updated weights for policy 1, policy_version 94992 (0.0007) +[2023-10-09 08:02:55,461][60144] Updated weights for policy 1, policy_version 95002 (0.0009) +[2023-10-09 08:02:56,052][59242] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 193462272. Throughput: 0: 1694.2, 1: 1729.3. Samples: 48370156. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:02:56,053][59242] Avg episode reward: [(0, '40.980'), (1, '36.530')] +[2023-10-09 08:02:57,425][60143] Updated weights for policy 0, policy_version 93922 (0.0010) +[2023-10-09 08:02:57,786][60143] Updated weights for policy 0, policy_version 93932 (0.0007) +[2023-10-09 08:02:58,157][60143] Updated weights for policy 0, policy_version 93942 (0.0007) +[2023-10-09 08:02:58,514][60143] Updated weights for policy 0, policy_version 93952 (0.0010) +[2023-10-09 08:02:59,463][60144] Updated weights for policy 1, policy_version 95012 (0.0009) +[2023-10-09 08:02:59,833][60144] Updated weights for policy 1, policy_version 95022 (0.0007) +[2023-10-09 08:03:00,200][60144] Updated weights for policy 1, policy_version 95032 (0.0008) +[2023-10-09 08:03:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 193527808. Throughput: 0: 1719.5, 1: 1697.4. Samples: 48390180. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:01,053][59242] Avg episode reward: [(0, '40.390'), (1, '37.570')] +[2023-10-09 08:03:02,592][60143] Updated weights for policy 0, policy_version 93962 (0.0009) +[2023-10-09 08:03:02,959][60143] Updated weights for policy 0, policy_version 93972 (0.0008) +[2023-10-09 08:03:03,337][60143] Updated weights for policy 0, policy_version 93982 (0.0007) +[2023-10-09 08:03:04,031][60144] Updated weights for policy 1, policy_version 95042 (0.0008) +[2023-10-09 08:03:04,396][60144] Updated weights for policy 1, policy_version 95052 (0.0009) +[2023-10-09 08:03:04,760][60144] Updated weights for policy 1, policy_version 95062 (0.0007) +[2023-10-09 08:03:05,120][60144] Updated weights for policy 1, policy_version 95072 (0.0009) +[2023-10-09 08:03:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 193593344. Throughput: 0: 1689.6, 1: 1732.7. Samples: 48401000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:06,053][59242] Avg episode reward: [(0, '38.770'), (1, '36.610')] +[2023-10-09 08:03:07,247][60143] Updated weights for policy 0, policy_version 93992 (0.0009) +[2023-10-09 08:03:07,620][60143] Updated weights for policy 0, policy_version 94002 (0.0008) +[2023-10-09 08:03:07,992][60143] Updated weights for policy 0, policy_version 94012 (0.0008) +[2023-10-09 08:03:09,067][60144] Updated weights for policy 1, policy_version 95082 (0.0010) +[2023-10-09 08:03:09,436][60144] Updated weights for policy 1, policy_version 95092 (0.0011) +[2023-10-09 08:03:09,810][60144] Updated weights for policy 1, policy_version 95102 (0.0008) +[2023-10-09 08:03:11,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 193658880. Throughput: 0: 1704.7, 1: 1713.7. Samples: 48421428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:11,053][59242] Avg episode reward: [(0, '37.010'), (1, '36.560')] +[2023-10-09 08:03:12,104][60143] Updated weights for policy 0, policy_version 94022 (0.0008) +[2023-10-09 08:03:12,497][60143] Updated weights for policy 0, policy_version 94032 (0.0007) +[2023-10-09 08:03:12,864][60143] Updated weights for policy 0, policy_version 94042 (0.0009) +[2023-10-09 08:03:13,754][60144] Updated weights for policy 1, policy_version 95112 (0.0010) +[2023-10-09 08:03:14,127][60144] Updated weights for policy 1, policy_version 95122 (0.0010) +[2023-10-09 08:03:14,491][60144] Updated weights for policy 1, policy_version 95132 (0.0012) +[2023-10-09 08:03:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 193724416. Throughput: 0: 1711.3, 1: 1706.6. Samples: 48442148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:16,052][59242] Avg episode reward: [(0, '36.880'), (1, '36.620')] +[2023-10-09 08:03:16,761][60143] Updated weights for policy 0, policy_version 94052 (0.0009) +[2023-10-09 08:03:17,125][60143] Updated weights for policy 0, policy_version 94062 (0.0009) +[2023-10-09 08:03:17,505][60143] Updated weights for policy 0, policy_version 94072 (0.0011) +[2023-10-09 08:03:18,412][60144] Updated weights for policy 1, policy_version 95142 (0.0008) +[2023-10-09 08:03:18,775][60144] Updated weights for policy 1, policy_version 95152 (0.0007) +[2023-10-09 08:03:19,147][60144] Updated weights for policy 1, policy_version 95162 (0.0011) +[2023-10-09 08:03:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 193789952. Throughput: 0: 1684.4, 1: 1731.5. Samples: 48452312. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:21,053][59242] Avg episode reward: [(0, '35.370'), (1, '35.890')] +[2023-10-09 08:03:21,599][60143] Updated weights for policy 0, policy_version 94082 (0.0008) +[2023-10-09 08:03:21,967][60143] Updated weights for policy 0, policy_version 94092 (0.0008) +[2023-10-09 08:03:22,326][60143] Updated weights for policy 0, policy_version 94102 (0.0008) +[2023-10-09 08:03:22,699][60143] Updated weights for policy 0, policy_version 94112 (0.0008) +[2023-10-09 08:03:23,249][60144] Updated weights for policy 1, policy_version 95172 (0.0011) +[2023-10-09 08:03:23,615][60144] Updated weights for policy 1, policy_version 95182 (0.0008) +[2023-10-09 08:03:23,980][60144] Updated weights for policy 1, policy_version 95192 (0.0007) +[2023-10-09 08:03:26,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 193855488. Throughput: 0: 1708.4, 1: 1707.1. Samples: 48472414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:26,053][59242] Avg episode reward: [(0, '35.190'), (1, '35.610')] +[2023-10-09 08:03:26,741][60143] Updated weights for policy 0, policy_version 94122 (0.0009) +[2023-10-09 08:03:27,118][60143] Updated weights for policy 0, policy_version 94132 (0.0010) +[2023-10-09 08:03:27,489][60143] Updated weights for policy 0, policy_version 94142 (0.0008) +[2023-10-09 08:03:27,788][60144] Updated weights for policy 1, policy_version 95202 (0.0007) +[2023-10-09 08:03:28,201][60144] Updated weights for policy 1, policy_version 95212 (0.0007) +[2023-10-09 08:03:28,568][60144] Updated weights for policy 1, policy_version 95222 (0.0009) +[2023-10-09 08:03:28,939][60144] Updated weights for policy 1, policy_version 95232 (0.0011) +[2023-10-09 08:03:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 193921024. Throughput: 0: 1701.9, 1: 1726.9. Samples: 48493534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:31,053][59242] Avg episode reward: [(0, '35.060'), (1, '36.470')] +[2023-10-09 08:03:31,064][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000095232_97517568.pth... +[2023-10-09 08:03:31,065][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000094144_96403456.pth... 
+[2023-10-09 08:03:31,101][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000092576_94797824.pth +[2023-10-09 08:03:31,104][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000093632_95879168.pth +[2023-10-09 08:03:31,653][60143] Updated weights for policy 0, policy_version 94152 (0.0009) +[2023-10-09 08:03:32,019][60143] Updated weights for policy 0, policy_version 94162 (0.0011) +[2023-10-09 08:03:32,391][60143] Updated weights for policy 0, policy_version 94172 (0.0009) +[2023-10-09 08:03:32,699][60144] Updated weights for policy 1, policy_version 95242 (0.0009) +[2023-10-09 08:03:33,080][60144] Updated weights for policy 1, policy_version 95252 (0.0009) +[2023-10-09 08:03:33,452][60144] Updated weights for policy 1, policy_version 95262 (0.0008) +[2023-10-09 08:03:36,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 193986560. Throughput: 0: 1690.1, 1: 1713.8. Samples: 48502816. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:36,053][59242] Avg episode reward: [(0, '34.880'), (1, '37.120')] +[2023-10-09 08:03:36,327][60143] Updated weights for policy 0, policy_version 94182 (0.0008) +[2023-10-09 08:03:36,685][60143] Updated weights for policy 0, policy_version 94192 (0.0008) +[2023-10-09 08:03:37,062][60143] Updated weights for policy 0, policy_version 94202 (0.0009) +[2023-10-09 08:03:37,391][60144] Updated weights for policy 1, policy_version 95272 (0.0008) +[2023-10-09 08:03:37,758][60144] Updated weights for policy 1, policy_version 95282 (0.0008) +[2023-10-09 08:03:38,132][60144] Updated weights for policy 1, policy_version 95292 (0.0009) +[2023-10-09 08:03:41,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 194052096. Throughput: 0: 1707.1, 1: 1715.6. Samples: 48524180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:41,053][59242] Avg episode reward: [(0, '35.020'), (1, '35.770')] +[2023-10-09 08:03:41,124][60143] Updated weights for policy 0, policy_version 94212 (0.0010) +[2023-10-09 08:03:41,499][60143] Updated weights for policy 0, policy_version 94222 (0.0007) +[2023-10-09 08:03:41,867][60143] Updated weights for policy 0, policy_version 94232 (0.0011) +[2023-10-09 08:03:42,065][60144] Updated weights for policy 1, policy_version 95302 (0.0007) +[2023-10-09 08:03:42,425][60144] Updated weights for policy 1, policy_version 95312 (0.0008) +[2023-10-09 08:03:42,803][60144] Updated weights for policy 1, policy_version 95322 (0.0007) +[2023-10-09 08:03:45,872][60143] Updated weights for policy 0, policy_version 94242 (0.0009) +[2023-10-09 08:03:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 194117632. Throughput: 0: 1707.2, 1: 1740.0. Samples: 48545300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:46,053][59242] Avg episode reward: [(0, '33.910'), (1, '36.690')] +[2023-10-09 08:03:46,237][60143] Updated weights for policy 0, policy_version 94252 (0.0010) +[2023-10-09 08:03:46,621][60143] Updated weights for policy 0, policy_version 94262 (0.0010) +[2023-10-09 08:03:46,853][60144] Updated weights for policy 1, policy_version 95332 (0.0009) +[2023-10-09 08:03:46,984][60143] Updated weights for policy 0, policy_version 94272 (0.0009) +[2023-10-09 08:03:47,212][60144] Updated weights for policy 1, policy_version 95342 (0.0009) +[2023-10-09 08:03:47,576][60144] Updated weights for policy 1, policy_version 95352 (0.0007) +[2023-10-09 08:03:51,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 194183168. Throughput: 0: 1706.1, 1: 1704.7. Samples: 48554488. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:51,053][59242] Avg episode reward: [(0, '32.330'), (1, '38.500')] +[2023-10-09 08:03:51,173][60143] Updated weights for policy 0, policy_version 94282 (0.0010) +[2023-10-09 08:03:51,546][60143] Updated weights for policy 0, policy_version 94292 (0.0007) +[2023-10-09 08:03:51,654][60144] Updated weights for policy 1, policy_version 95362 (0.0007) +[2023-10-09 08:03:51,919][60143] Updated weights for policy 0, policy_version 94302 (0.0008) +[2023-10-09 08:03:52,021][60144] Updated weights for policy 1, policy_version 95372 (0.0009) +[2023-10-09 08:03:52,390][60144] Updated weights for policy 1, policy_version 95382 (0.0008) +[2023-10-09 08:03:52,758][60144] Updated weights for policy 1, policy_version 95392 (0.0009) +[2023-10-09 08:03:55,785][60143] Updated weights for policy 0, policy_version 94312 (0.0007) +[2023-10-09 08:03:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 194248704. Throughput: 0: 1704.4, 1: 1722.1. Samples: 48575618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:03:56,052][59242] Avg episode reward: [(0, '31.350'), (1, '39.130')] +[2023-10-09 08:03:56,155][60143] Updated weights for policy 0, policy_version 94322 (0.0009) +[2023-10-09 08:03:56,531][60143] Updated weights for policy 0, policy_version 94332 (0.0009) +[2023-10-09 08:03:56,939][60144] Updated weights for policy 1, policy_version 95402 (0.0009) +[2023-10-09 08:03:57,318][60144] Updated weights for policy 1, policy_version 95412 (0.0008) +[2023-10-09 08:03:57,690][60144] Updated weights for policy 1, policy_version 95422 (0.0008) +[2023-10-09 08:04:00,399][60143] Updated weights for policy 0, policy_version 94342 (0.0010) +[2023-10-09 08:04:00,777][60143] Updated weights for policy 0, policy_version 94352 (0.0008) +[2023-10-09 08:04:01,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 194314240. 
Throughput: 0: 1696.6, 1: 1730.5. Samples: 48596366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:04:01,053][59242] Avg episode reward: [(0, '30.780'), (1, '38.490')] +[2023-10-09 08:04:01,147][60143] Updated weights for policy 0, policy_version 94362 (0.0008) +[2023-10-09 08:04:01,482][60144] Updated weights for policy 1, policy_version 95432 (0.0009) +[2023-10-09 08:04:01,840][60144] Updated weights for policy 1, policy_version 95442 (0.0010) +[2023-10-09 08:04:02,208][60144] Updated weights for policy 1, policy_version 95452 (0.0009) +[2023-10-09 08:04:05,228][60143] Updated weights for policy 0, policy_version 94372 (0.0007) +[2023-10-09 08:04:05,600][60143] Updated weights for policy 0, policy_version 94382 (0.0010) +[2023-10-09 08:04:05,963][60143] Updated weights for policy 0, policy_version 94392 (0.0010) +[2023-10-09 08:04:06,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 194379776. Throughput: 0: 1703.3, 1: 1713.8. Samples: 48606080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:04:06,052][59242] Avg episode reward: [(0, '30.580'), (1, '37.960')] +[2023-10-09 08:04:06,104][60144] Updated weights for policy 1, policy_version 95462 (0.0008) +[2023-10-09 08:04:06,466][60144] Updated weights for policy 1, policy_version 95472 (0.0008) +[2023-10-09 08:04:06,836][60144] Updated weights for policy 1, policy_version 95482 (0.0009) +[2023-10-09 08:04:09,826][60143] Updated weights for policy 0, policy_version 94402 (0.0009) +[2023-10-09 08:04:10,194][60143] Updated weights for policy 0, policy_version 94412 (0.0007) +[2023-10-09 08:04:10,554][60143] Updated weights for policy 0, policy_version 94422 (0.0007) +[2023-10-09 08:04:10,899][60144] Updated weights for policy 1, policy_version 95492 (0.0009) +[2023-10-09 08:04:10,916][60143] Updated weights for policy 0, policy_version 94432 (0.0008) +[2023-10-09 08:04:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 194478080. Throughput: 0: 1709.1, 1: 1732.0. Samples: 48627260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:04:11,052][59242] Avg episode reward: [(0, '33.110'), (1, '36.600')] +[2023-10-09 08:04:11,267][60144] Updated weights for policy 1, policy_version 95502 (0.0008) +[2023-10-09 08:04:11,637][60144] Updated weights for policy 1, policy_version 95512 (0.0008) +[2023-10-09 08:04:14,813][60143] Updated weights for policy 0, policy_version 94442 (0.0009) +[2023-10-09 08:04:15,182][60143] Updated weights for policy 0, policy_version 94452 (0.0009) +[2023-10-09 08:04:15,541][60144] Updated weights for policy 1, policy_version 95522 (0.0008) +[2023-10-09 08:04:15,546][60143] Updated weights for policy 0, policy_version 94462 (0.0009) +[2023-10-09 08:04:15,928][60144] Updated weights for policy 1, policy_version 95532 (0.0009) +[2023-10-09 08:04:16,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 194543616. 
Throughput: 0: 1689.3, 1: 1726.5. Samples: 48647242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:04:16,053][59242] Avg episode reward: [(0, '33.390'), (1, '36.020')] +[2023-10-09 08:04:16,294][60144] Updated weights for policy 1, policy_version 95542 (0.0008) +[2023-10-09 08:04:16,652][60144] Updated weights for policy 1, policy_version 95552 (0.0008) +[2023-10-09 08:04:19,644][60143] Updated weights for policy 0, policy_version 94472 (0.0008) +[2023-10-09 08:04:20,006][60143] Updated weights for policy 0, policy_version 94482 (0.0010) +[2023-10-09 08:04:20,373][60143] Updated weights for policy 0, policy_version 94492 (0.0010) +[2023-10-09 08:04:20,485][60144] Updated weights for policy 1, policy_version 95562 (0.0009) +[2023-10-09 08:04:20,857][60144] Updated weights for policy 1, policy_version 95572 (0.0007) +[2023-10-09 08:04:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 194609152. Throughput: 0: 1715.9, 1: 1730.9. Samples: 48657924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:04:21,053][59242] Avg episode reward: [(0, '32.100'), (1, '34.890')] +[2023-10-09 08:04:21,222][60144] Updated weights for policy 1, policy_version 95582 (0.0010) +[2023-10-09 08:04:24,373][60143] Updated weights for policy 0, policy_version 94502 (0.0009) +[2023-10-09 08:04:24,741][60143] Updated weights for policy 0, policy_version 94512 (0.0010) +[2023-10-09 08:04:25,106][60143] Updated weights for policy 0, policy_version 94522 (0.0008) +[2023-10-09 08:04:25,171][60144] Updated weights for policy 1, policy_version 95592 (0.0009) +[2023-10-09 08:04:25,537][60144] Updated weights for policy 1, policy_version 95602 (0.0009) +[2023-10-09 08:04:25,906][60144] Updated weights for policy 1, policy_version 95612 (0.0008) +[2023-10-09 08:04:26,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 194707456. Throughput: 0: 1705.1, 1: 1729.6. Samples: 48678740. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:26,053][59242] Avg episode reward: [(0, '31.670'), (1, '35.690')] +[2023-10-09 08:04:29,072][60143] Updated weights for policy 0, policy_version 94532 (0.0009) +[2023-10-09 08:04:29,435][60143] Updated weights for policy 0, policy_version 94542 (0.0011) +[2023-10-09 08:04:29,808][60143] Updated weights for policy 0, policy_version 94552 (0.0009) +[2023-10-09 08:04:29,887][60144] Updated weights for policy 1, policy_version 95622 (0.0008) +[2023-10-09 08:04:30,250][60144] Updated weights for policy 1, policy_version 95632 (0.0009) +[2023-10-09 08:04:30,619][60144] Updated weights for policy 1, policy_version 95642 (0.0009) +[2023-10-09 08:04:31,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 194772992. Throughput: 0: 1687.0, 1: 1710.0. Samples: 48698164. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:31,052][59242] Avg episode reward: [(0, '33.070'), (1, '35.880')] +[2023-10-09 08:04:33,753][60143] Updated weights for policy 0, policy_version 94562 (0.0009) +[2023-10-09 08:04:34,127][60143] Updated weights for policy 0, policy_version 94572 (0.0007) +[2023-10-09 08:04:34,474][60144] Updated weights for policy 1, policy_version 95652 (0.0009) +[2023-10-09 08:04:34,496][60143] Updated weights for policy 0, policy_version 94582 (0.0007) +[2023-10-09 08:04:34,841][60144] Updated weights for policy 1, policy_version 95662 (0.0008) +[2023-10-09 08:04:34,858][60143] Updated weights for policy 0, policy_version 94592 (0.0008) +[2023-10-09 08:04:35,197][60144] Updated weights for policy 1, policy_version 95672 (0.0008) +[2023-10-09 08:04:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 194838528. Throughput: 0: 1721.1, 1: 1732.2. Samples: 48709884. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:36,053][59242] Avg episode reward: [(0, '33.330'), (1, '37.380')] +[2023-10-09 08:04:38,839][60143] Updated weights for policy 0, policy_version 94602 (0.0011) +[2023-10-09 08:04:39,209][60143] Updated weights for policy 0, policy_version 94612 (0.0009) +[2023-10-09 08:04:39,223][60144] Updated weights for policy 1, policy_version 95682 (0.0010) +[2023-10-09 08:04:39,570][60143] Updated weights for policy 0, policy_version 94622 (0.0007) +[2023-10-09 08:04:39,593][60144] Updated weights for policy 1, policy_version 95692 (0.0008) +[2023-10-09 08:04:39,956][60144] Updated weights for policy 1, policy_version 95702 (0.0008) +[2023-10-09 08:04:40,320][60144] Updated weights for policy 1, policy_version 95712 (0.0008) +[2023-10-09 08:04:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 194904064. Throughput: 0: 1699.0, 1: 1722.7. Samples: 48729592. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:41,053][59242] Avg episode reward: [(0, '32.780'), (1, '35.510')] +[2023-10-09 08:04:43,540][60143] Updated weights for policy 0, policy_version 94632 (0.0009) +[2023-10-09 08:04:43,915][60143] Updated weights for policy 0, policy_version 94642 (0.0009) +[2023-10-09 08:04:44,046][60144] Updated weights for policy 1, policy_version 95722 (0.0009) +[2023-10-09 08:04:44,283][60143] Updated weights for policy 0, policy_version 94652 (0.0009) +[2023-10-09 08:04:44,413][60144] Updated weights for policy 1, policy_version 95732 (0.0009) +[2023-10-09 08:04:44,776][60144] Updated weights for policy 1, policy_version 95742 (0.0010) +[2023-10-09 08:04:46,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 194969600. Throughput: 0: 1700.4, 1: 1704.3. Samples: 48749576. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:46,052][59242] Avg episode reward: [(0, '34.530'), (1, '35.790')] +[2023-10-09 08:04:48,393][60143] Updated weights for policy 0, policy_version 94662 (0.0009) +[2023-10-09 08:04:48,777][60143] Updated weights for policy 0, policy_version 94672 (0.0008) +[2023-10-09 08:04:48,870][60144] Updated weights for policy 1, policy_version 95752 (0.0010) +[2023-10-09 08:04:49,147][60143] Updated weights for policy 0, policy_version 94682 (0.0008) +[2023-10-09 08:04:49,235][60144] Updated weights for policy 1, policy_version 95762 (0.0009) +[2023-10-09 08:04:49,596][60144] Updated weights for policy 1, policy_version 95772 (0.0008) +[2023-10-09 08:04:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 195035136. Throughput: 0: 1715.2, 1: 1729.1. Samples: 48761070. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:51,053][59242] Avg episode reward: [(0, '33.780'), (1, '37.350')] +[2023-10-09 08:04:53,228][60143] Updated weights for policy 0, policy_version 94692 (0.0008) +[2023-10-09 08:04:53,511][60144] Updated weights for policy 1, policy_version 95782 (0.0008) +[2023-10-09 08:04:53,593][60143] Updated weights for policy 0, policy_version 94702 (0.0007) +[2023-10-09 08:04:53,872][60144] Updated weights for policy 1, policy_version 95792 (0.0009) +[2023-10-09 08:04:53,953][60143] Updated weights for policy 0, policy_version 94712 (0.0008) +[2023-10-09 08:04:54,239][60144] Updated weights for policy 1, policy_version 95802 (0.0007) +[2023-10-09 08:04:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 195100672. Throughput: 0: 1686.8, 1: 1704.8. Samples: 48779886. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:04:56,052][59242] Avg episode reward: [(0, '34.110'), (1, '37.450')] +[2023-10-09 08:04:57,934][60143] Updated weights for policy 0, policy_version 94722 (0.0007) +[2023-10-09 08:04:58,108][60144] Updated weights for policy 1, policy_version 95812 (0.0009) +[2023-10-09 08:04:58,297][60143] Updated weights for policy 0, policy_version 94732 (0.0009) +[2023-10-09 08:04:58,481][60144] Updated weights for policy 1, policy_version 95822 (0.0008) +[2023-10-09 08:04:58,672][60143] Updated weights for policy 0, policy_version 94742 (0.0010) +[2023-10-09 08:04:58,838][60144] Updated weights for policy 1, policy_version 95832 (0.0009) +[2023-10-09 08:04:59,043][60143] Updated weights for policy 0, policy_version 94752 (0.0009) +[2023-10-09 08:05:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 195166208. Throughput: 0: 1708.2, 1: 1710.1. Samples: 48801064. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:01,053][59242] Avg episode reward: [(0, '33.490'), (1, '37.450')] +[2023-10-09 08:05:02,924][60144] Updated weights for policy 1, policy_version 95842 (0.0008) +[2023-10-09 08:05:03,104][60143] Updated weights for policy 0, policy_version 94762 (0.0007) +[2023-10-09 08:05:03,293][60144] Updated weights for policy 1, policy_version 95852 (0.0007) +[2023-10-09 08:05:03,473][60143] Updated weights for policy 0, policy_version 94772 (0.0009) +[2023-10-09 08:05:03,664][60144] Updated weights for policy 1, policy_version 95862 (0.0007) +[2023-10-09 08:05:03,842][60143] Updated weights for policy 0, policy_version 94782 (0.0009) +[2023-10-09 08:05:04,032][60144] Updated weights for policy 1, policy_version 95872 (0.0009) +[2023-10-09 08:05:06,052][59242] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 195231744. Throughput: 0: 1691.5, 1: 1715.8. Samples: 48811250. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:06,053][59242] Avg episode reward: [(0, '33.630'), (1, '38.010')] +[2023-10-09 08:05:07,805][60143] Updated weights for policy 0, policy_version 94792 (0.0007) +[2023-10-09 08:05:07,809][60144] Updated weights for policy 1, policy_version 95882 (0.0007) +[2023-10-09 08:05:08,175][60143] Updated weights for policy 0, policy_version 94802 (0.0008) +[2023-10-09 08:05:08,176][60144] Updated weights for policy 1, policy_version 95892 (0.0010) +[2023-10-09 08:05:08,542][60144] Updated weights for policy 1, policy_version 95902 (0.0007) +[2023-10-09 08:05:08,546][60143] Updated weights for policy 0, policy_version 94812 (0.0007) +[2023-10-09 08:05:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 195297280. Throughput: 0: 1689.7, 1: 1703.9. Samples: 48831450. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:11,053][59242] Avg episode reward: [(0, '34.670'), (1, '37.320')] +[2023-10-09 08:05:12,495][60144] Updated weights for policy 1, policy_version 95912 (0.0009) +[2023-10-09 08:05:12,536][60143] Updated weights for policy 0, policy_version 94822 (0.0008) +[2023-10-09 08:05:12,868][60144] Updated weights for policy 1, policy_version 95922 (0.0007) +[2023-10-09 08:05:12,897][60143] Updated weights for policy 0, policy_version 94832 (0.0008) +[2023-10-09 08:05:13,227][60144] Updated weights for policy 1, policy_version 95932 (0.0008) +[2023-10-09 08:05:13,256][60143] Updated weights for policy 0, policy_version 94842 (0.0007) +[2023-10-09 08:05:16,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 195362816. Throughput: 0: 1709.1, 1: 1730.3. Samples: 48852936. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:16,053][59242] Avg episode reward: [(0, '34.080'), (1, '38.080')] +[2023-10-09 08:05:17,183][60143] Updated weights for policy 0, policy_version 94852 (0.0008) +[2023-10-09 08:05:17,259][60144] Updated weights for policy 1, policy_version 95942 (0.0009) +[2023-10-09 08:05:17,552][60143] Updated weights for policy 0, policy_version 94862 (0.0008) +[2023-10-09 08:05:17,629][60144] Updated weights for policy 1, policy_version 95952 (0.0008) +[2023-10-09 08:05:17,921][60143] Updated weights for policy 0, policy_version 94872 (0.0008) +[2023-10-09 08:05:18,002][60144] Updated weights for policy 1, policy_version 95962 (0.0008) +[2023-10-09 08:05:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 195428352. Throughput: 0: 1676.1, 1: 1709.3. Samples: 48862226. Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:21,053][59242] Avg episode reward: [(0, '34.370'), (1, '38.540')] +[2023-10-09 08:05:21,780][60143] Updated weights for policy 0, policy_version 94882 (0.0007) +[2023-10-09 08:05:22,049][60144] Updated weights for policy 1, policy_version 95972 (0.0009) +[2023-10-09 08:05:22,149][60143] Updated weights for policy 0, policy_version 94892 (0.0007) +[2023-10-09 08:05:22,412][60144] Updated weights for policy 1, policy_version 95982 (0.0009) +[2023-10-09 08:05:22,517][60143] Updated weights for policy 0, policy_version 94902 (0.0007) +[2023-10-09 08:05:22,771][60144] Updated weights for policy 1, policy_version 95992 (0.0008) +[2023-10-09 08:05:22,883][60143] Updated weights for policy 0, policy_version 94912 (0.0009) +[2023-10-09 08:05:26,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 195493888. Throughput: 0: 1701.0, 1: 1716.9. Samples: 48883396. 
Policy #0 lag: (min: 29.0, avg: 43.1, max: 61.0) +[2023-10-09 08:05:26,052][59242] Avg episode reward: [(0, '34.170'), (1, '36.720')] +[2023-10-09 08:05:26,688][60144] Updated weights for policy 1, policy_version 96002 (0.0010) +[2023-10-09 08:05:27,022][60143] Updated weights for policy 0, policy_version 94922 (0.0009) +[2023-10-09 08:05:27,056][60144] Updated weights for policy 1, policy_version 96012 (0.0009) +[2023-10-09 08:05:27,393][60143] Updated weights for policy 0, policy_version 94932 (0.0008) +[2023-10-09 08:05:27,412][60144] Updated weights for policy 1, policy_version 96022 (0.0008) +[2023-10-09 08:05:27,759][60143] Updated weights for policy 0, policy_version 94942 (0.0008) +[2023-10-09 08:05:27,776][60144] Updated weights for policy 1, policy_version 96032 (0.0007) +[2023-10-09 08:05:31,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 195559424. Throughput: 0: 1708.9, 1: 1735.9. Samples: 48904592. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:31,053][59242] Avg episode reward: [(0, '33.410'), (1, '36.610')] +[2023-10-09 08:05:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000094944_97222656.pth... +[2023-10-09 08:05:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000096032_98336768.pth... 
+[2023-10-09 08:05:31,096][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000094432_96698368.pth +[2023-10-09 08:05:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000093376_95617024.pth +[2023-10-09 08:05:31,747][60144] Updated weights for policy 1, policy_version 96042 (0.0009) +[2023-10-09 08:05:31,769][60143] Updated weights for policy 0, policy_version 94952 (0.0009) +[2023-10-09 08:05:32,116][60144] Updated weights for policy 1, policy_version 96052 (0.0008) +[2023-10-09 08:05:32,147][60143] Updated weights for policy 0, policy_version 94962 (0.0008) +[2023-10-09 08:05:32,483][60144] Updated weights for policy 1, policy_version 96062 (0.0008) +[2023-10-09 08:05:32,512][60143] Updated weights for policy 0, policy_version 94972 (0.0008) +[2023-10-09 08:05:36,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 195624960. Throughput: 0: 1688.1, 1: 1710.7. Samples: 48914014. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:36,053][59242] Avg episode reward: [(0, '34.170'), (1, '36.980')] +[2023-10-09 08:05:36,359][60144] Updated weights for policy 1, policy_version 96072 (0.0009) +[2023-10-09 08:05:36,648][60143] Updated weights for policy 0, policy_version 94982 (0.0009) +[2023-10-09 08:05:36,715][60144] Updated weights for policy 1, policy_version 96082 (0.0007) +[2023-10-09 08:05:37,022][60143] Updated weights for policy 0, policy_version 94992 (0.0008) +[2023-10-09 08:05:37,073][60144] Updated weights for policy 1, policy_version 96092 (0.0007) +[2023-10-09 08:05:37,383][60143] Updated weights for policy 0, policy_version 95002 (0.0008) +[2023-10-09 08:05:41,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 195690496. Throughput: 0: 1717.1, 1: 1739.8. Samples: 48935448. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:41,053][59242] Avg episode reward: [(0, '34.880'), (1, '36.440')] +[2023-10-09 08:05:41,192][60144] Updated weights for policy 1, policy_version 96102 (0.0007) +[2023-10-09 08:05:41,442][60143] Updated weights for policy 0, policy_version 95012 (0.0008) +[2023-10-09 08:05:41,550][60144] Updated weights for policy 1, policy_version 96112 (0.0007) +[2023-10-09 08:05:41,809][60143] Updated weights for policy 0, policy_version 95022 (0.0008) +[2023-10-09 08:05:41,925][60144] Updated weights for policy 1, policy_version 96122 (0.0008) +[2023-10-09 08:05:42,179][60143] Updated weights for policy 0, policy_version 95032 (0.0009) +[2023-10-09 08:05:45,823][60144] Updated weights for policy 1, policy_version 96132 (0.0009) +[2023-10-09 08:05:46,038][60143] Updated weights for policy 0, policy_version 95042 (0.0010) +[2023-10-09 08:05:46,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 195756032. Throughput: 0: 1717.8, 1: 1738.7. Samples: 48956608. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:46,053][59242] Avg episode reward: [(0, '35.120'), (1, '36.170')] +[2023-10-09 08:05:46,200][60144] Updated weights for policy 1, policy_version 96142 (0.0009) +[2023-10-09 08:05:46,410][60143] Updated weights for policy 0, policy_version 95052 (0.0007) +[2023-10-09 08:05:46,562][60144] Updated weights for policy 1, policy_version 96152 (0.0008) +[2023-10-09 08:05:46,770][60143] Updated weights for policy 0, policy_version 95062 (0.0007) +[2023-10-09 08:05:47,145][60143] Updated weights for policy 0, policy_version 95072 (0.0010) +[2023-10-09 08:05:50,517][60144] Updated weights for policy 1, policy_version 96162 (0.0008) +[2023-10-09 08:05:50,894][60144] Updated weights for policy 1, policy_version 96172 (0.0008) +[2023-10-09 08:05:51,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 195821568. 
Throughput: 0: 1706.3, 1: 1729.5. Samples: 48965862. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:51,052][59242] Avg episode reward: [(0, '34.510'), (1, '36.650')] +[2023-10-09 08:05:51,262][60144] Updated weights for policy 1, policy_version 96182 (0.0009) +[2023-10-09 08:05:51,284][60143] Updated weights for policy 0, policy_version 95082 (0.0008) +[2023-10-09 08:05:51,628][60144] Updated weights for policy 1, policy_version 96192 (0.0008) +[2023-10-09 08:05:51,654][60143] Updated weights for policy 0, policy_version 95092 (0.0008) +[2023-10-09 08:05:52,020][60143] Updated weights for policy 0, policy_version 95102 (0.0007) +[2023-10-09 08:05:55,433][60144] Updated weights for policy 1, policy_version 96202 (0.0008) +[2023-10-09 08:05:55,794][60144] Updated weights for policy 1, policy_version 96212 (0.0008) +[2023-10-09 08:05:55,978][60143] Updated weights for policy 0, policy_version 95112 (0.0008) +[2023-10-09 08:05:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13662.6). Total num frames: 195887104. Throughput: 0: 1718.2, 1: 1742.3. Samples: 48987172. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:05:56,053][59242] Avg episode reward: [(0, '35.530'), (1, '36.140')] +[2023-10-09 08:05:56,157][60144] Updated weights for policy 1, policy_version 96222 (0.0009) +[2023-10-09 08:05:56,350][60143] Updated weights for policy 0, policy_version 95122 (0.0009) +[2023-10-09 08:05:56,714][60143] Updated weights for policy 0, policy_version 95132 (0.0007) +[2023-10-09 08:06:00,165][60144] Updated weights for policy 1, policy_version 96232 (0.0009) +[2023-10-09 08:06:00,534][60144] Updated weights for policy 1, policy_version 96242 (0.0010) +[2023-10-09 08:06:00,711][60143] Updated weights for policy 0, policy_version 95142 (0.0009) +[2023-10-09 08:06:00,890][60144] Updated weights for policy 1, policy_version 96252 (0.0009) +[2023-10-09 08:06:01,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 195985408. Throughput: 0: 1709.8, 1: 1722.4. Samples: 49007384. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:01,053][59242] Avg episode reward: [(0, '35.760'), (1, '35.390')] +[2023-10-09 08:06:01,078][60143] Updated weights for policy 0, policy_version 95152 (0.0008) +[2023-10-09 08:06:01,446][60143] Updated weights for policy 0, policy_version 95162 (0.0007) +[2023-10-09 08:06:04,807][60144] Updated weights for policy 1, policy_version 96262 (0.0010) +[2023-10-09 08:06:05,168][60144] Updated weights for policy 1, policy_version 96272 (0.0010) +[2023-10-09 08:06:05,523][60143] Updated weights for policy 0, policy_version 95172 (0.0008) +[2023-10-09 08:06:05,539][60144] Updated weights for policy 1, policy_version 96282 (0.0007) +[2023-10-09 08:06:05,898][60143] Updated weights for policy 0, policy_version 95182 (0.0008) +[2023-10-09 08:06:06,052][59242] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 196050944. Throughput: 0: 1712.4, 1: 1743.2. Samples: 49017724. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:06,053][59242] Avg episode reward: [(0, '35.430'), (1, '34.490')] +[2023-10-09 08:06:06,265][60143] Updated weights for policy 0, policy_version 95192 (0.0011) +[2023-10-09 08:06:09,491][60144] Updated weights for policy 1, policy_version 96292 (0.0009) +[2023-10-09 08:06:09,851][60144] Updated weights for policy 1, policy_version 96302 (0.0007) +[2023-10-09 08:06:10,212][60144] Updated weights for policy 1, policy_version 96312 (0.0008) +[2023-10-09 08:06:10,252][60143] Updated weights for policy 0, policy_version 95202 (0.0008) +[2023-10-09 08:06:10,615][60143] Updated weights for policy 0, policy_version 95212 (0.0008) +[2023-10-09 08:06:10,981][60143] Updated weights for policy 0, policy_version 95222 (0.0008) +[2023-10-09 08:06:11,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 196116480. Throughput: 0: 1708.6, 1: 1740.8. Samples: 49038622. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:11,053][59242] Avg episode reward: [(0, '35.620'), (1, '33.120')] +[2023-10-09 08:06:11,356][60143] Updated weights for policy 0, policy_version 95232 (0.0008) +[2023-10-09 08:06:14,020][60144] Updated weights for policy 1, policy_version 96322 (0.0008) +[2023-10-09 08:06:14,395][60144] Updated weights for policy 1, policy_version 96332 (0.0010) +[2023-10-09 08:06:14,750][60144] Updated weights for policy 1, policy_version 96342 (0.0009) +[2023-10-09 08:06:15,101][60143] Updated weights for policy 0, policy_version 95242 (0.0009) +[2023-10-09 08:06:15,111][60144] Updated weights for policy 1, policy_version 96352 (0.0007) +[2023-10-09 08:06:15,461][60143] Updated weights for policy 0, policy_version 95252 (0.0011) +[2023-10-09 08:06:15,832][60143] Updated weights for policy 0, policy_version 95262 (0.0010) +[2023-10-09 08:06:16,052][59242] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 196214784. 
Throughput: 0: 1696.9, 1: 1719.8. Samples: 49058342. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:16,053][59242] Avg episode reward: [(0, '35.490'), (1, '35.150')] +[2023-10-09 08:06:19,113][60144] Updated weights for policy 1, policy_version 96362 (0.0009) +[2023-10-09 08:06:19,477][60144] Updated weights for policy 1, policy_version 96372 (0.0007) +[2023-10-09 08:06:19,824][60143] Updated weights for policy 0, policy_version 95272 (0.0008) +[2023-10-09 08:06:19,847][60144] Updated weights for policy 1, policy_version 96382 (0.0008) +[2023-10-09 08:06:20,197][60143] Updated weights for policy 0, policy_version 95282 (0.0007) +[2023-10-09 08:06:20,563][60143] Updated weights for policy 0, policy_version 95292 (0.0009) +[2023-10-09 08:06:21,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 196280320. Throughput: 0: 1720.1, 1: 1744.2. Samples: 49069910. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:21,052][59242] Avg episode reward: [(0, '34.830'), (1, '34.170')] +[2023-10-09 08:06:23,742][60144] Updated weights for policy 1, policy_version 96392 (0.0009) +[2023-10-09 08:06:24,110][60144] Updated weights for policy 1, policy_version 96402 (0.0011) +[2023-10-09 08:06:24,481][60144] Updated weights for policy 1, policy_version 96412 (0.0009) +[2023-10-09 08:06:24,642][60143] Updated weights for policy 0, policy_version 95302 (0.0008) +[2023-10-09 08:06:25,009][60143] Updated weights for policy 0, policy_version 95312 (0.0010) +[2023-10-09 08:06:25,384][60143] Updated weights for policy 0, policy_version 95322 (0.0009) +[2023-10-09 08:06:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 196345856. Throughput: 0: 1713.3, 1: 1718.0. Samples: 49089856. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-09 08:06:26,053][59242] Avg episode reward: [(0, '34.320'), (1, '33.440')] +[2023-10-09 08:06:28,337][60144] Updated weights for policy 1, policy_version 96422 (0.0008) +[2023-10-09 08:06:28,707][60144] Updated weights for policy 1, policy_version 96432 (0.0008) +[2023-10-09 08:06:29,072][60144] Updated weights for policy 1, policy_version 96442 (0.0007) +[2023-10-09 08:06:29,354][60143] Updated weights for policy 0, policy_version 95332 (0.0010) +[2023-10-09 08:06:29,734][60143] Updated weights for policy 0, policy_version 95342 (0.0011) +[2023-10-09 08:06:30,092][60143] Updated weights for policy 0, policy_version 95352 (0.0010) +[2023-10-09 08:06:31,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 196411392. Throughput: 0: 1689.4, 1: 1720.7. Samples: 49110062. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:31,053][59242] Avg episode reward: [(0, '36.050'), (1, '33.250')] +[2023-10-09 08:06:33,093][60144] Updated weights for policy 1, policy_version 96452 (0.0008) +[2023-10-09 08:06:33,456][60144] Updated weights for policy 1, policy_version 96462 (0.0009) +[2023-10-09 08:06:33,824][60144] Updated weights for policy 1, policy_version 96472 (0.0009) +[2023-10-09 08:06:33,926][60143] Updated weights for policy 0, policy_version 95362 (0.0007) +[2023-10-09 08:06:34,294][60143] Updated weights for policy 0, policy_version 95372 (0.0007) +[2023-10-09 08:06:34,666][60143] Updated weights for policy 0, policy_version 95382 (0.0008) +[2023-10-09 08:06:35,038][60143] Updated weights for policy 0, policy_version 95392 (0.0007) +[2023-10-09 08:06:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 196476928. Throughput: 0: 1722.4, 1: 1734.0. Samples: 49121404. 
Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:36,053][59242] Avg episode reward: [(0, '35.540'), (1, '32.500')] +[2023-10-09 08:06:37,635][60144] Updated weights for policy 1, policy_version 96482 (0.0009) +[2023-10-09 08:06:37,994][60144] Updated weights for policy 1, policy_version 96492 (0.0009) +[2023-10-09 08:06:38,353][60144] Updated weights for policy 1, policy_version 96502 (0.0008) +[2023-10-09 08:06:38,728][60144] Updated weights for policy 1, policy_version 96512 (0.0008) +[2023-10-09 08:06:38,980][60143] Updated weights for policy 0, policy_version 95402 (0.0009) +[2023-10-09 08:06:39,351][60143] Updated weights for policy 0, policy_version 95412 (0.0011) +[2023-10-09 08:06:39,717][60143] Updated weights for policy 0, policy_version 95422 (0.0007) +[2023-10-09 08:06:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 196542464. Throughput: 0: 1707.5, 1: 1724.1. Samples: 49141596. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:41,053][59242] Avg episode reward: [(0, '36.480'), (1, '32.410')] +[2023-10-09 08:06:42,898][60144] Updated weights for policy 1, policy_version 96522 (0.0009) +[2023-10-09 08:06:43,256][60144] Updated weights for policy 1, policy_version 96532 (0.0008) +[2023-10-09 08:06:43,521][60143] Updated weights for policy 0, policy_version 95432 (0.0007) +[2023-10-09 08:06:43,616][60144] Updated weights for policy 1, policy_version 96542 (0.0007) +[2023-10-09 08:06:43,898][60143] Updated weights for policy 0, policy_version 95442 (0.0008) +[2023-10-09 08:06:44,263][60143] Updated weights for policy 0, policy_version 95452 (0.0008) +[2023-10-09 08:06:46,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 196608000. Throughput: 0: 1710.7, 1: 1742.5. Samples: 49162780. 
Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:46,052][59242] Avg episode reward: [(0, '34.470'), (1, '33.430')] +[2023-10-09 08:06:47,416][60144] Updated weights for policy 1, policy_version 96552 (0.0007) +[2023-10-09 08:06:47,772][60144] Updated weights for policy 1, policy_version 96562 (0.0011) +[2023-10-09 08:06:48,137][60144] Updated weights for policy 1, policy_version 96572 (0.0009) +[2023-10-09 08:06:48,186][60143] Updated weights for policy 0, policy_version 95462 (0.0010) +[2023-10-09 08:06:48,541][60143] Updated weights for policy 0, policy_version 95472 (0.0007) +[2023-10-09 08:06:48,903][60143] Updated weights for policy 0, policy_version 95482 (0.0008) +[2023-10-09 08:06:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 196673536. Throughput: 0: 1726.7, 1: 1719.3. Samples: 49172794. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:51,053][59242] Avg episode reward: [(0, '34.560'), (1, '33.200')] +[2023-10-09 08:06:52,157][60144] Updated weights for policy 1, policy_version 96582 (0.0009) +[2023-10-09 08:06:52,522][60144] Updated weights for policy 1, policy_version 96592 (0.0008) +[2023-10-09 08:06:52,849][60143] Updated weights for policy 0, policy_version 95492 (0.0007) +[2023-10-09 08:06:52,883][60144] Updated weights for policy 1, policy_version 96602 (0.0009) +[2023-10-09 08:06:53,221][60143] Updated weights for policy 0, policy_version 95502 (0.0009) +[2023-10-09 08:06:53,583][60143] Updated weights for policy 0, policy_version 95512 (0.0007) +[2023-10-09 08:06:56,052][59242] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 196739072. Throughput: 0: 1710.9, 1: 1726.6. Samples: 49193308. 
Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:06:56,053][59242] Avg episode reward: [(0, '34.850'), (1, '32.820')] +[2023-10-09 08:06:56,722][60144] Updated weights for policy 1, policy_version 96612 (0.0008) +[2023-10-09 08:06:57,094][60144] Updated weights for policy 1, policy_version 96622 (0.0008) +[2023-10-09 08:06:57,471][60144] Updated weights for policy 1, policy_version 96632 (0.0008) +[2023-10-09 08:06:57,634][60143] Updated weights for policy 0, policy_version 95522 (0.0007) +[2023-10-09 08:06:58,010][60143] Updated weights for policy 0, policy_version 95532 (0.0009) +[2023-10-09 08:06:58,375][60143] Updated weights for policy 0, policy_version 95542 (0.0009) +[2023-10-09 08:06:58,741][60143] Updated weights for policy 0, policy_version 95552 (0.0011) +[2023-10-09 08:07:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 196804608. Throughput: 0: 1718.7, 1: 1753.2. Samples: 49214576. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:01,053][59242] Avg episode reward: [(0, '35.650'), (1, '33.060')] +[2023-10-09 08:07:01,282][60144] Updated weights for policy 1, policy_version 96642 (0.0008) +[2023-10-09 08:07:01,660][60144] Updated weights for policy 1, policy_version 96652 (0.0007) +[2023-10-09 08:07:02,028][60144] Updated weights for policy 1, policy_version 96662 (0.0009) +[2023-10-09 08:07:02,401][60144] Updated weights for policy 1, policy_version 96672 (0.0008) +[2023-10-09 08:07:02,760][60143] Updated weights for policy 0, policy_version 95562 (0.0008) +[2023-10-09 08:07:03,140][60143] Updated weights for policy 0, policy_version 95572 (0.0008) +[2023-10-09 08:07:03,502][60143] Updated weights for policy 0, policy_version 95582 (0.0007) +[2023-10-09 08:07:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 196870144. Throughput: 0: 1700.8, 1: 1723.6. Samples: 49224008. 
Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:06,053][59242] Avg episode reward: [(0, '35.820'), (1, '32.520')] +[2023-10-09 08:07:06,346][60144] Updated weights for policy 1, policy_version 96682 (0.0007) +[2023-10-09 08:07:06,718][60144] Updated weights for policy 1, policy_version 96692 (0.0007) +[2023-10-09 08:07:07,086][60144] Updated weights for policy 1, policy_version 96702 (0.0009) +[2023-10-09 08:07:07,506][60143] Updated weights for policy 0, policy_version 95592 (0.0011) +[2023-10-09 08:07:07,878][60143] Updated weights for policy 0, policy_version 95602 (0.0008) +[2023-10-09 08:07:08,252][60143] Updated weights for policy 0, policy_version 95612 (0.0007) +[2023-10-09 08:07:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 196935680. Throughput: 0: 1696.8, 1: 1753.2. Samples: 49245102. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:11,053][59242] Avg episode reward: [(0, '34.800'), (1, '32.480')] +[2023-10-09 08:07:11,060][60144] Updated weights for policy 1, policy_version 96712 (0.0008) +[2023-10-09 08:07:11,416][60144] Updated weights for policy 1, policy_version 96722 (0.0007) +[2023-10-09 08:07:11,787][60144] Updated weights for policy 1, policy_version 96732 (0.0007) +[2023-10-09 08:07:12,376][60143] Updated weights for policy 0, policy_version 95622 (0.0008) +[2023-10-09 08:07:12,763][60143] Updated weights for policy 0, policy_version 95632 (0.0008) +[2023-10-09 08:07:13,131][60143] Updated weights for policy 0, policy_version 95642 (0.0009) +[2023-10-09 08:07:15,618][60144] Updated weights for policy 1, policy_version 96742 (0.0008) +[2023-10-09 08:07:15,983][60144] Updated weights for policy 1, policy_version 96752 (0.0007) +[2023-10-09 08:07:16,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 197001216. Throughput: 0: 1716.1, 1: 1744.9. Samples: 49265810. 
Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:16,052][59242] Avg episode reward: [(0, '34.800'), (1, '32.470')] +[2023-10-09 08:07:16,348][60144] Updated weights for policy 1, policy_version 96762 (0.0007) +[2023-10-09 08:07:17,090][60143] Updated weights for policy 0, policy_version 95652 (0.0009) +[2023-10-09 08:07:17,458][60143] Updated weights for policy 0, policy_version 95662 (0.0008) +[2023-10-09 08:07:17,821][60143] Updated weights for policy 0, policy_version 95672 (0.0007) +[2023-10-09 08:07:20,272][60144] Updated weights for policy 1, policy_version 96772 (0.0007) +[2023-10-09 08:07:20,645][60144] Updated weights for policy 1, policy_version 96782 (0.0007) +[2023-10-09 08:07:21,003][60144] Updated weights for policy 1, policy_version 96792 (0.0010) +[2023-10-09 08:07:21,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 197066752. Throughput: 0: 1688.0, 1: 1742.3. Samples: 49275764. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:21,052][59242] Avg episode reward: [(0, '34.460'), (1, '32.940')] +[2023-10-09 08:07:21,773][60143] Updated weights for policy 0, policy_version 95682 (0.0008) +[2023-10-09 08:07:22,139][60143] Updated weights for policy 0, policy_version 95692 (0.0011) +[2023-10-09 08:07:22,510][60143] Updated weights for policy 0, policy_version 95702 (0.0011) +[2023-10-09 08:07:22,878][60143] Updated weights for policy 0, policy_version 95712 (0.0008) +[2023-10-09 08:07:24,862][60144] Updated weights for policy 1, policy_version 96802 (0.0010) +[2023-10-09 08:07:25,219][60144] Updated weights for policy 1, policy_version 96812 (0.0007) +[2023-10-09 08:07:25,586][60144] Updated weights for policy 1, policy_version 96822 (0.0008) +[2023-10-09 08:07:25,952][60144] Updated weights for policy 1, policy_version 96832 (0.0009) +[2023-10-09 08:07:26,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 197165056. 
Throughput: 0: 1702.3, 1: 1754.4. Samples: 49297146. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:26,052][59242] Avg episode reward: [(0, '34.260'), (1, '32.880')] +[2023-10-09 08:07:26,998][60143] Updated weights for policy 0, policy_version 95722 (0.0009) +[2023-10-09 08:07:27,372][60143] Updated weights for policy 0, policy_version 95732 (0.0010) +[2023-10-09 08:07:27,749][60143] Updated weights for policy 0, policy_version 95742 (0.0009) +[2023-10-09 08:07:29,865][60144] Updated weights for policy 1, policy_version 96842 (0.0008) +[2023-10-09 08:07:30,243][60144] Updated weights for policy 1, policy_version 96852 (0.0008) +[2023-10-09 08:07:30,599][60144] Updated weights for policy 1, policy_version 96862 (0.0009) +[2023-10-09 08:07:31,052][59242] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 197230592. Throughput: 0: 1703.0, 1: 1723.1. Samples: 49316954. Policy #0 lag: (min: 10.0, avg: 20.5, max: 42.0) +[2023-10-09 08:07:31,053][59242] Avg episode reward: [(0, '34.840'), (1, '33.270')] +[2023-10-09 08:07:31,060][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth... +[2023-10-09 08:07:31,060][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000096864_99188736.pth... 
+[2023-10-09 08:07:31,089][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000094144_96403456.pth +[2023-10-09 08:07:31,093][59934] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p0/milestones/checkpoint_000095744_98041856.pth +[2023-10-09 08:07:31,101][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000095232_97517568.pth +[2023-10-09 08:07:31,106][60003] Saving a milestone ./train_atari/atari_beamrider_APPO/checkpoint_p1/milestones/checkpoint_000096864_99188736.pth +[2023-10-09 08:07:31,693][60143] Updated weights for policy 0, policy_version 95752 (0.0008) +[2023-10-09 08:07:32,072][60143] Updated weights for policy 0, policy_version 95762 (0.0010) +[2023-10-09 08:07:32,443][60143] Updated weights for policy 0, policy_version 95772 (0.0009) +[2023-10-09 08:07:34,405][60144] Updated weights for policy 1, policy_version 96872 (0.0009) +[2023-10-09 08:07:34,774][60144] Updated weights for policy 1, policy_version 96882 (0.0008) +[2023-10-09 08:07:35,132][60144] Updated weights for policy 1, policy_version 96892 (0.0007) +[2023-10-09 08:07:36,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 197296128. Throughput: 0: 1683.9, 1: 1754.5. Samples: 49327520. 
Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:07:36,053][59242] Avg episode reward: [(0, '34.710'), (1, '35.210')] +[2023-10-09 08:07:36,413][60143] Updated weights for policy 0, policy_version 95782 (0.0009) +[2023-10-09 08:07:36,776][60143] Updated weights for policy 0, policy_version 95792 (0.0007) +[2023-10-09 08:07:37,149][60143] Updated weights for policy 0, policy_version 95802 (0.0007) +[2023-10-09 08:07:39,012][60144] Updated weights for policy 1, policy_version 96902 (0.0008) +[2023-10-09 08:07:39,379][60144] Updated weights for policy 1, policy_version 96912 (0.0008) +[2023-10-09 08:07:39,737][60144] Updated weights for policy 1, policy_version 96922 (0.0007) +[2023-10-09 08:07:41,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 197361664. Throughput: 0: 1702.8, 1: 1736.1. Samples: 49348058. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:07:41,052][59242] Avg episode reward: [(0, '35.040'), (1, '35.690')] +[2023-10-09 08:07:41,296][60143] Updated weights for policy 0, policy_version 95812 (0.0009) +[2023-10-09 08:07:41,665][60143] Updated weights for policy 0, policy_version 95822 (0.0009) +[2023-10-09 08:07:42,035][60143] Updated weights for policy 0, policy_version 95832 (0.0007) +[2023-10-09 08:07:43,825][60144] Updated weights for policy 1, policy_version 96932 (0.0008) +[2023-10-09 08:07:44,190][60144] Updated weights for policy 1, policy_version 96942 (0.0008) +[2023-10-09 08:07:44,563][60144] Updated weights for policy 1, policy_version 96952 (0.0009) +[2023-10-09 08:07:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 197427200. Throughput: 0: 1706.9, 1: 1716.5. Samples: 49368630. 
Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:07:46,053][59242] Avg episode reward: [(0, '34.330'), (1, '37.460')] +[2023-10-09 08:07:46,063][60143] Updated weights for policy 0, policy_version 95842 (0.0009) +[2023-10-09 08:07:46,439][60143] Updated weights for policy 0, policy_version 95852 (0.0009) +[2023-10-09 08:07:46,804][60143] Updated weights for policy 0, policy_version 95862 (0.0007) +[2023-10-09 08:07:47,180][60143] Updated weights for policy 0, policy_version 95872 (0.0007) +[2023-10-09 08:07:48,416][60144] Updated weights for policy 1, policy_version 96962 (0.0008) +[2023-10-09 08:07:48,785][60144] Updated weights for policy 1, policy_version 96972 (0.0008) +[2023-10-09 08:07:49,147][60144] Updated weights for policy 1, policy_version 96982 (0.0010) +[2023-10-09 08:07:49,515][60144] Updated weights for policy 1, policy_version 96992 (0.0008) +[2023-10-09 08:07:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 197492736. Throughput: 0: 1700.3, 1: 1744.5. Samples: 49379024. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:07:51,053][59242] Avg episode reward: [(0, '34.800'), (1, '35.170')] +[2023-10-09 08:07:51,223][60143] Updated weights for policy 0, policy_version 95882 (0.0009) +[2023-10-09 08:07:51,599][60143] Updated weights for policy 0, policy_version 95892 (0.0010) +[2023-10-09 08:07:51,958][60143] Updated weights for policy 0, policy_version 95902 (0.0010) +[2023-10-09 08:07:53,538][60144] Updated weights for policy 1, policy_version 97002 (0.0007) +[2023-10-09 08:07:53,906][60144] Updated weights for policy 1, policy_version 97012 (0.0010) +[2023-10-09 08:07:54,276][60144] Updated weights for policy 1, policy_version 97022 (0.0009) +[2023-10-09 08:07:55,949][60143] Updated weights for policy 0, policy_version 95912 (0.0011) +[2023-10-09 08:07:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 197558272. 
Throughput: 0: 1704.8, 1: 1715.6. Samples: 49399024. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:07:56,053][59242] Avg episode reward: [(0, '32.540'), (1, '36.320')] +[2023-10-09 08:07:56,322][60143] Updated weights for policy 0, policy_version 95922 (0.0011) +[2023-10-09 08:07:56,695][60143] Updated weights for policy 0, policy_version 95932 (0.0009) +[2023-10-09 08:07:58,096][60144] Updated weights for policy 1, policy_version 97032 (0.0008) +[2023-10-09 08:07:58,471][60144] Updated weights for policy 1, policy_version 97042 (0.0008) +[2023-10-09 08:07:58,834][60144] Updated weights for policy 1, policy_version 97052 (0.0008) +[2023-10-09 08:08:00,668][60143] Updated weights for policy 0, policy_version 95942 (0.0008) +[2023-10-09 08:08:01,037][60143] Updated weights for policy 0, policy_version 95952 (0.0009) +[2023-10-09 08:08:01,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 197623808. Throughput: 0: 1709.1, 1: 1722.3. Samples: 49420222. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:01,053][59242] Avg episode reward: [(0, '32.500'), (1, '36.540')] +[2023-10-09 08:08:01,410][60143] Updated weights for policy 0, policy_version 95962 (0.0008) +[2023-10-09 08:08:02,784][60144] Updated weights for policy 1, policy_version 97062 (0.0009) +[2023-10-09 08:08:03,155][60144] Updated weights for policy 1, policy_version 97072 (0.0010) +[2023-10-09 08:08:03,517][60144] Updated weights for policy 1, policy_version 97082 (0.0009) +[2023-10-09 08:08:05,404][60143] Updated weights for policy 0, policy_version 95972 (0.0009) +[2023-10-09 08:08:05,774][60143] Updated weights for policy 0, policy_version 95982 (0.0008) +[2023-10-09 08:08:06,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 197689344. Throughput: 0: 1705.9, 1: 1716.5. Samples: 49429770. 
Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:06,052][59242] Avg episode reward: [(0, '32.840'), (1, '35.980')] +[2023-10-09 08:08:06,138][60143] Updated weights for policy 0, policy_version 95992 (0.0009) +[2023-10-09 08:08:07,446][60144] Updated weights for policy 1, policy_version 97092 (0.0008) +[2023-10-09 08:08:07,817][60144] Updated weights for policy 1, policy_version 97102 (0.0009) +[2023-10-09 08:08:08,181][60144] Updated weights for policy 1, policy_version 97112 (0.0009) +[2023-10-09 08:08:10,215][60143] Updated weights for policy 0, policy_version 96002 (0.0007) +[2023-10-09 08:08:10,578][60143] Updated weights for policy 0, policy_version 96012 (0.0007) +[2023-10-09 08:08:10,947][60143] Updated weights for policy 0, policy_version 96022 (0.0007) +[2023-10-09 08:08:11,052][59242] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 197754880. Throughput: 0: 1708.4, 1: 1708.8. Samples: 49450920. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:11,053][59242] Avg episode reward: [(0, '32.540'), (1, '35.840')] +[2023-10-09 08:08:11,314][60143] Updated weights for policy 0, policy_version 96032 (0.0009) +[2023-10-09 08:08:12,063][60144] Updated weights for policy 1, policy_version 97122 (0.0007) +[2023-10-09 08:08:12,432][60144] Updated weights for policy 1, policy_version 97132 (0.0008) +[2023-10-09 08:08:12,803][60144] Updated weights for policy 1, policy_version 97142 (0.0010) +[2023-10-09 08:08:13,172][60144] Updated weights for policy 1, policy_version 97152 (0.0007) +[2023-10-09 08:08:15,235][60143] Updated weights for policy 0, policy_version 96042 (0.0008) +[2023-10-09 08:08:15,611][60143] Updated weights for policy 0, policy_version 96052 (0.0009) +[2023-10-09 08:08:15,979][60143] Updated weights for policy 0, policy_version 96062 (0.0008) +[2023-10-09 08:08:16,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 197853184. 
Throughput: 0: 1696.0, 1: 1744.0. Samples: 49471750. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:16,052][59242] Avg episode reward: [(0, '32.370'), (1, '34.760')] +[2023-10-09 08:08:17,170][60144] Updated weights for policy 1, policy_version 97162 (0.0007) +[2023-10-09 08:08:17,539][60144] Updated weights for policy 1, policy_version 97172 (0.0008) +[2023-10-09 08:08:17,917][60144] Updated weights for policy 1, policy_version 97182 (0.0008) +[2023-10-09 08:08:19,993][60143] Updated weights for policy 0, policy_version 96072 (0.0008) +[2023-10-09 08:08:20,356][60143] Updated weights for policy 0, policy_version 96082 (0.0010) +[2023-10-09 08:08:20,730][60143] Updated weights for policy 0, policy_version 96092 (0.0010) +[2023-10-09 08:08:21,052][59242] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 197918720. Throughput: 0: 1714.2, 1: 1711.2. Samples: 49481662. Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:21,053][59242] Avg episode reward: [(0, '31.980'), (1, '34.980')] +[2023-10-09 08:08:21,660][60144] Updated weights for policy 1, policy_version 97192 (0.0009) +[2023-10-09 08:08:22,028][60144] Updated weights for policy 1, policy_version 97202 (0.0008) +[2023-10-09 08:08:22,399][60144] Updated weights for policy 1, policy_version 97212 (0.0008) +[2023-10-09 08:08:24,794][60143] Updated weights for policy 0, policy_version 96102 (0.0011) +[2023-10-09 08:08:25,167][60143] Updated weights for policy 0, policy_version 96112 (0.0008) +[2023-10-09 08:08:25,532][60143] Updated weights for policy 0, policy_version 96122 (0.0008) +[2023-10-09 08:08:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 197984256. Throughput: 0: 1709.1, 1: 1732.2. Samples: 49502918. 
Policy #0 lag: (min: 0.0, avg: 24.1, max: 32.0) +[2023-10-09 08:08:26,052][59242] Avg episode reward: [(0, '35.400'), (1, '35.410')] +[2023-10-09 08:08:26,351][60144] Updated weights for policy 1, policy_version 97222 (0.0007) +[2023-10-09 08:08:26,714][60144] Updated weights for policy 1, policy_version 97232 (0.0009) +[2023-10-09 08:08:27,078][60144] Updated weights for policy 1, policy_version 97242 (0.0010) +[2023-10-09 08:08:29,465][60143] Updated weights for policy 0, policy_version 96132 (0.0007) +[2023-10-09 08:08:29,839][60143] Updated weights for policy 0, policy_version 96142 (0.0008) +[2023-10-09 08:08:30,206][60143] Updated weights for policy 0, policy_version 96152 (0.0009) +[2023-10-09 08:08:31,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198049792. Throughput: 0: 1679.9, 1: 1745.9. Samples: 49522792. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:31,053][59242] Avg episode reward: [(0, '34.800'), (1, '34.910')] +[2023-10-09 08:08:31,093][60144] Updated weights for policy 1, policy_version 97252 (0.0009) +[2023-10-09 08:08:31,461][60144] Updated weights for policy 1, policy_version 97262 (0.0008) +[2023-10-09 08:08:31,836][60144] Updated weights for policy 1, policy_version 97272 (0.0009) +[2023-10-09 08:08:34,129][60143] Updated weights for policy 0, policy_version 96162 (0.0009) +[2023-10-09 08:08:34,498][60143] Updated weights for policy 0, policy_version 96172 (0.0008) +[2023-10-09 08:08:34,858][60143] Updated weights for policy 0, policy_version 96182 (0.0007) +[2023-10-09 08:08:35,225][60143] Updated weights for policy 0, policy_version 96192 (0.0009) +[2023-10-09 08:08:35,802][60144] Updated weights for policy 1, policy_version 97282 (0.0008) +[2023-10-09 08:08:36,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198115328. Throughput: 0: 1712.0, 1: 1719.8. Samples: 49533456. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:36,053][59242] Avg episode reward: [(0, '33.860'), (1, '34.610')] +[2023-10-09 08:08:36,177][60144] Updated weights for policy 1, policy_version 97292 (0.0009) +[2023-10-09 08:08:36,535][60144] Updated weights for policy 1, policy_version 97302 (0.0010) +[2023-10-09 08:08:36,903][60144] Updated weights for policy 1, policy_version 97312 (0.0007) +[2023-10-09 08:08:39,157][60143] Updated weights for policy 0, policy_version 96202 (0.0008) +[2023-10-09 08:08:39,528][60143] Updated weights for policy 0, policy_version 96212 (0.0008) +[2023-10-09 08:08:39,896][60143] Updated weights for policy 0, policy_version 96222 (0.0007) +[2023-10-09 08:08:40,897][60144] Updated weights for policy 1, policy_version 97322 (0.0008) +[2023-10-09 08:08:41,052][59242] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198180864. Throughput: 0: 1697.2, 1: 1743.7. Samples: 49553864. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:41,052][59242] Avg episode reward: [(0, '34.250'), (1, '34.840')] +[2023-10-09 08:08:41,262][60144] Updated weights for policy 1, policy_version 97332 (0.0010) +[2023-10-09 08:08:41,622][60144] Updated weights for policy 1, policy_version 97342 (0.0007) +[2023-10-09 08:08:43,808][60143] Updated weights for policy 0, policy_version 96232 (0.0008) +[2023-10-09 08:08:44,177][60143] Updated weights for policy 0, policy_version 96242 (0.0009) +[2023-10-09 08:08:44,554][60143] Updated weights for policy 0, policy_version 96252 (0.0008) +[2023-10-09 08:08:45,634][60144] Updated weights for policy 1, policy_version 97352 (0.0009) +[2023-10-09 08:08:46,007][60144] Updated weights for policy 1, policy_version 97362 (0.0009) +[2023-10-09 08:08:46,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198246400. Throughput: 0: 1689.5, 1: 1733.8. Samples: 49574272. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:46,053][59242] Avg episode reward: [(0, '32.690'), (1, '34.680')] +[2023-10-09 08:08:46,375][60144] Updated weights for policy 1, policy_version 97372 (0.0009) +[2023-10-09 08:08:48,716][60143] Updated weights for policy 0, policy_version 96262 (0.0007) +[2023-10-09 08:08:49,090][60143] Updated weights for policy 0, policy_version 96272 (0.0007) +[2023-10-09 08:08:49,459][60143] Updated weights for policy 0, policy_version 96282 (0.0007) +[2023-10-09 08:08:50,274][60144] Updated weights for policy 1, policy_version 97382 (0.0008) +[2023-10-09 08:08:50,637][60144] Updated weights for policy 1, policy_version 97392 (0.0010) +[2023-10-09 08:08:51,009][60144] Updated weights for policy 1, policy_version 97402 (0.0008) +[2023-10-09 08:08:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198311936. Throughput: 0: 1718.2, 1: 1736.6. Samples: 49585234. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:51,053][59242] Avg episode reward: [(0, '31.950'), (1, '34.050')] +[2023-10-09 08:08:53,317][60143] Updated weights for policy 0, policy_version 96292 (0.0008) +[2023-10-09 08:08:53,686][60143] Updated weights for policy 0, policy_version 96302 (0.0008) +[2023-10-09 08:08:54,055][60143] Updated weights for policy 0, policy_version 96312 (0.0010) +[2023-10-09 08:08:54,827][60144] Updated weights for policy 1, policy_version 97412 (0.0009) +[2023-10-09 08:08:55,199][60144] Updated weights for policy 1, policy_version 97422 (0.0008) +[2023-10-09 08:08:55,564][60144] Updated weights for policy 1, policy_version 97432 (0.0008) +[2023-10-09 08:08:56,052][59242] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 198410240. Throughput: 0: 1689.7, 1: 1744.3. Samples: 49605450. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:08:56,053][59242] Avg episode reward: [(0, '32.480'), (1, '32.410')] +[2023-10-09 08:08:58,084][60143] Updated weights for policy 0, policy_version 96322 (0.0010) +[2023-10-09 08:08:58,457][60143] Updated weights for policy 0, policy_version 96332 (0.0007) +[2023-10-09 08:08:58,817][60143] Updated weights for policy 0, policy_version 96342 (0.0007) +[2023-10-09 08:08:59,194][60143] Updated weights for policy 0, policy_version 96352 (0.0008) +[2023-10-09 08:08:59,448][60144] Updated weights for policy 1, policy_version 97442 (0.0008) +[2023-10-09 08:08:59,811][60144] Updated weights for policy 1, policy_version 97452 (0.0007) +[2023-10-09 08:09:00,176][60144] Updated weights for policy 1, policy_version 97462 (0.0009) +[2023-10-09 08:09:00,548][60144] Updated weights for policy 1, policy_version 97472 (0.0011) +[2023-10-09 08:09:01,052][59242] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13884.7). Total num frames: 198475776. Throughput: 0: 1706.0, 1: 1711.0. Samples: 49625514. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:01,052][59242] Avg episode reward: [(0, '31.360'), (1, '32.370')] +[2023-10-09 08:09:03,178][60143] Updated weights for policy 0, policy_version 96362 (0.0007) +[2023-10-09 08:09:03,555][60143] Updated weights for policy 0, policy_version 96372 (0.0008) +[2023-10-09 08:09:03,917][60143] Updated weights for policy 0, policy_version 96382 (0.0007) +[2023-10-09 08:09:04,805][60144] Updated weights for policy 1, policy_version 97482 (0.0009) +[2023-10-09 08:09:05,179][60144] Updated weights for policy 1, policy_version 97492 (0.0010) +[2023-10-09 08:09:05,545][60144] Updated weights for policy 1, policy_version 97502 (0.0010) +[2023-10-09 08:09:06,052][59242] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 198541312. Throughput: 0: 1702.2, 1: 1741.8. Samples: 49636640. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:06,052][59242] Avg episode reward: [(0, '30.720'), (1, '33.050')] +[2023-10-09 08:09:07,864][60143] Updated weights for policy 0, policy_version 96392 (0.0008) +[2023-10-09 08:09:08,231][60143] Updated weights for policy 0, policy_version 96402 (0.0011) +[2023-10-09 08:09:08,601][60143] Updated weights for policy 0, policy_version 96412 (0.0010) +[2023-10-09 08:09:09,504][60144] Updated weights for policy 1, policy_version 97512 (0.0008) +[2023-10-09 08:09:09,872][60144] Updated weights for policy 1, policy_version 97522 (0.0008) +[2023-10-09 08:09:10,241][60144] Updated weights for policy 1, policy_version 97532 (0.0009) +[2023-10-09 08:09:11,052][59242] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 198606848. Throughput: 0: 1696.5, 1: 1721.8. Samples: 49656744. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:11,052][59242] Avg episode reward: [(0, '31.130'), (1, '31.800')] +[2023-10-09 08:09:12,543][60143] Updated weights for policy 0, policy_version 96422 (0.0008) +[2023-10-09 08:09:12,920][60143] Updated weights for policy 0, policy_version 96432 (0.0009) +[2023-10-09 08:09:13,295][60143] Updated weights for policy 0, policy_version 96442 (0.0008) +[2023-10-09 08:09:13,970][60144] Updated weights for policy 1, policy_version 97542 (0.0008) +[2023-10-09 08:09:14,335][60144] Updated weights for policy 1, policy_version 97552 (0.0010) +[2023-10-09 08:09:14,700][60144] Updated weights for policy 1, policy_version 97562 (0.0009) +[2023-10-09 08:09:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 198672384. Throughput: 0: 1730.4, 1: 1701.5. Samples: 49677224. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:16,052][59242] Avg episode reward: [(0, '31.590'), (1, '31.520')] +[2023-10-09 08:09:17,232][60143] Updated weights for policy 0, policy_version 96452 (0.0008) +[2023-10-09 08:09:17,596][60143] Updated weights for policy 0, policy_version 96462 (0.0008) +[2023-10-09 08:09:17,974][60143] Updated weights for policy 0, policy_version 96472 (0.0008) +[2023-10-09 08:09:18,746][60144] Updated weights for policy 1, policy_version 97572 (0.0007) +[2023-10-09 08:09:19,110][60144] Updated weights for policy 1, policy_version 97582 (0.0007) +[2023-10-09 08:09:19,484][60144] Updated weights for policy 1, policy_version 97592 (0.0008) +[2023-10-09 08:09:21,052][59242] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 198737920. Throughput: 0: 1699.4, 1: 1730.5. Samples: 49687802. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:21,053][59242] Avg episode reward: [(0, '31.320'), (1, '32.050')] +[2023-10-09 08:09:22,038][60143] Updated weights for policy 0, policy_version 96482 (0.0010) +[2023-10-09 08:09:22,411][60143] Updated weights for policy 0, policy_version 96492 (0.0008) +[2023-10-09 08:09:22,782][60143] Updated weights for policy 0, policy_version 96502 (0.0007) +[2023-10-09 08:09:23,150][60143] Updated weights for policy 0, policy_version 96512 (0.0007) +[2023-10-09 08:09:23,435][60144] Updated weights for policy 1, policy_version 97602 (0.0007) +[2023-10-09 08:09:23,803][60144] Updated weights for policy 1, policy_version 97612 (0.0008) +[2023-10-09 08:09:24,177][60144] Updated weights for policy 1, policy_version 97622 (0.0010) +[2023-10-09 08:09:24,537][60144] Updated weights for policy 1, policy_version 97632 (0.0009) +[2023-10-09 08:09:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 198803456. Throughput: 0: 1716.3, 1: 1704.4. Samples: 49707796. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:26,052][59242] Avg episode reward: [(0, '31.820'), (1, '31.570')] +[2023-10-09 08:09:27,076][60143] Updated weights for policy 0, policy_version 96522 (0.0007) +[2023-10-09 08:09:27,450][60143] Updated weights for policy 0, policy_version 96532 (0.0008) +[2023-10-09 08:09:27,818][60143] Updated weights for policy 0, policy_version 96542 (0.0008) +[2023-10-09 08:09:28,559][60144] Updated weights for policy 1, policy_version 97642 (0.0011) +[2023-10-09 08:09:28,923][60144] Updated weights for policy 1, policy_version 97652 (0.0008) +[2023-10-09 08:09:29,291][60144] Updated weights for policy 1, policy_version 97662 (0.0007) +[2023-10-09 08:09:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 198868992. Throughput: 0: 1726.3, 1: 1708.7. Samples: 49728846. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-09 08:09:31,053][59242] Avg episode reward: [(0, '31.320'), (1, '33.300')] +[2023-10-09 08:09:31,062][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth... +[2023-10-09 08:09:31,062][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000096544_98861056.pth... 
+[2023-10-09 08:09:31,116][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000094944_97222656.pth +[2023-10-09 08:09:31,116][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000096032_98336768.pth +[2023-10-09 08:09:31,829][60143] Updated weights for policy 0, policy_version 96552 (0.0008) +[2023-10-09 08:09:32,209][60143] Updated weights for policy 0, policy_version 96562 (0.0008) +[2023-10-09 08:09:32,575][60143] Updated weights for policy 0, policy_version 96572 (0.0009) +[2023-10-09 08:09:33,177][60144] Updated weights for policy 1, policy_version 97672 (0.0007) +[2023-10-09 08:09:33,540][60144] Updated weights for policy 1, policy_version 97682 (0.0007) +[2023-10-09 08:09:33,910][60144] Updated weights for policy 1, policy_version 97692 (0.0008) +[2023-10-09 08:09:36,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 198934528. Throughput: 0: 1694.8, 1: 1715.0. Samples: 49738676. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:09:36,053][59242] Avg episode reward: [(0, '31.380'), (1, '33.320')] +[2023-10-09 08:09:36,645][60143] Updated weights for policy 0, policy_version 96582 (0.0009) +[2023-10-09 08:09:37,023][60143] Updated weights for policy 0, policy_version 96592 (0.0010) +[2023-10-09 08:09:37,399][60143] Updated weights for policy 0, policy_version 96602 (0.0008) +[2023-10-09 08:09:37,797][60144] Updated weights for policy 1, policy_version 97702 (0.0011) +[2023-10-09 08:09:38,165][60144] Updated weights for policy 1, policy_version 97712 (0.0009) +[2023-10-09 08:09:38,529][60144] Updated weights for policy 1, policy_version 97722 (0.0010) +[2023-10-09 08:09:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 199000064. Throughput: 0: 1719.4, 1: 1698.9. Samples: 49759270. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:09:41,053][59242] Avg episode reward: [(0, '32.590'), (1, '33.750')] +[2023-10-09 08:09:41,291][60143] Updated weights for policy 0, policy_version 96612 (0.0009) +[2023-10-09 08:09:41,664][60143] Updated weights for policy 0, policy_version 96622 (0.0008) +[2023-10-09 08:09:42,021][60143] Updated weights for policy 0, policy_version 96632 (0.0008) +[2023-10-09 08:09:42,589][60144] Updated weights for policy 1, policy_version 97732 (0.0008) +[2023-10-09 08:09:42,967][60144] Updated weights for policy 1, policy_version 97742 (0.0009) +[2023-10-09 08:09:43,339][60144] Updated weights for policy 1, policy_version 97752 (0.0007) +[2023-10-09 08:09:45,956][60143] Updated weights for policy 0, policy_version 96642 (0.0009) +[2023-10-09 08:09:46,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 199065600. Throughput: 0: 1720.4, 1: 1725.0. Samples: 49780556. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:09:46,052][59242] Avg episode reward: [(0, '32.890'), (1, '34.430')] +[2023-10-09 08:09:46,325][60143] Updated weights for policy 0, policy_version 96652 (0.0007) +[2023-10-09 08:09:46,694][60143] Updated weights for policy 0, policy_version 96662 (0.0008) +[2023-10-09 08:09:47,060][60143] Updated weights for policy 0, policy_version 96672 (0.0007) +[2023-10-09 08:09:47,305][60144] Updated weights for policy 1, policy_version 97762 (0.0008) +[2023-10-09 08:09:47,669][60144] Updated weights for policy 1, policy_version 97772 (0.0007) +[2023-10-09 08:09:48,034][60144] Updated weights for policy 1, policy_version 97782 (0.0011) +[2023-10-09 08:09:48,397][60144] Updated weights for policy 1, policy_version 97792 (0.0010) +[2023-10-09 08:09:51,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 199131136. Throughput: 0: 1707.6, 1: 1699.8. Samples: 49789974. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:09:51,052][59242] Avg episode reward: [(0, '31.390'), (1, '34.050')] +[2023-10-09 08:09:51,064][60143] Updated weights for policy 0, policy_version 96682 (0.0007) +[2023-10-09 08:09:51,428][60143] Updated weights for policy 0, policy_version 96692 (0.0007) +[2023-10-09 08:09:51,803][60143] Updated weights for policy 0, policy_version 96702 (0.0008) +[2023-10-09 08:09:52,361][60144] Updated weights for policy 1, policy_version 97802 (0.0010) +[2023-10-09 08:09:52,738][60144] Updated weights for policy 1, policy_version 97812 (0.0007) +[2023-10-09 08:09:53,110][60144] Updated weights for policy 1, policy_version 97822 (0.0007) +[2023-10-09 08:09:55,539][60143] Updated weights for policy 0, policy_version 96712 (0.0007) +[2023-10-09 08:09:55,904][60143] Updated weights for policy 0, policy_version 96722 (0.0009) +[2023-10-09 08:09:56,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13662.6). Total num frames: 199196672. Throughput: 0: 1721.2, 1: 1712.3. Samples: 49811254. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:09:56,052][59242] Avg episode reward: [(0, '32.140'), (1, '33.470')] +[2023-10-09 08:09:56,281][60143] Updated weights for policy 0, policy_version 96732 (0.0009) +[2023-10-09 08:09:57,056][60144] Updated weights for policy 1, policy_version 97832 (0.0007) +[2023-10-09 08:09:57,427][60144] Updated weights for policy 1, policy_version 97842 (0.0007) +[2023-10-09 08:09:57,799][60144] Updated weights for policy 1, policy_version 97852 (0.0008) +[2023-10-09 08:10:00,365][60143] Updated weights for policy 0, policy_version 96742 (0.0007) +[2023-10-09 08:10:00,736][60143] Updated weights for policy 0, policy_version 96752 (0.0009) +[2023-10-09 08:10:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 199262208. Throughput: 0: 1711.8, 1: 1729.5. Samples: 49832082. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:01,053][59242] Avg episode reward: [(0, '32.010'), (1, '33.740')] +[2023-10-09 08:10:01,100][60143] Updated weights for policy 0, policy_version 96762 (0.0010) +[2023-10-09 08:10:01,849][60144] Updated weights for policy 1, policy_version 97862 (0.0009) +[2023-10-09 08:10:02,203][60144] Updated weights for policy 1, policy_version 97872 (0.0008) +[2023-10-09 08:10:02,572][60144] Updated weights for policy 1, policy_version 97882 (0.0007) +[2023-10-09 08:10:05,080][60143] Updated weights for policy 0, policy_version 96772 (0.0009) +[2023-10-09 08:10:05,455][60143] Updated weights for policy 0, policy_version 96782 (0.0009) +[2023-10-09 08:10:05,819][60143] Updated weights for policy 0, policy_version 96792 (0.0008) +[2023-10-09 08:10:06,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 199327744. Throughput: 0: 1722.9, 1: 1701.7. Samples: 49841906. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:06,053][59242] Avg episode reward: [(0, '32.290'), (1, '34.060')] +[2023-10-09 08:10:06,365][60144] Updated weights for policy 1, policy_version 97892 (0.0007) +[2023-10-09 08:10:06,742][60144] Updated weights for policy 1, policy_version 97902 (0.0007) +[2023-10-09 08:10:07,113][60144] Updated weights for policy 1, policy_version 97912 (0.0008) +[2023-10-09 08:10:09,760][60143] Updated weights for policy 0, policy_version 96802 (0.0007) +[2023-10-09 08:10:10,132][60143] Updated weights for policy 0, policy_version 96812 (0.0009) +[2023-10-09 08:10:10,506][60143] Updated weights for policy 0, policy_version 96822 (0.0009) +[2023-10-09 08:10:10,873][60143] Updated weights for policy 0, policy_version 96832 (0.0007) +[2023-10-09 08:10:11,052][59242] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 199426048. Throughput: 0: 1725.3, 1: 1726.9. Samples: 49863148. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:11,053][59242] Avg episode reward: [(0, '31.710'), (1, '34.300')] +[2023-10-09 08:10:11,158][60144] Updated weights for policy 1, policy_version 97922 (0.0008) +[2023-10-09 08:10:11,522][60144] Updated weights for policy 1, policy_version 97932 (0.0009) +[2023-10-09 08:10:11,894][60144] Updated weights for policy 1, policy_version 97942 (0.0007) +[2023-10-09 08:10:12,263][60144] Updated weights for policy 1, policy_version 97952 (0.0007) +[2023-10-09 08:10:14,849][60143] Updated weights for policy 0, policy_version 96842 (0.0007) +[2023-10-09 08:10:15,222][60143] Updated weights for policy 0, policy_version 96852 (0.0009) +[2023-10-09 08:10:15,594][60143] Updated weights for policy 0, policy_version 96862 (0.0008) +[2023-10-09 08:10:16,052][59242] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 199491584. Throughput: 0: 1700.6, 1: 1734.1. Samples: 49883408. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:16,053][59242] Avg episode reward: [(0, '31.220'), (1, '34.690')] +[2023-10-09 08:10:16,122][60144] Updated weights for policy 1, policy_version 97962 (0.0010) +[2023-10-09 08:10:16,479][60144] Updated weights for policy 1, policy_version 97972 (0.0011) +[2023-10-09 08:10:16,845][60144] Updated weights for policy 1, policy_version 97982 (0.0010) +[2023-10-09 08:10:19,525][60143] Updated weights for policy 0, policy_version 96872 (0.0007) +[2023-10-09 08:10:19,894][60143] Updated weights for policy 0, policy_version 96882 (0.0008) +[2023-10-09 08:10:20,271][60143] Updated weights for policy 0, policy_version 96892 (0.0008) +[2023-10-09 08:10:20,873][60144] Updated weights for policy 1, policy_version 97992 (0.0011) +[2023-10-09 08:10:21,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 199557120. Throughput: 0: 1727.9, 1: 1718.6. Samples: 49893766. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:21,053][59242] Avg episode reward: [(0, '30.990'), (1, '36.640')] +[2023-10-09 08:10:21,240][60144] Updated weights for policy 1, policy_version 98002 (0.0010) +[2023-10-09 08:10:21,606][60144] Updated weights for policy 1, policy_version 98012 (0.0008) +[2023-10-09 08:10:24,197][60143] Updated weights for policy 0, policy_version 96902 (0.0010) +[2023-10-09 08:10:24,576][60143] Updated weights for policy 0, policy_version 96912 (0.0011) +[2023-10-09 08:10:24,949][60143] Updated weights for policy 0, policy_version 96922 (0.0009) +[2023-10-09 08:10:25,508][60144] Updated weights for policy 1, policy_version 98022 (0.0009) +[2023-10-09 08:10:25,877][60144] Updated weights for policy 1, policy_version 98032 (0.0008) +[2023-10-09 08:10:26,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 199622656. Throughput: 0: 1717.1, 1: 1730.5. Samples: 49914412. Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:26,053][59242] Avg episode reward: [(0, '31.610'), (1, '36.470')] +[2023-10-09 08:10:26,247][60144] Updated weights for policy 1, policy_version 98042 (0.0010) +[2023-10-09 08:10:28,820][60143] Updated weights for policy 0, policy_version 96932 (0.0010) +[2023-10-09 08:10:29,186][60143] Updated weights for policy 0, policy_version 96942 (0.0008) +[2023-10-09 08:10:29,555][60143] Updated weights for policy 0, policy_version 96952 (0.0007) +[2023-10-09 08:10:30,425][60144] Updated weights for policy 1, policy_version 98052 (0.0010) +[2023-10-09 08:10:30,796][60144] Updated weights for policy 1, policy_version 98062 (0.0008) +[2023-10-09 08:10:31,053][59242] Fps is (10 sec: 13106.6, 60 sec: 13653.2, 300 sec: 13773.6). Total num frames: 199688192. Throughput: 0: 1703.1, 1: 1718.0. Samples: 49934508. 
Policy #0 lag: (min: 6.0, avg: 6.2, max: 15.0) +[2023-10-09 08:10:31,054][59242] Avg episode reward: [(0, '31.330'), (1, '37.120')] +[2023-10-09 08:10:31,166][60144] Updated weights for policy 1, policy_version 98072 (0.0007) +[2023-10-09 08:10:33,391][60143] Updated weights for policy 0, policy_version 96962 (0.0008) +[2023-10-09 08:10:33,763][60143] Updated weights for policy 0, policy_version 96972 (0.0009) +[2023-10-09 08:10:34,133][60143] Updated weights for policy 0, policy_version 96982 (0.0012) +[2023-10-09 08:10:34,505][60143] Updated weights for policy 0, policy_version 96992 (0.0009) +[2023-10-09 08:10:35,067][60144] Updated weights for policy 1, policy_version 98082 (0.0009) +[2023-10-09 08:10:35,429][60144] Updated weights for policy 1, policy_version 98092 (0.0010) +[2023-10-09 08:10:35,803][60144] Updated weights for policy 1, policy_version 98102 (0.0008) +[2023-10-09 08:10:36,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 199753728. Throughput: 0: 1731.3, 1: 1726.2. Samples: 49945560. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:10:36,052][59242] Avg episode reward: [(0, '33.810'), (1, '37.360')] +[2023-10-09 08:10:36,165][60144] Updated weights for policy 1, policy_version 98112 (0.0007) +[2023-10-09 08:10:38,707][60143] Updated weights for policy 0, policy_version 97002 (0.0007) +[2023-10-09 08:10:39,073][60143] Updated weights for policy 0, policy_version 97012 (0.0010) +[2023-10-09 08:10:39,436][60143] Updated weights for policy 0, policy_version 97022 (0.0009) +[2023-10-09 08:10:40,230][60144] Updated weights for policy 1, policy_version 98122 (0.0007) +[2023-10-09 08:10:40,604][60144] Updated weights for policy 1, policy_version 98132 (0.0008) +[2023-10-09 08:10:40,973][60144] Updated weights for policy 1, policy_version 98142 (0.0008) +[2023-10-09 08:10:41,052][59242] Fps is (10 sec: 16384.8, 60 sec: 14199.5, 300 sec: 13884.8). Total num frames: 199852032. 
Throughput: 0: 1700.8, 1: 1732.0. Samples: 49965728. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:10:41,053][59242] Avg episode reward: [(0, '33.180'), (1, '37.420')] +[2023-10-09 08:10:43,450][60143] Updated weights for policy 0, policy_version 97032 (0.0008) +[2023-10-09 08:10:43,818][60143] Updated weights for policy 0, policy_version 97042 (0.0008) +[2023-10-09 08:10:44,189][60143] Updated weights for policy 0, policy_version 97052 (0.0009) +[2023-10-09 08:10:44,829][60144] Updated weights for policy 1, policy_version 98152 (0.0008) +[2023-10-09 08:10:45,195][60144] Updated weights for policy 1, policy_version 98162 (0.0009) +[2023-10-09 08:10:45,561][60144] Updated weights for policy 1, policy_version 98172 (0.0008) +[2023-10-09 08:10:46,052][59242] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13884.7). Total num frames: 199917568. Throughput: 0: 1707.1, 1: 1707.2. Samples: 49985726. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:10:46,053][59242] Avg episode reward: [(0, '33.330'), (1, '36.040')] +[2023-10-09 08:10:48,206][60143] Updated weights for policy 0, policy_version 97062 (0.0008) +[2023-10-09 08:10:48,575][60143] Updated weights for policy 0, policy_version 97072 (0.0008) +[2023-10-09 08:10:48,938][60143] Updated weights for policy 0, policy_version 97082 (0.0009) +[2023-10-09 08:10:49,514][60144] Updated weights for policy 1, policy_version 98182 (0.0008) +[2023-10-09 08:10:49,883][60144] Updated weights for policy 1, policy_version 98192 (0.0010) +[2023-10-09 08:10:50,244][60144] Updated weights for policy 1, policy_version 98202 (0.0008) +[2023-10-09 08:10:51,052][59242] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13884.8). Total num frames: 199983104. Throughput: 0: 1712.8, 1: 1729.4. Samples: 49996804. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:10:51,053][59242] Avg episode reward: [(0, '33.570'), (1, '37.090')] +[2023-10-09 08:10:52,889][60143] Updated weights for policy 0, policy_version 97092 (0.0009) +[2023-10-09 08:10:53,266][60143] Updated weights for policy 0, policy_version 97102 (0.0008) +[2023-10-09 08:10:53,633][60143] Updated weights for policy 0, policy_version 97112 (0.0009) +[2023-10-09 08:10:54,150][60144] Updated weights for policy 1, policy_version 98212 (0.0008) +[2023-10-09 08:10:54,521][60144] Updated weights for policy 1, policy_version 98222 (0.0007) +[2023-10-09 08:10:54,889][60144] Updated weights for policy 1, policy_version 98232 (0.0007) +[2023-10-09 08:10:56,052][59242] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 200048640. Throughput: 0: 1696.2, 1: 1719.6. Samples: 50016858. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:10:56,053][59242] Avg episode reward: [(0, '33.950'), (1, '36.940')] +[2023-10-09 08:10:57,638][60143] Updated weights for policy 0, policy_version 97122 (0.0008) +[2023-10-09 08:10:58,008][60143] Updated weights for policy 0, policy_version 97132 (0.0007) +[2023-10-09 08:10:58,369][60143] Updated weights for policy 0, policy_version 97142 (0.0007) +[2023-10-09 08:10:58,738][60143] Updated weights for policy 0, policy_version 97152 (0.0009) +[2023-10-09 08:10:58,766][60144] Updated weights for policy 1, policy_version 98242 (0.0008) +[2023-10-09 08:10:59,129][60144] Updated weights for policy 1, policy_version 98252 (0.0007) +[2023-10-09 08:10:59,496][60144] Updated weights for policy 1, policy_version 98262 (0.0008) +[2023-10-09 08:10:59,866][60144] Updated weights for policy 1, policy_version 98272 (0.0009) +[2023-10-09 08:11:01,052][59242] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13773.7). Total num frames: 200114176. Throughput: 0: 1725.9, 1: 1703.1. Samples: 50037710. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:01,053][59242] Avg episode reward: [(0, '34.970'), (1, '36.390')] +[2023-10-09 08:11:02,772][60143] Updated weights for policy 0, policy_version 97162 (0.0008) +[2023-10-09 08:11:03,140][60143] Updated weights for policy 0, policy_version 97172 (0.0007) +[2023-10-09 08:11:03,498][60143] Updated weights for policy 0, policy_version 97182 (0.0008) +[2023-10-09 08:11:03,813][60144] Updated weights for policy 1, policy_version 98282 (0.0008) +[2023-10-09 08:11:04,178][60144] Updated weights for policy 1, policy_version 98292 (0.0008) +[2023-10-09 08:11:04,555][60144] Updated weights for policy 1, policy_version 98302 (0.0007) +[2023-10-09 08:11:06,052][59242] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 200179712. Throughput: 0: 1700.9, 1: 1730.0. Samples: 50048156. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:06,053][59242] Avg episode reward: [(0, '36.800'), (1, '36.130')] +[2023-10-09 08:11:07,500][60143] Updated weights for policy 0, policy_version 97192 (0.0009) +[2023-10-09 08:11:07,861][60143] Updated weights for policy 0, policy_version 97202 (0.0010) +[2023-10-09 08:11:08,224][60143] Updated weights for policy 0, policy_version 97212 (0.0009) +[2023-10-09 08:11:08,404][60144] Updated weights for policy 1, policy_version 98312 (0.0008) +[2023-10-09 08:11:08,774][60144] Updated weights for policy 1, policy_version 98322 (0.0008) +[2023-10-09 08:11:09,132][60144] Updated weights for policy 1, policy_version 98332 (0.0007) +[2023-10-09 08:11:11,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 200245248. Throughput: 0: 1710.6, 1: 1704.0. Samples: 50068072. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:11,053][59242] Avg episode reward: [(0, '37.550'), (1, '35.970')] +[2023-10-09 08:11:12,240][60143] Updated weights for policy 0, policy_version 97222 (0.0008) +[2023-10-09 08:11:12,623][60143] Updated weights for policy 0, policy_version 97232 (0.0010) +[2023-10-09 08:11:12,998][60143] Updated weights for policy 0, policy_version 97242 (0.0008) +[2023-10-09 08:11:13,020][60144] Updated weights for policy 1, policy_version 98342 (0.0008) +[2023-10-09 08:11:13,379][60144] Updated weights for policy 1, policy_version 98352 (0.0008) +[2023-10-09 08:11:13,742][60144] Updated weights for policy 1, policy_version 98362 (0.0007) +[2023-10-09 08:11:16,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 200310784. Throughput: 0: 1718.5, 1: 1721.7. Samples: 50089316. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:16,053][59242] Avg episode reward: [(0, '36.130'), (1, '36.450')] +[2023-10-09 08:11:16,940][60143] Updated weights for policy 0, policy_version 97252 (0.0007) +[2023-10-09 08:11:17,297][60143] Updated weights for policy 0, policy_version 97262 (0.0008) +[2023-10-09 08:11:17,672][60143] Updated weights for policy 0, policy_version 97272 (0.0007) +[2023-10-09 08:11:17,693][60144] Updated weights for policy 1, policy_version 98372 (0.0008) +[2023-10-09 08:11:18,065][60144] Updated weights for policy 1, policy_version 98382 (0.0007) +[2023-10-09 08:11:18,428][60144] Updated weights for policy 1, policy_version 98392 (0.0008) +[2023-10-09 08:11:21,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 200376320. Throughput: 0: 1688.7, 1: 1718.8. Samples: 50098902. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:21,053][59242] Avg episode reward: [(0, '35.780'), (1, '36.160')] +[2023-10-09 08:11:21,571][60143] Updated weights for policy 0, policy_version 97282 (0.0008) +[2023-10-09 08:11:21,944][60143] Updated weights for policy 0, policy_version 97292 (0.0010) +[2023-10-09 08:11:22,313][60143] Updated weights for policy 0, policy_version 97302 (0.0009) +[2023-10-09 08:11:22,345][60144] Updated weights for policy 1, policy_version 98402 (0.0007) +[2023-10-09 08:11:22,686][60143] Updated weights for policy 0, policy_version 97312 (0.0008) +[2023-10-09 08:11:22,714][60144] Updated weights for policy 1, policy_version 98412 (0.0008) +[2023-10-09 08:11:23,082][60144] Updated weights for policy 1, policy_version 98422 (0.0007) +[2023-10-09 08:11:23,447][60144] Updated weights for policy 1, policy_version 98432 (0.0008) +[2023-10-09 08:11:26,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 200441856. Throughput: 0: 1714.1, 1: 1714.8. Samples: 50120028. Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:26,053][59242] Avg episode reward: [(0, '34.860'), (1, '35.440')] +[2023-10-09 08:11:26,848][60143] Updated weights for policy 0, policy_version 97322 (0.0011) +[2023-10-09 08:11:27,233][60143] Updated weights for policy 0, policy_version 97332 (0.0009) +[2023-10-09 08:11:27,595][60143] Updated weights for policy 0, policy_version 97342 (0.0007) +[2023-10-09 08:11:27,633][60144] Updated weights for policy 1, policy_version 98442 (0.0007) +[2023-10-09 08:11:28,002][60144] Updated weights for policy 1, policy_version 98452 (0.0007) +[2023-10-09 08:11:28,378][60144] Updated weights for policy 1, policy_version 98462 (0.0009) +[2023-10-09 08:11:31,052][59242] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13662.6). Total num frames: 200507392. Throughput: 0: 1714.5, 1: 1744.1. Samples: 50141362. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:31,053][59242] Avg episode reward: [(0, '34.190'), (1, '36.200')] +[2023-10-09 08:11:31,061][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000098464_100827136.pth... +[2023-10-09 08:11:31,061][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000097344_99680256.pth... +[2023-10-09 08:11:31,102][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000096864_99188736.pth +[2023-10-09 08:11:31,102][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth +[2023-10-09 08:11:31,532][60143] Updated weights for policy 0, policy_version 97352 (0.0010) +[2023-10-09 08:11:31,907][60143] Updated weights for policy 0, policy_version 97362 (0.0010) +[2023-10-09 08:11:32,276][60143] Updated weights for policy 0, policy_version 97372 (0.0008) +[2023-10-09 08:11:32,357][60144] Updated weights for policy 1, policy_version 98472 (0.0008) +[2023-10-09 08:11:32,731][60144] Updated weights for policy 1, policy_version 98482 (0.0008) +[2023-10-09 08:11:33,094][60144] Updated weights for policy 1, policy_version 98492 (0.0008) +[2023-10-09 08:11:36,052][59242] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13662.6). Total num frames: 200572928. Throughput: 0: 1700.1, 1: 1719.6. Samples: 50150694. 
Policy #0 lag: (min: 5.0, avg: 11.4, max: 37.0) +[2023-10-09 08:11:36,053][59242] Avg episode reward: [(0, '34.530'), (1, '35.690')] +[2023-10-09 08:11:36,163][60143] Updated weights for policy 0, policy_version 97382 (0.0007) +[2023-10-09 08:11:36,530][60143] Updated weights for policy 0, policy_version 97392 (0.0007) +[2023-10-09 08:11:36,895][60143] Updated weights for policy 0, policy_version 97402 (0.0009) +[2023-10-09 08:11:36,935][60144] Updated weights for policy 1, policy_version 98502 (0.0009) +[2023-10-09 08:11:37,306][60144] Updated weights for policy 1, policy_version 98512 (0.0008) +[2023-10-09 08:11:37,660][60144] Updated weights for policy 1, policy_version 98522 (0.0008) +[2023-10-09 08:11:41,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 200638464. Throughput: 0: 1712.9, 1: 1724.1. Samples: 50171522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:11:41,053][59242] Avg episode reward: [(0, '34.100'), (1, '34.990')] +[2023-10-09 08:11:41,054][60143] Updated weights for policy 0, policy_version 97412 (0.0008) +[2023-10-09 08:11:41,423][60143] Updated weights for policy 0, policy_version 97422 (0.0008) +[2023-10-09 08:11:41,617][60144] Updated weights for policy 1, policy_version 98532 (0.0007) +[2023-10-09 08:11:41,790][60143] Updated weights for policy 0, policy_version 97432 (0.0008) +[2023-10-09 08:11:41,977][60144] Updated weights for policy 1, policy_version 98542 (0.0007) +[2023-10-09 08:11:42,341][60144] Updated weights for policy 1, policy_version 98552 (0.0008) +[2023-10-09 08:11:45,729][60143] Updated weights for policy 0, policy_version 97442 (0.0008) +[2023-10-09 08:11:46,052][59242] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 200704000. Throughput: 0: 1708.9, 1: 1737.0. Samples: 50192776. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:11:46,053][59242] Avg episode reward: [(0, '34.380'), (1, '35.910')] +[2023-10-09 08:11:46,095][60143] Updated weights for policy 0, policy_version 97452 (0.0010) +[2023-10-09 08:11:46,374][60144] Updated weights for policy 1, policy_version 98562 (0.0007) +[2023-10-09 08:11:46,462][60143] Updated weights for policy 0, policy_version 97462 (0.0010) +[2023-10-09 08:11:46,732][60144] Updated weights for policy 1, policy_version 98572 (0.0007) +[2023-10-09 08:11:46,836][60143] Updated weights for policy 0, policy_version 97472 (0.0007) +[2023-10-09 08:11:47,103][60144] Updated weights for policy 1, policy_version 98582 (0.0008) +[2023-10-09 08:11:47,475][60144] Updated weights for policy 1, policy_version 98592 (0.0009) +[2023-10-09 08:11:50,919][60143] Updated weights for policy 0, policy_version 97482 (0.0010) +[2023-10-09 08:11:51,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 200769536. Throughput: 0: 1710.0, 1: 1710.4. Samples: 50202072. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:11:51,052][59242] Avg episode reward: [(0, '35.890'), (1, '36.950')] +[2023-10-09 08:11:51,278][60143] Updated weights for policy 0, policy_version 97492 (0.0009) +[2023-10-09 08:11:51,486][60144] Updated weights for policy 1, policy_version 98602 (0.0007) +[2023-10-09 08:11:51,647][60143] Updated weights for policy 0, policy_version 97502 (0.0008) +[2023-10-09 08:11:51,856][60144] Updated weights for policy 1, policy_version 98612 (0.0007) +[2023-10-09 08:11:52,231][60144] Updated weights for policy 1, policy_version 98622 (0.0007) +[2023-10-09 08:11:55,638][60143] Updated weights for policy 0, policy_version 97512 (0.0008) +[2023-10-09 08:11:56,005][60143] Updated weights for policy 0, policy_version 97522 (0.0008) +[2023-10-09 08:11:56,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 200835072. 
Throughput: 0: 1709.9, 1: 1735.0. Samples: 50223092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:11:56,053][59242] Avg episode reward: [(0, '36.630'), (1, '37.440')] +[2023-10-09 08:11:56,198][60144] Updated weights for policy 1, policy_version 98632 (0.0009) +[2023-10-09 08:11:56,369][60143] Updated weights for policy 0, policy_version 97532 (0.0009) +[2023-10-09 08:11:56,561][60144] Updated weights for policy 1, policy_version 98642 (0.0008) +[2023-10-09 08:11:56,924][60144] Updated weights for policy 1, policy_version 98652 (0.0009) +[2023-10-09 08:12:00,325][60143] Updated weights for policy 0, policy_version 97542 (0.0009) +[2023-10-09 08:12:00,704][60143] Updated weights for policy 0, policy_version 97552 (0.0009) +[2023-10-09 08:12:00,726][60144] Updated weights for policy 1, policy_version 98662 (0.0009) +[2023-10-09 08:12:01,052][59242] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13662.6). Total num frames: 200900608. Throughput: 0: 1705.0, 1: 1732.7. Samples: 50244012. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:12:01,053][59242] Avg episode reward: [(0, '35.610'), (1, '36.250')] +[2023-10-09 08:12:01,078][60143] Updated weights for policy 0, policy_version 97562 (0.0009) +[2023-10-09 08:12:01,093][60144] Updated weights for policy 1, policy_version 98672 (0.0010) +[2023-10-09 08:12:01,462][60144] Updated weights for policy 1, policy_version 98682 (0.0009) +[2023-10-09 08:12:04,964][60143] Updated weights for policy 0, policy_version 97572 (0.0008) +[2023-10-09 08:12:05,330][60143] Updated weights for policy 0, policy_version 97582 (0.0008) +[2023-10-09 08:12:05,409][60144] Updated weights for policy 1, policy_version 98692 (0.0008) +[2023-10-09 08:12:05,702][60143] Updated weights for policy 0, policy_version 97592 (0.0008) +[2023-10-09 08:12:05,771][60144] Updated weights for policy 1, policy_version 98702 (0.0007) +[2023-10-09 08:12:06,052][59242] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13773.7). Total num frames: 200998912. Throughput: 0: 1721.2, 1: 1725.1. Samples: 50253982. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:12:06,053][59242] Avg episode reward: [(0, '36.000'), (1, '36.630')] +[2023-10-09 08:12:06,131][60144] Updated weights for policy 1, policy_version 98712 (0.0007) +[2023-10-09 08:12:09,751][60143] Updated weights for policy 0, policy_version 97602 (0.0008) +[2023-10-09 08:12:10,124][60143] Updated weights for policy 0, policy_version 97612 (0.0010) +[2023-10-09 08:12:10,232][60144] Updated weights for policy 1, policy_version 98722 (0.0009) +[2023-10-09 08:12:10,491][60143] Updated weights for policy 0, policy_version 97622 (0.0007) +[2023-10-09 08:12:10,593][60144] Updated weights for policy 1, policy_version 98732 (0.0008) +[2023-10-09 08:12:10,861][60143] Updated weights for policy 0, policy_version 97632 (0.0007) +[2023-10-09 08:12:10,957][60144] Updated weights for policy 1, policy_version 98742 (0.0009) +[2023-10-09 08:12:11,052][59242] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 201064448. Throughput: 0: 1717.3, 1: 1724.9. Samples: 50274928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:12:11,052][59242] Avg episode reward: [(0, '35.210'), (1, '36.380')] +[2023-10-09 08:12:11,326][60144] Updated weights for policy 1, policy_version 98752 (0.0010) +[2023-10-09 08:12:14,889][60143] Updated weights for policy 0, policy_version 97642 (0.0010) +[2023-10-09 08:12:15,261][60143] Updated weights for policy 0, policy_version 97652 (0.0008) +[2023-10-09 08:12:15,412][60144] Updated weights for policy 1, policy_version 98762 (0.0008) +[2023-10-09 08:12:15,627][60143] Updated weights for policy 0, policy_version 97662 (0.0008) +[2023-10-09 08:12:15,786][60144] Updated weights for policy 1, policy_version 98772 (0.0007) +[2023-10-09 08:12:16,052][59242] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13773.7). Total num frames: 201129984. Throughput: 0: 1688.6, 1: 1711.3. Samples: 50294358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-09 08:12:16,052][59242] Avg episode reward: [(0, '35.630'), (1, '36.380')] +[2023-10-09 08:12:16,144][60144] Updated weights for policy 1, policy_version 98782 (0.0007) +[2023-10-09 08:12:16,217][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000098784_101154816.pth... +[2023-10-09 08:12:16,218][60184] Stopping RolloutWorker_w4... +[2023-10-09 08:12:16,218][60188] Stopping RolloutWorker_w10... +[2023-10-09 08:12:16,218][60179] Stopping RolloutWorker_w2... +[2023-10-09 08:12:16,218][60184] Loop rollout_proc4_evt_loop terminating... +[2023-10-09 08:12:16,218][59934] Stopping Batcher_0... +[2023-10-09 08:12:16,218][60188] Loop rollout_proc10_evt_loop terminating... +[2023-10-09 08:12:16,218][60179] Loop rollout_proc2_evt_loop terminating... +[2023-10-09 08:12:16,218][60176] Stopping RolloutWorker_w0... +[2023-10-09 08:12:16,218][59934] Loop batcher_evt_loop terminating... +[2023-10-09 08:12:16,219][60176] Loop rollout_proc0_evt_loop terminating... +[2023-10-09 08:12:16,219][60185] Stopping RolloutWorker_w7... +[2023-10-09 08:12:16,219][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth... +[2023-10-09 08:12:16,219][59242] Component RolloutWorker_w4 stopped! +[2023-10-09 08:12:16,219][60190] Stopping RolloutWorker_w12... +[2023-10-09 08:12:16,219][60185] Loop rollout_proc7_evt_loop terminating... +[2023-10-09 08:12:16,220][60190] Loop rollout_proc12_evt_loop terminating... +[2023-10-09 08:12:16,219][60180] Stopping RolloutWorker_w3... +[2023-10-09 08:12:16,219][59242] Component RolloutWorker_w10 stopped! +[2023-10-09 08:12:16,220][60180] Loop rollout_proc3_evt_loop terminating... +[2023-10-09 08:12:16,220][59242] Component RolloutWorker_w2 stopped! +[2023-10-09 08:12:16,220][60186] Stopping RolloutWorker_w8... +[2023-10-09 08:12:16,221][60186] Loop rollout_proc8_evt_loop terminating... 
+[2023-10-09 08:12:16,221][59242] Component Batcher_0 stopped!
+[2023-10-09 08:12:16,221][59242] Component RolloutWorker_w0 stopped!
+[2023-10-09 08:12:16,222][60189] Stopping RolloutWorker_w11...
+[2023-10-09 08:12:16,222][59242] Component RolloutWorker_w7 stopped!
+[2023-10-09 08:12:16,222][60189] Loop rollout_proc11_evt_loop terminating...
+[2023-10-09 08:12:16,223][59242] Component RolloutWorker_w12 stopped!
+[2023-10-09 08:12:16,223][60191] Stopping RolloutWorker_w13...
+[2023-10-09 08:12:16,223][60181] Stopping RolloutWorker_w5...
+[2023-10-09 08:12:16,223][60178] Stopping RolloutWorker_w1...
+[2023-10-09 08:12:16,223][60187] Stopping RolloutWorker_w9...
+[2023-10-09 08:12:16,223][60181] Loop rollout_proc5_evt_loop terminating...
+[2023-10-09 08:12:16,223][60191] Loop rollout_proc13_evt_loop terminating...
+[2023-10-09 08:12:16,223][59242] Component RolloutWorker_w3 stopped!
+[2023-10-09 08:12:16,223][60187] Loop rollout_proc9_evt_loop terminating...
+[2023-10-09 08:12:16,223][60178] Loop rollout_proc1_evt_loop terminating...
+[2023-10-09 08:12:16,224][59242] Component RolloutWorker_w8 stopped!
+[2023-10-09 08:12:16,224][59242] Component RolloutWorker_w11 stopped!
+[2023-10-09 08:12:16,224][59242] Component RolloutWorker_w13 stopped!
+[2023-10-09 08:12:16,224][59242] Component RolloutWorker_w5 stopped!
+[2023-10-09 08:12:16,225][59242] Component RolloutWorker_w9 stopped!
+[2023-10-09 08:12:16,225][60182] Stopping RolloutWorker_w6...
+[2023-10-09 08:12:16,225][59242] Component RolloutWorker_w1 stopped!
+[2023-10-09 08:12:16,225][60182] Loop rollout_proc6_evt_loop terminating...
+[2023-10-09 08:12:16,225][59242] Component RolloutWorker_w6 stopped!
+[2023-10-09 08:12:16,226][60919] Stopping RolloutWorker_w15...
+[2023-10-09 08:12:16,226][59242] Component RolloutWorker_w15 stopped!
+[2023-10-09 08:12:16,226][60919] Loop rollout_proc15_evt_loop terminating...
+[2023-10-09 08:12:16,230][59242] Component Batcher_1 stopped!
+[2023-10-09 08:12:16,239][60886] Stopping RolloutWorker_w14...
+[2023-10-09 08:12:16,239][59242] Component RolloutWorker_w14 stopped!
+[2023-10-09 08:12:16,239][60886] Loop rollout_proc14_evt_loop terminating...
+[2023-10-09 08:12:16,243][60143] Weights refcount: 2 0
+[2023-10-09 08:12:16,245][60143] Stopping InferenceWorker_p0-w0...
+[2023-10-09 08:12:16,246][60143] Loop inference_proc0-0_evt_loop terminating...
+[2023-10-09 08:12:16,246][59242] Component InferenceWorker_p0-w0 stopped!
+[2023-10-09 08:12:16,239][60003] Stopping Batcher_1...
+[2023-10-09 08:12:16,250][60144] Weights refcount: 2 0
+[2023-10-09 08:12:16,250][60003] Loop batcher_evt_loop terminating...
+[2023-10-09 08:12:16,251][60003] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000097664_100007936.pth
+[2023-10-09 08:12:16,252][60144] Stopping InferenceWorker_p1-w0...
+[2023-10-09 08:12:16,252][60144] Loop inference_proc1-0_evt_loop terminating...
+[2023-10-09 08:12:16,252][59242] Component InferenceWorker_p1-w0 stopped!
+[2023-10-09 08:12:16,255][60003] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p1/checkpoint_000098784_101154816.pth...
+[2023-10-09 08:12:16,257][59934] Removing ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000096544_98861056.pth
+[2023-10-09 08:12:16,262][59934] Saving ./train_atari/atari_beamrider_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth...
+[2023-10-09 08:12:16,294][60003] Stopping LearnerWorker_p1...
+[2023-10-09 08:12:16,295][60003] Loop learner_proc1_evt_loop terminating...
+[2023-10-09 08:12:16,295][59242] Component LearnerWorker_p1 stopped!
+[2023-10-09 08:12:16,300][59934] Stopping LearnerWorker_p0...
+[2023-10-09 08:12:16,301][59934] Loop learner_proc0_evt_loop terminating...
+[2023-10-09 08:12:16,300][59242] Component LearnerWorker_p0 stopped!
+[2023-10-09 08:12:16,301][59242] Waiting for process learner_proc0 to stop...
+[2023-10-09 08:12:17,110][59242] Waiting for process learner_proc1 to stop...
+[2023-10-09 08:12:17,111][59242] Waiting for process inference_proc0-0 to join...
+[2023-10-09 08:12:17,197][59242] Waiting for process inference_proc1-0 to join...
+[2023-10-09 08:12:17,198][59242] Waiting for process rollout_proc0 to join...
+[2023-10-09 08:12:17,199][59242] Waiting for process rollout_proc1 to join...
+[2023-10-09 08:12:17,200][59242] Waiting for process rollout_proc2 to join...
+[2023-10-09 08:12:17,200][59242] Waiting for process rollout_proc3 to join...
+[2023-10-09 08:12:17,201][59242] Waiting for process rollout_proc4 to join...
+[2023-10-09 08:12:17,202][59242] Waiting for process rollout_proc5 to join...
+[2023-10-09 08:12:17,203][59242] Waiting for process rollout_proc6 to join...
+[2023-10-09 08:12:17,204][59242] Waiting for process rollout_proc7 to join...
+[2023-10-09 08:12:17,205][59242] Waiting for process rollout_proc8 to join...
+[2023-10-09 08:12:17,205][59242] Waiting for process rollout_proc9 to join...
+[2023-10-09 08:12:17,206][59242] Waiting for process rollout_proc10 to join...
+[2023-10-09 08:12:17,206][59242] Waiting for process rollout_proc11 to join...
+[2023-10-09 08:12:17,207][59242] Waiting for process rollout_proc12 to join...
+[2023-10-09 08:12:17,207][59242] Waiting for process rollout_proc13 to join...
+[2023-10-09 08:12:17,208][59242] Waiting for process rollout_proc14 to join...
+[2023-10-09 08:12:17,208][59242] Waiting for process rollout_proc15 to join...
+[2023-10-09 08:12:17,209][59242] Batcher 0 profile tree view:
+batching: 170.2887, releasing_batches: 0.0897
+[2023-10-09 08:12:17,209][59242] Batcher 1 profile tree view:
+batching: 172.1148, releasing_batches: 0.0900
+[2023-10-09 08:12:17,209][59242] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0001
+  wait_policy_total: 2447.0136
+update_model: 201.9260
+  weight_update: 0.0008
+one_step: 0.0023
+  handle_policy_step: 11339.4089
+    deserialize: 64.1551, stack: 193.8669, obs_to_device_normalize: 2541.2573, forward: 5126.9782, prepare_outputs: 2457.4821, send_messages: 464.8983
+[2023-10-09 08:12:17,210][59242] InferenceWorker_p1-w0 profile tree view:
+wait_policy: 0.0001
+  wait_policy_total: 2411.7379
+update_model: 207.9603
+  weight_update: 0.0009
+one_step: 0.0031
+  handle_policy_step: 11367.4620
+    deserialize: 64.2025, stack: 196.1396, obs_to_device_normalize: 2547.2370, forward: 5126.4876, prepare_outputs: 2469.2741, send_messages: 467.9433
+[2023-10-09 08:12:17,210][59242] Learner 0 profile tree view:
+misc: 0.0181, prepare_batch: 269.4444
+train: 3628.8830
+  epoch_init: 0.1874, minibatch_init: 13.1219, losses_postprocess: 895.9246, kl_divergence: 32.4376, update: 385.8799, after_optimizer: 2117.3148
+  calculate_losses: 167.2517
+    losses_init: 0.4044, forward_head: 56.3460, bptt_initial: 1.4463, bptt: 2.0177, tail: 38.1148, advantages_returns: 11.1214, losses: 44.1469
+[2023-10-09 08:12:17,211][59242] Learner 1 profile tree view:
+misc: 0.0184, prepare_batch: 272.6299
+train: 3645.7488
+  epoch_init: 0.1900, minibatch_init: 13.7339, losses_postprocess: 898.8778, kl_divergence: 32.1064, update: 387.8802, after_optimizer: 2123.8775
+  calculate_losses: 171.9544
+    losses_init: 0.3919, forward_head: 60.1935, bptt_initial: 1.4531, bptt: 2.0327, tail: 38.5393, advantages_returns: 11.1586, losses: 44.3540
+[2023-10-09 08:12:17,211][59242] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 1.2296, enqueue_policy_requests: 411.2843, process_policy_outputs: 191.8656, env_step: 7462.3843, finalize_trajectories: 3.5055, complete_rollouts: 2.9191
+post_env_step: 376.6037
+  process_env_step: 84.0175
+[2023-10-09 08:12:17,211][59242] RolloutWorker_w15 profile tree view:
+wait_for_trajectories: 1.2339, enqueue_policy_requests: 406.4651, process_policy_outputs: 191.1354, env_step: 7452.2862, finalize_trajectories: 3.4676, complete_rollouts: 2.9638
+post_env_step: 370.6833
+  process_env_step: 82.0019
+[2023-10-09 08:12:17,212][59242] Loop Runner_EvtLoop terminating...
+[2023-10-09 08:12:17,212][59242] Runner profile tree view:
+main_loop: 14681.6919
+[2023-10-09 08:12:17,213][59242] Collected {0: 100007936, 1: 101154816}, FPS: 13701.6